Master list of grievances from an experienced Mandarin learner

UPDATE: Through a little experimentation and the power of positive thinking, I found workarounds to both Chinese problems listed below. The solutions are at the bottom of this post.

Hi all,

tl;dr: I used to use LingQ with Chinese but fizzled out because I was a beginner. Now that I’m intermediate, I’ve discovered it’s still one of the best learning tools available to me. LingQ IS amazing, I’ve had premium for 2 years, and I plan on keeping it forever, but there are still some really painful issues with it, some general and some specific to Chinese. So, I thought I’d make a master list for the sake of development and improving the product.

First, a little back story. I’ve been learning Chinese for about 3.5 years, have gotten myself solidly to an intermediate level, and have just come around to using LingQ again. I started using it in 2019, and absolutely loved the idea behind it, having been a fan of Steve, Stephen Krashen, and CI. I was still a little too much of a beginner, so my use of it fizzled out, and I opted instead for learning vocabulary with Anki and Pleco.

I recently started an HSK 4 course, and found myself frustrated with the tools I’d been using to learn. Too much memorization, not enough reading/content. Long story short, I’ve been importing my HSK 4 lesson texts into LingQ. It’s been great, with the exception of some really significant issues. So without further ado, here’s the list:

Specific to Chinese:

  1. Character grouping into words is broken. If possible, please just copy Pleco :-).

Chinese works like legos – you put a common block (character) with a more specific block, and you make a word. The problem with lingq is that their text importer combines words in a way that very frequently contains errors. And I mean “happens every 1-2 sentences” frequently. Very, very, very frequently. This pretty fundamentally breaks one of the core functions of lingq: tagging words, tracking how well you know them, and using that to improve your reading. Pleco can do this very well, not perfectly, but very well. For those who know Chinese and get how complements and particles work (到,好,过,着,了,etc.), this is a really really serious issue.

example:

I imported the following (the spaces didn’t exist in the original, but I’ll add spaces to show where the word divisions would normally be)

取得 成功 的 人 往往 都 经历过 许多 失败 (People who attain success often experience many failures)

When I imported the text, it came out like this,

取得成功 的 人 往往 都 经历 过许 多 失败

Now, 过许 is not a word in Chinese (although you can add 过 to all kinds of stuff). It doesn’t show up in any of my dictionaries. Nevertheless, you can click on it, and there are user-added definitions for this non-word. So in this case the bug creates two problems: it could misinform a learner who doesn’t know where the proper division is, and it breaks up the correct word 许多, preventing a learner from correctly tagging their target word. There is no possible way to override the incorrect grouping. When I’m reading a new text, probably 40% of my tagging of blue items is simply ignoring incorrectly grouped words that don’t exist.
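To make the "doesn’t show up in any of my dictionaries" check concrete, here is a minimal Python sketch that flags tokens in an already-segmented line that are missing from a word list. The word list and the function name `flag_non_words` are made up for illustration. This is not how LingQ or Pleco work internally, just one way to spot suspicious groupings.

```python
# Toy word list standing in for a real dictionary (hypothetical, for illustration only).
KNOWN_WORDS = {"取得", "成功", "的", "人", "往往", "都", "经历", "过", "许多", "多", "失败"}


def flag_non_words(segmented_line, known_words):
    """Return the tokens of a space-separated line that are not in the word list."""
    return [token for token in segmented_line.split() if token not in known_words]


# The grouping the importer produced for the example sentence.
imported = "取得成功 的 人 往往 都 经历 过许 多 失败"
print(flag_non_words(imported, KNOWN_WORDS))  # ['取得成功', '过许']
```

Note that 取得成功 gets flagged too, only because the toy list stores 取得 and 成功 separately. A flagged token is a candidate for review, not proof of a bad split.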

I get that dev work is hard, and you have to make tools that somehow work for every single language, but a way to override groupings, or have this be something that’s decided by a human being, rather than automated, would be huge. I honestly want to make a bounty on fixing this problem, because this one is massive for any Chinese learner. If there’s no plan to fix this, but it’s theoretically fixable, devs pm me about a bounty.

  2. Spaces and paragraphs. When I import text that is correctly formatted, I lose two things. I lose paragraphs/line breaks, and I lose spaces after punctuation. I get how you can fix paragraphs with the sentence end markers, but it’s tedious. The space after punctuation thing I still don’t know how to fix, and it makes text look cramped. (This could have to do with how Chinese typing handles punctuation. It adds spaces after punctuation automatically. I’ve tested some and gotten inconsistent results.)

General (not specific to Chinese):

  1. On the desktop browser version, the right half of the screen (picking definitions and such) takes up a totally massive amount of space. There’s still no “full screen” experience available on the desktop version.

That’s all I’ve got for now. Fix this stuff and lingq is an unbeatable tool for learning Chinese. I know devs are pushing the 5.0 update and are probably busy with other stuff, but I still want to document that these are significant issues for learning Chinese. And last, I still love lingq and think it’s a great product.

Cheers.

UPDATE:

  1. To fix character spacing issues, edit the lesson text. When you import Chinese text (which will have zero spaces by default), lingq will group characters into words, and separate each grouped word with a space. If characters are grouped incorrectly, just delete or add spaces.

  2. For paragraphs, this is also a text importing issue. When I imported the text, there were spaces and the punctuation was entered as if with a non-Chinese keyboard. Delete the bad punctuation and re-enter it with a Chinese keyboard.

I fully agree with your general comment on “no full screen” experience. This is one of my grievances with reading on Lingq.

I do not care about the “paragraph” thing.

However, I cannot confirm your comment "The problem with lingq is that their text importer combines words in a way that very frequently contains errors. And I mean “happens every 1-2 sentences” frequently." at all. I guess, yes, it does happen, but in my experience maybe once in a text of 500 words. Are you importing from a variety of sources or just one source?

Really? That’s very surprising to me, since that’s the number 1 complaint I have. I deal mostly with just plain text. I might copy a lesson text over from a textbook, sometimes pasting in subtitles to a tv show or movie. I don’t use the importing tools too often. Do you use spaces in the input text to separate things?

Subjectively, the character spacing issue seems to me to have gotten worse at some point! I did most of my Chinese in 2015 and 2016, and have since only taken peeks into the library, including now.
Back in '16 I had to hit x for ignore on only the occasional wrongly grouped non-word, but it was largely working. Since then, it seems to have gotten worse, almost to the point of randomness.
Not sure if it happened around the glorious 2016 4.0 update?
I agree that this is a big problem now.

I just played around and realized what’s going on. When you import plain Chinese text to lingq, it automatically groups characters into words. In my experience, this is where things get messed up very frequently. The good news, however, is that you CAN go into the lesson text, and edit the groupings by adding or deleting spaces. I just checked, and it works. I’m going to update the original post so others see. Assuming you have the power to edit the lesson text, it’s a workable solution all the way around.

I actually just found the solution to the character spacing issue. See my reply to JanFinster (I’m going to add an update in the original post too explaining how to do it).

Unfortunately that’s not a solution at all. It means that I have to go through and read the whole text before I read it, and know Chinese well enough on top. That defeats the whole point.

I haven’t gotten into much importing in Chinese. The available lessons have been plentiful enough, but they are just infected with a pandemic of bad word spacing.

Not only would manually editing the spacing be an inordinate amount of work up front, which I frankly won’t do, but as you wrote this won’t help any beginners who don’t have a clue as to which words are correctly spaced, which aren’t.

LingQ should fix this, and it can’t be the biggest challenge, since other products already have the solution integrated. As I wrote it seems to have worked far better before.

Let’s say you have a lesson that would be 10 min of reading and lingqing work. Do you expect people to spend therefore at least 5 min more or less speed reading that lesson without dictionary aid, manually editing it, only to then actually read and lingq it? Now imagine that with a 30x10 min book.

Then lingq has made itself redundant, because it doesn’t have the automatic functionality of highlighting real words, and I might as well just read that thing off lingq with the Zhongwen pop-up dictionary browser extension. Lingq becomes only a word counter and reviewing reader, making you work extra for things it should properly have done automatically.

I think #2 is not specific to Chinese.

The import tool seems to treat line breaks differently depending on the format you’re importing from. If you import formatted text (Word document, epub book, etc.), it will recognize paragraphs. However, if you import plain text, it will treat adjacent lines as belonging to a single paragraph, while an empty line will be treated as the end of the paragraph.

Everything within a paragraph is then broken into sentences. The text between the end of the previous sentence and either a period, a question mark or an exclamation mark will be put into a single sentence (so if you have something like a dialog where lines do not end with punctuation marks, it will be merged into a single sentence).

As far as the spaces after words and punctuation are concerned, it seems annoyingly consistent to me. If there is more than one space between two words, two punctuation marks, or a punctuation mark and a word, just a single space will be displayed in the lesson. If there is no space between a punctuation mark and the next word, one will be added (thus butchering things like website addresses). I don’t think there is anything we can do about it.
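For anyone who wants to preview what a plain-text import will look like, the rules described above can be roughly imitated in a few lines of Python. This is only a sketch of the behaviour I observed, not LingQ’s actual code. The function names are made up, and it only handles ASCII punctuation.

```python
import re


def split_paragraphs(text):
    """Adjacent lines form one paragraph; an empty line ends it."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [" ".join(line.strip() for line in block.splitlines()) for block in blocks]


def normalize_spacing(paragraph):
    """Insert a space after punctuation that lacks one, then collapse runs of spaces."""
    paragraph = re.sub(r"([.?!,;:])(?=\S)", r"\1 ", paragraph)
    return re.sub(r" {2,}", " ", paragraph)


def split_sentences(paragraph):
    """A sentence runs up to the next period, question mark, or exclamation mark."""
    return [s.strip() for s in re.findall(r"[^.?!]+[.?!]?", paragraph) if s.strip()]


sample = "Hello  there.How are you?\nFine,thanks\n\nA: hi\nB: hello\nC: bye."
for paragraph in split_paragraphs(sample):
    print(split_sentences(normalize_spacing(paragraph)))
# ['Hello there.', 'How are you?', 'Fine, thanks']
# ['A: hi B: hello C: bye.']

print(normalize_spacing("Visit www.example.com now"))  # Visit www. example. com now
```

The second paragraph shows the dialog problem (three lines merged into one sentence), and the last line shows what happens to website addresses.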

Did not see your update about paragraphs when I was writing this

I agree with your list, those are probably my two biggest issues as well. In fact, I just posted about it the other day asking for a fix. That said, I also agree, overall Lingq is my daily go-to and has very successfully got me to intermediate in Chinese, so I am a big fan all said and done.

I wanted to share my workaround for the word spacing issue, which is a bit more efficient than just adding the spaces manually. What I do is:

  1. Import from source (netflix, book, youtube, etc) using the Chrome plugin. I do this just to create the lesson “shell”
  2. Go to the source and copy the text directly. This will be un-segmented.
  3. Use MDBG Chinese Dictionary to parse the words into proper segmentation (uses CC-CEDICT, which I think is the same as Pleco).
  4. On the MDBG site, make sure to click on “Look up All Chinese Words in a Text?” to be able to enter your full text (can handle quite a large number of lines/characters)
  5. Copy the segmented output from here
  6. Overwrite the text in the “shell” from step 1 with this

And boom, perfectly segmented text, somewhat in bulk. It’s still annoying that you have to do the steps, but once you get used to it, it only takes a minute or two.
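If you’d rather do step 3 offline, dictionary-based segmentation can be sketched in Python. The sample entries below are typed by hand in the CC-CEDICT line layout (traditional, simplified, pinyin in brackets, glosses between slashes); a real run would read the full downloaded file instead. The greedy longest-match strategy is one common approach and an assumption on my part, not necessarily what MDBG does.

```python
import re

# Hand-typed sample lines in the CC-CEDICT layout (illustrative, not copied from the file).
SAMPLE_ENTRIES = """\
# comment lines start with a hash
取得 取得 [qu3 de2] /to obtain/
成功 成功 [cheng2 gong1] /success/
的 的 [de5] /possessive particle/
人 人 [ren2] /person/
往往 往往 [wang3 wang3] /often/
都 都 [dou1] /all/
經歷 经历 [jing1 li4] /to experience/
過 过 [guo4] /experiential particle/
許多 许多 [xu3 duo1] /many/
失敗 失败 [shi1 bai4] /failure/
"""

ENTRY = re.compile(r"^(\S+) (\S+) \[([^\]]*)\] /(.*)/$")


def load_simplified_words(cedict_text):
    """Collect the simplified headword (second field) of every entry line."""
    words = set()
    for line in cedict_text.splitlines():
        if line.startswith("#"):
            continue
        match = ENTRY.match(line.strip())
        if match:
            words.add(match.group(2))
    return words


def segment(text, words):
    """Greedy forward maximum matching: at each position take the longest dictionary word."""
    longest = max(map(len, words), default=1)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Unknown single characters fall through as one-character tokens.
            if size == 1 or piece in words:
                tokens.append(piece)
                i += size
                break
    return tokens


words = load_simplified_words(SAMPLE_ENTRIES)
print(" ".join(segment("取得成功的人往往都经历过许多失败", words)))
# 取得 成功 的 人 往往 都 经历 过 许多 失败
```

The space-joined output is what you would paste over the lesson “shell” in step 6.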

Your suggestion to manually edit the text is well known, but this is waaaayyy too much hassle. If I had to do this every other line, I would quit Lingq.

How about you import this free lesson from TheChairMansBao (https://www.thechairmansbao.com/intelligent-bins-for-sorting-rubbish-appear-in-beijing/ ) and tell us how many words are incorrectly assigned. I tested it and I have zero!

Another workaround is to simply mark more than just 2 characters. Most Lingqs that I create are parts of sentences since it is recommended to study words in context anyway.
E.g. I would create a lingq for “四个不同颜色的垃圾箱” instead of just “垃圾箱” .

I sent an email to LingQ support about this issue a couple months back. I’ve had the same struggle with Chinese. Also, I was worried it would be replicated in Vietnamese as Vietnamese assembles words the same way Chinese does. If the problem exists for Chinese, then it will exist for Vietnamese as well.

This was the reply I got:

"Our new reader, in LingQ 5.0 version we are working on should bring huge improvement regarding spacing in Asian languages, so hopefully issues you noticed will be gone in new update.
Also, lesson editing for Premium users will be much easier to do in this new update and you will be able to make changes directly in the reader, not on the Edit Lesson screen.
Regards,
Zoran "

As a relatively new LingQ user (2 months) and a Mandarin learner for about a year, at almost 1,500 words now on LingQ, the faulty word spacing is my number one issue. I’ve started to really like LingQ a lot, so I really hope they fix this!
What baffles me most is that it happens quite a bit even in the Mini Stories. This is sort of beside the point, but I would’ve expected those to be manually edited to a somewhat higher level than the average self-imported text.
Another (probably unrelated, sorry) grievance for me is that relatively often, for characters that have multiple pronunciations (e.g. 倒, which can be either dǎo or dào), the Pinyin transcription will be the wrong option for that context, while the automated voice reader seems to get it right pretty much all the time! So it seems like the correct info has to be there, somewhere, but the Pinyin transcription isn’t accessing it?

Late replying, but thanks for your comments. I haven’t imported anything new since this post, but I’ll keep a look out for rate of incorrect segmenting. Maybe it’s source dependent and I just got unlucky. Now that I at least have a way to fix errors, I’m not as concerned anymore. I tend to thoroughly go through a text anyway, so I don’t mind being meticulous. Hopefully 5.0 makes some improvements.

Also, I definitely will start looking at lingqs along the line of phrases. Very good advice! The alchemy of Chinese characters and how they combine is definitely a long learning process beyond just learning individual characters.

That’s really terrific news! Thanks for sharing.

It makes sense that they’d be able to solve it to some degree. Especially for learners who are still learning and translating one word at a time, having really accurate segmentation by default is pretty important. The potential for confusion can be big. Thanks again for sharing.

The longer I’ve studied, the more I’ve come to appreciate that this is a particularly hard problem with Chinese. You can have 看来 (seems like) 来到 (came/arrived) 看到 (see) 看来 到… (seems like to…) and you can see how things get really confusing.
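To see how quickly the ambiguity shows up, here is a small Python sketch that lists every way a string can be split into dictionary words. The six-entry word list is a toy assumption, and `all_segmentations` is just a name I picked.

```python
def all_segmentations(text, words):
    """Return every way to split text into pieces that are all in words."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in words:
            # Keep this first piece and recurse on whatever is left.
            for rest in all_segmentations(text[end:], words):
                results.append([head] + rest)
    return results


WORDS = {"看", "来", "到", "看来", "来到", "看到"}
for option in all_segmentations("看来到", WORDS):
    print(" ".join(option))
# 看 来 到
# 看 来到
# 看来 到
```

Three characters already give three valid splits, and only the surrounding sentence tells you which one is meant.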

The pinyin thing is another annoyance, but it’s also just part of learning this particular language. It certainly seems that lingq picks one pinyin for a character and it never changes. I think there’s an issue with 着 if I’m remembering correctly. Just today, I read 随随便便 (sui2 sui2 bian4 bian4, which means easy-going or effortless) and lingq wrote it as sui2 pian4, because this is the same character as in 便宜 (pian2 yi, cheap). Idk, I feel like it’s going to be a struggle no matter what learning method you use. There are others I guess, like 行 (xing2) and 银行 (yin2 hang2). I’m sure they could improve it but it’s maybe a difficult issue to work on.
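One way this could be handled, sketched in Python: look up the whole word first and only fall back to a per-character default when the word is unknown. Both tables below are tiny hand-made assumptions (tone 5 marks the neutral tone, as in CC-CEDICT-style numbering), and I have no idea whether LingQ does anything like this.

```python
# Per-character default readings (toy table, assumed for illustration).
CHAR_DEFAULT = {"随": "sui2", "便": "bian4", "宜": "yi2", "银": "yin2", "行": "xing2"}

# Word-level readings that override the per-character defaults (toy table).
WORD_READINGS = {"便宜": "pian2 yi5", "银行": "yin2 hang2"}


def to_pinyin(tokens):
    """Prefer a word-level reading; otherwise join per-character defaults."""
    readings = []
    for token in tokens:
        if token in WORD_READINGS:
            readings.append(WORD_READINGS[token])
        else:
            readings.append(" ".join(CHAR_DEFAULT.get(ch, "?") for ch in token))
    return readings


print(to_pinyin(["随随便便", "便宜", "银行", "行"]))
# ['sui2 sui2 bian4 bian4', 'pian2 yi5', 'yin2 hang2', 'xing2']
```

Of course this only works if the text was segmented into the right words in the first place, so the pinyin problem and the spacing problem are tied together.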

The big mistakes people seem to be making in this thread appear to be:

  1. people use lingq for their chinese reading
  2. people expect that lingq chinese reader will ever improve.

Breaking news: we’ve been waiting ten years for a workable chinese reader - and it’s never going to get good (just calling it the way I see it).

When I started I used: Chinese Text Annotation - MandarinSpot, along with Chrome’s Zhongwen popup dictionary - and a finer method for starting out to read Chinese you will not find anywhere on the internet.

Also see : Language Log » How to learn to read Chinese

Nowadays, I just use Pleco reader, and its search history function for review of words.

It may separate words better than Lingq (?), but it does not save your known words and Lingqs, etc.

People here are really over-complicating the problem. Just create Lingqs that are more than 2 characters (i.e. 3-4 words) as they are much more useful to review anyway.

As for the spacing between words, you could argue that such a crutch is as harmful as having the pinyin over the characters. Because, if you start reading real Chinese, you may not have either of those two. I have seen native texts that were just long strings of characters, and you just need to know which characters belong together and form words.

As for Pleco reader, as far as I know it does not work on desktop, so it is not helpful to people, who study with a PC.

Almost stopped reading at “harmful as pinyin over the characters” (absolutely terrible advice by the way) - but if you want pleco on pc just use bluestacks.

Thanks regarding bluestacks :)
Regarding pinyin over characters: everyone may have a different opinion. This is certainly the case when it comes to language learning. I am definitely not alone in my opinion, and am in the good company of people who are way more advanced than I am and most likely ever will be. In fact, I was in your camp, but then converted ;) I started with pinyin over characters and could read lots of text quite easily. The cold shower came when I tried to read the text without pinyin: I suddenly did not recognise the very same characters anymore. It is a bit like believing you can understand an audio clip because you can follow when you see the transcript in front of you. Take the transcript away and you are lost…
Lingq makes it so easy to look up the pinyin that there is really no need for pinyin over text. Anyway, to each his own…