Strange behaviour of LingQ?

While working through several lessons from U Iowa, I’ve noticed that LingQ tends to group characters in a somewhat arbitrary and occasionally senseless manner.

For example, in the text, the groups that LingQ pre-selects for making LingQs include

我请
我给
给他们
问我
们

Notice the third line in particular. The meaning is clear, but why can’t we separate the verb from the pronoun?
Then we find the characters separated in the fourth and fifth lines, when it should really read 问我们 or, in my opinion, even better, 问 我们, so that the verb and the pronoun are kept separate.

This eccentricity of LingQ creates misleading entries in our known-word lists, because 我请 is not a third word; it is really just two words, 我 and 请.

Over time, if this continues, LingQ would report that we are learning a large number of extra words when all we are doing is combining known words into short phrases such as I give, you give, he gives, she gives, it gives, etc.

Is there some linguistic reason to present these groupings that I am not aware of, or is this just an arbitrary choice or eccentricity of LingQ?

Another point:

What is really annoying about LingQ’s automatic grouping is that it is impossible to create a LingQ for the components of a pre-selected phrase, that is, to separate the verb from the pronoun.

In the example I used above, it does not seem possible to create a LingQ for the verb 请 from the grouping 我请. Does anyone know of a way to do this?

answers in old threads.

TroisRoyaumes, that isn’t a very helpful answer.

@jfeka - In short, this is a byproduct of the splitter we use for Chinese. I don’t remember exactly how the algorithm works, but it goes through word lists and determines where a space should be added in text. It’s a bit tricky to make this work 100%, since the same string of 10 characters can often be split different ways, and there isn’t a surefire way to know. The splitter as it is now does seem to work pretty well, correctly splitting about 90-95% of words, though there are some that end up getting split incorrectly.
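To make the ambiguity concrete, here is a rough sketch of one common dictionary-based approach, greedy forward maximum matching. This is not LingQ’s actual splitter (its details are not given in this thread); the function `segment` and both word lists are hypothetical and chosen only to reproduce the kind of groupings reported above.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching (illustrative only, not LingQ's code).

    At each position, take the longest substring found in the lexicon;
    fall back to a single character when nothing longer matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical word lists: one contains short phrases, the other only single words.
with_phrases = {"我", "请", "给", "问", "们", "我们", "他们", "我请", "问我"}
words_only = {"我", "请", "给", "问", "们", "我们", "他们"}

print(segment("我请他们", with_phrases))  # ['我请', '他们']
print(segment("我请他们", words_only))    # ['我', '请', '他们']
print(segment("问我们", with_phrases))    # ['问我', '们']
print(segment("问我们", words_only))      # ['问', '我们']
```

With these made-up lists, a phrase entry such as 我请 or 问我 in the word list is enough to produce the odd groupings, while a list of single words splits the verb from the pronoun. A real splitter would likely use a much larger list and possibly word frequencies, so treat this only as an illustration of why the same string can be split in more than one way.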

For now we’ve focused our attention on other parts of the site, since the splitter does a fairly good job of splitting the text. At some point we do hope to improve it further, perhaps by allowing our learners to adjust the word boundaries manually, though this would be a major project and isn’t on the immediate horizon.

Hopefully this helps clarify things a bit!

That helps me understand the situation.

At least it helps me feel more sure that some of the anomalies I’ve run into are not things that I’ve missed in understanding the structure of the language.

@jfeka - Glad to hear it 🙂

Of course, if you do have questions about specific words or phrases, you’re more than welcome to ask on the forum and someone will hopefully come along to help!