97k known words in Russian

I am a native Russian speaker. I was just curious what it would look like if I had about 100k known words in a Slavic language.
I confirm that I ignored all names and foreign words (written in the Latin alphabet). All words marked as known are real words.
Learners should not be afraid of lessons with more than 20% unknown words. Classical literature can contain a lot of words that are not used every day. Just look at the Dostoyevsky text in the picture. A transcription of a radio episode has more unknown words, but the percentage is only 10%.
The conclusion: sometimes even 100k known words is not enough to get lessons with a green percentage :wink:

10 Likes

The situation is similar with my 53k known words in Polish.
I do not create LingQs in Polish, I just leave the unknown words blue.
I must say Polish is one of the easiest foreign languages for me as a native Russian and Belarusian speaker.
So, unknown words in the lessons I have not read yet: 155 (20%), 122 (16%), 284 (24%).
Other lessons in the same course that I have already read: 43 (6%), 61 (8%), 78 (10%).
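
For anyone curious how big these lessons are, here is a rough sketch that backs out the lesson size from the figures above. It assumes the percentage is simply unknown words divided by the words counted in the lesson, which is my guess and not something LingQ documents here; the percentages are also rounded, so the results are only approximate. The function name `implied_size` is made up for this example.

```python
# Figures quoted in the post: (unknown words, rounded percentage shown).
lessons_before_reading = [(155, 20), (122, 16), (284, 24)]
lessons_already_read = [(43, 6), (61, 8), (78, 10)]


def implied_size(unknown: int, percent: int) -> int:
    """Approximate lesson size implied by an unknown-word count and its percentage.

    Assumption: percent = unknown / size * 100, with percent rounded for display.
    """
    return round(unknown * 100 / percent)


for unknown, percent in lessons_before_reading + lessons_already_read:
    size = implied_size(unknown, percent)
    print(f"{unknown} unknown at {percent}% -> about {size} words counted")
```

Under that assumption the lessons come out at very roughly 700 to 1,200 counted words each, so the drop from 20% to under 10% is about fewer unknown words, not longer lessons.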

Thanks, Ress, that’s interesting. I have a few thoughts, which will be nothing new to you but which might interest the beginner/intermediate learners. How many words do they need to know before they “know” the language, if you find that many new words at your level?

First, the grammar of course inflates the number of “words” as counted by LingQ, and you can go a long way without seeing every possible form of a particular word. For example, you may have long ago learned a simple word like кидать. But you can go a very long time without ever seeing кидавшими in print. LingQ will count that as a new, unknown word even though you already know it if you know the base form and are familiar with the grammar. You may well not have to look up this word.

Secondly, when you reach a certain level you will often be able to recognize truly new words from their roots and affixes, aided by the context. A simple example: I recently paused for a second over “достроен”, as in “мой дом достроен”. I recognized -строен as probably a short-form adjective/participle from строить, which fits the context, and I made an educated guess as to how до- colors the meaning. It turns out that I already “knew” a word that I had never seen before. We do this all the time in our native languages. I think it’s probably easier in Russian with its relatively consistent Slavic roots and affixes than it is with a mongrel language like English.

tl;dnr – You already know more words than either you or LingQ think you know. :slight_smile:

4 Likes

I am, and always have been, very sceptical about measuring known words in general, and especially in LingQ. For a native speaker it is of no use, and even for a language learner it isn’t very useful.
For example, Russian has 6 cases, so a noun can have 12 forms (6 in the singular and 6 in the plural), and LingQ will count every one of them as a new word!
However, I agree with Khardy that after learning a foreign language for some time, we’re able to guess most of the ‘unknown’ words, because we have already read some other forms of these words or other related words.
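
To make the inflation concrete, here is a minimal sketch that counts the same text two ways: by distinct written form and by base word. It is not how LingQ actually works internally; the tiny `FORM_TO_LEMMA` table and the function name are hypothetical and only cover the noun книга.

```python
# Hypothetical mini-lexicon mapping inflected forms to a base form (lemma).
# The keys are the singular and plural case forms of "книга" (book).
FORM_TO_LEMMA = {
    "книга": "книга", "книги": "книга", "книге": "книга",
    "книгу": "книга", "книгой": "книга", "книг": "книга",
    "книгам": "книга", "книгами": "книга", "книгах": "книга",
}


def count_forms_and_lemmas(tokens):
    """Return (distinct written forms, distinct base words) for a token list."""
    forms = {token.lower() for token in tokens}
    # Forms missing from the table are treated as their own base word.
    lemmas = {FORM_TO_LEMMA.get(form, form) for form in forms}
    return len(forms), len(lemmas)


tokens = "книга книги книге книгу книгой книг книгам книгами книгах".split()
forms, lemmas = count_forms_and_lemmas(tokens)
print(f"{forms} written forms, {lemmas} base word")
```

Note that the 12 case and number slots give only 9 distinct spellings for книга, because some forms coincide (for example, книги is both genitive singular and nominative plural). A counter that works on written forms would still register 9 "words" for one noun.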

2 Likes

Same with my 62.5k words in Korean. I know enough words to comfortably read through light novels without spending too much time buried in a dictionary. Newspapers are still somewhat difficult and require a lot more digging through the dictionary. I feel like there will be a huge difference if I can hit 80k.

1 Like

I think it’s useful for some things and not for others.

It’s not useful for comparing a single learner’s knowledge of two different languages, for the reason you mentioned. It’s not useful for comparing knowledge of the same language between two different learners, because they may have different LingQing habits (quicker or slower to mark a word ‘known’, etc.). So all such comparisons are apples and oranges.

But for an individual learner, for a given language, I think it’s a great kind of personal benchmark and motivator. I know that if I get my Russian ‘known words’ from 10k, to 20k, then 40k, then 80k… over that time, I will have made some serious progress! I like to watch the number grow. I don’t really care that it’s not a scientific metric and not comparable across users or languages. If I want that, I’ll go sign up for a language exam.

5 Likes