How many words to be fluent?

Boundaries are invariably artificial, and with the common question ‘How many words do I need to know in a language to be fluent?’ it becomes even harder to draw the line. An interesting study on vocabulary size was published in the journal Frontiers in Psychology, based on an analysis by Ghent University of the existing international literature and a large-scale ‘crowdsourcing’ study. The researchers say that the answer to this persistent question ‘usually starts with a deep sigh, followed by the explanation that the number depends on how a word is defined!’
The various answers in the literature for a ‘typical native speaker’ go from ‘less than 10 thousand to over 200 thousand’. Their conclusion is that ‘an average 20-year-old native speaker of American English knows 42,000 lemmas and 4,200 non-transparent multiword expressions, derived from 11,000 word families’. Inevitably this gets rather technical, but essentially a ‘lemma’ is a dictionary entry of the basic form of a word, with of course any number of derivatives, compounds, inflections, phrases and ‘word families’ kaleidoscopically revolving around that core.
The Oxford English Dictionary lists 171,476 words in current use, but a lot of people seem to get by on far fewer in English; with respect, the current President of the United States does not seem to have added many words since his time at the University of Pennsylvania…
No wonder experts give a ‘deep sigh’ before they proffer an answer to the time-honoured question! Most languages seem to have a colourful expression for an impossible task; in English you can be ‘flogging a dead horse’.
But on the other hand … having a target is a useful staging post in most endeavours. The Ghent University study is well argued and gives some solid ground in the linguistic quicksands. And while the LingQ ‘known words’ count inevitably does not map precisely onto ‘lemmas’, it still gives a useful yardstick to aim for. So having reached my admittedly amorphous target in LingQ ‘known words’, roughly equating to that notional ‘average 20-year-old native speaker of American English’, I can now have a quiet celebration, such as Covid-19 circumstances allow, and set myself a new, very notional, unscientific target ahead!

‘How many words do we know? Practical estimates of vocabulary size dependent on word definition, the degree of language input and the participant’s age’,
by Marc Brysbaert, Michaël Stevens, Paweł Mandera and Emmanuel Keuleers.
Front Psychol. 2016; 7: 1116.

It depends on your goal. You need fewer words to be fluent in speaking and many more to be fluent in reading, because we use fewer words in speech than we encounter in reading.
But you can understand 50,000 or even 80,000 words and still not speak very well, because for speaking we need words ‘on the tip of our tongue’ that we can use quickly and correctly.
In fact, for speaking about everyday topics we need no more than 3,000 words, and for more advanced topics 10,000 well-learned, actively known words are quite enough.

Thank you for sharing that info. It is very interesting.
However, let me point out that the question of how many words a native speaker understands and the question of how many words you need to become fluent in your target language are very different and have very different answers.
The factor that they have in common is that they depend on your definition of “word” and, more crucially, on your goals.
I know a lot of people that I would consider fluent because they function perfectly well in another language but who may get lost when reading official documents, legalese, etc. I don’t think that makes them less fluent. When confronted with such texts they have all the time in the world to look up words they don’t understand or ask a native speaker.
On the other hand, it is not very rare to encounter foreigners who in fact know a lot of vocabulary that typical native speakers do not know, because they are experts in some domain (science, art, craftsmanship, …). Those same speakers may get absolutely lost if you talk to them in slang, just as native speakers may not understand slang from another area or from a different age group.
The end result is that you can be perfectly fluent with a very different vocabulary from that of the average native speaker, both in terms of quantity and of lexical composition.

I think 30K known words here at LingQ is a solid number to shoot for, at least for Romance languages.

I am at 31000 in French and still far from (reading) fluency, which I will say I have achieved when I can sit down with Proust without the help of the software.

The thing is, of those 31000 words I ‘know’, a significant portion (10%?, 20%? more?) are proper nouns (names, places), even English or Latin words - or fall under some other category making them not legitimate French words - that found their way into certain French texts I was reading and which I just logged as known, as it was less time consuming than clicking on them and ‘ignoring’ them.

So for me I think I will need at least 50,000 with the way I use LingQ before I can sit down with a paperback of Proust.
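As a rough illustration of the discount described above, here is a minimal Python sketch. The function name `estimated_real_words` is made up for this example, and the 10% and 20% figures are only the guesses mentioned in the post, not measured values.

```python
def estimated_real_words(known_words: int, noise_fraction: float) -> int:
    """Return the known-word count after discounting entries that are not
    real vocabulary (proper nouns, foreign words and so on).

    noise_fraction is a guess between 0 and 1, not a measured value.
    """
    if not 0 <= noise_fraction <= 1:
        raise ValueError("noise_fraction must be between 0 and 1")
    return round(known_words * (1 - noise_fraction))


if __name__ == "__main__":
    # 31,000 is the known-word count quoted in the post; the fractions are guesses.
    for fraction in (0.10, 0.20):
        adjusted = estimated_real_words(31_000, fraction)
        print(f"{fraction:.0%} noise: about {adjusted:,} real words")
```

With those guesses, 31,000 logged words shrink to somewhere between about 24,800 and 27,900, which is one way to see why a higher raw target might be needed.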

In my limited experience (Spanish and French from English) I have found that about 5000 “key” words are enough for fluent reading. This is why one of my first tasks is to get rapidly through a 5000-word frequency list. At about 1000 I can read simple things, and at 2000-3000 most ordinary books become possible with effort (and sometimes a lot of dictionary support).

By 5000, I can read with nothing but a paper dictionary for the occasional word if I must know it.

This, however, comes with at least two caveats:

  1. It applies to languages with a high number of cognates and loan words shared with a language we already know fluently.
  2. These aren’t “LingQ words”, which include a large number of inflected forms, but are closer to head words in a typical dictionary.
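To make the second caveat concrete, here is a small Python sketch of the difference between counting distinct written forms and counting head words. It assumes that every distinct form is counted separately, and the tiny `LEMMA_OF` table is hand-written for illustration only; it is not how LingQ or any real lemmatizer works.

```python
# Hypothetical, hand-written mapping from inflected French forms to head words.
LEMMA_OF = {
    "parle": "parler",
    "parles": "parler",
    "parlons": "parler",
    "parlez": "parler",
    "parlent": "parler",
    "grand": "grand",
    "grande": "grand",
    "grands": "grand",
    "grandes": "grand",
}


def count_forms_and_headwords(tokens):
    """Return (number of distinct forms, number of distinct head words)."""
    # Lowercase so that "Parle" and "parle" count as the same form.
    forms = {token.lower() for token in tokens}
    # Forms missing from the table are treated as their own head word.
    headwords = {LEMMA_OF.get(form, form) for form in forms}
    return len(forms), len(headwords)


if __name__ == "__main__":
    sample = ["parle", "parlons", "parlez", "grand", "grande", "grandes", "Parle"]
    forms, headwords = count_forms_and_headwords(sample)
    print(f"{forms} distinct forms, {headwords} head words")
```

In this toy sample six distinct forms collapse into two head words, which is why a count of several thousand head words and a count of tens of thousands of forms can describe a similar level.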

Listening and speaking are much harder, though speaking can work with fewer words if we have a sufficient vocabulary of both words and “phrases” that serve as structural patterns. (People don’t really speak “in words” much more than they speak “in syllables”.)

Generally we speak using phrases and insert substitution words here and there.

In French, I now have 38,500 LingQ “known words” and my Anki 5000 deck is 97.5% mature, so reading is pretty straightforward at this point.

My remaining problems are figures of speech (not weird idioms) and periphrases (multiple words with the meaning of a single word or idea).

I cannot yet comprehend random, full speed, spoken French reliably. Though some audio books and some videos are mostly ok.
