The top 2000 words in Russian

I came across a good resource for learners of Russian just now: this page, which has the top 2000 words used in modern Russian. The words provided are based on The frequency dictionary for Russian.

According to the frequency dictionary, the top 2000 most used words in Russian account for 72% of the word forms used in texts, so if you learn these, you’ll be well on your way to being able to (slowly) work your way through many Russian texts. The site provides lists of the words, along with their usage frequency, their parts of speech, and, of course, the translations. Quizzes are also available for all of the words.

While the frequency dictionary page doesn’t offer any definitions, it does offer lists of Russian words beyond the top 2000. One list has “32,000 words with frequency greater than 1 ipm (one instance per million).” A second list has the 5,000 most frequently used words in Russian. I’d say the latter would be more useful for learners of Russian.

There’s a bit of “interesting” data on the frequency dictionary page which I enjoyed reading:

  • The average word length is 5.28 characters.
  • The average sentence length is 10.38 words.
  • 1000 most frequent lemmas cover 64.0708% of word forms in texts.
  • 2000 most frequent lemmas cover 71.9521% of word forms in texts.
  • 3000 most frequent lemmas cover 76.6824% of word forms in texts.
  • 5000 most frequent lemmas cover 82.0604% of word forms in texts.

I think it’s interesting to note that the first 2000 words get you to 72%, and yet learning another three thousand words will only gain you another 10 percentage points. Diminishing returns, indeed. :)
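
To put rough numbers on those diminishing returns, here is a minimal Python sketch that uses only the four coverage figures quoted above. The function name `marginal_gains` is just an illustrative choice, and the calculation assumes coverage starts from 0% with zero lemmas known.

```python
# Coverage figures quoted from the frequency dictionary page:
# (number of most frequent lemmas, % of word forms covered in texts).
coverage = [(1000, 64.0708), (2000, 71.9521), (3000, 76.6824), (5000, 82.0604)]


def marginal_gains(points):
    """Return (start, end, percentage points gained per 1000 lemmas) per step.

    Assumes coverage is 0% when zero lemmas are known.
    """
    gains = []
    prev_n, prev_pct = 0, 0.0
    for n, pct in points:
        per_thousand = (pct - prev_pct) / (n - prev_n) * 1000
        gains.append((prev_n, n, round(per_thousand, 2)))
        prev_n, prev_pct = n, pct
    return gains


for start, end, gain in marginal_gains(coverage):
    print(f"lemmas {start + 1}-{end}: about {gain} points of coverage per 1000 lemmas")
```

By this rough measure, the first thousand lemmas are worth about 64 points of coverage, the second thousand about 8, and each thousand between 3001 and 5000 fewer than 3.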

Hi Josh, excellent blog mate! Just found it a few nights ago and have been reading through.

I’ve often thought about frequency lists and have tried to gauge how many words are necessary to have a strong foundation in a language. Originally I thought 70% (about 2000 words) might be adequate, but I know people from my Russian classes who know many more and still have great difficulty following natural conversations, movies, radio etc.

For example (and this is just a really rough example!), the following sentence has 20% of the words removed and subsequently loses its meaning:

Olya doesn’t like ????, she prefers ???? because it is easier to ???? and doesn’t make her feel ????.

The original might have gone something like:

Olya doesn’t like beer, she prefers vodka because it is easier to swallow and doesn’t make her feel too full.

or:

Olya doesn’t like vodka, she prefers mineral water because it is easier to drink and doesn’t make her feel dizzy.

Of course these examples are quite artificial, but you get my drift.

So in reading around the net, there seem to be some opinions that about 10,000 words make a good base. But that’s a heck of a lot of words and a big chunk of neural real estate (well, at least in my old brain!). Not to mention the time investment involved - I can do about 2 new words per day, on a consistent basis, so to reach the goal of 10,000 would take me approximately 14 years!!! :)

But on your advice I’ve recently installed Anki (great word learning software), so hopefully I can reduce that down to 10 years or so :)

All the best,

Jon.

Jon: Welcome to the blog, and thanks for commenting! I too thought (I read it somewhere, actually) that knowing 2000 core words was enough to follow basic conversations and whatnot. After learning German for so long now, I don’t agree with the 2000 thing. I’m not sure how many words I know in German, but I’d bet my left arm I know more than 2000 words, and I still struggle through many news articles.

10,000 words is indeed a large number, but I have to admit that after my experiences with German, I’d say 10,000 is closer to the mark than 2,000. And, agreed - the idea of needing to know 10,000 words as a “good base” in a language is daunting. But, really, it’s not that surprising. I’ve read that the average (whatever that means, heh) person knows 10,000-20,000 words. This site says that one fellow figured that the average college graduate might have an active vocabulary of 60,000 words, and a passive one of 75,000. I don’t know if that’s too high or not, but I do know that 2,000 words just isn’t going to cut it.

Glad you’re liking Anki. It’s a welcome change from SuperMemo.