Chinese Language Learning – 2021: A Year of Intensive Vocabulary Study in Review


*Previous posts in this series:*

1. [I’ve been reading “The Witches” in Chinese since the beginning of the year. I am almost halfway through!](
2. [Statistics and Future Vocabulary Acquisition](
3. [I have officially finished reading my first book in Chinese!](
4. [Reading List as a Curriculum](
5. [I finished my second book in Chinese!](
6. [Today, my vocabulary hit 10,000 words](
7. [I finished reading the fourth book in my reading list. Here is my progress:](


*Glossary of terms:*

1. Active vocabulary – *Words and phrases which one can recall and use correctly on one’s own*
2. Passive vocabulary – *Words and phrases which one can recognize and understand*
3. Extensive reading – *Reading without looking things up, or only rarely doing so*
4. Intensive reading – *Reading with frequent assistance from a dictionary*
5. Collocation – *Groups of words that are statistically likely to occur together, often in a specific order*


I figured posting this on the last day of the year would be appropriate. By the way – happy holidays, y’all! Out with one year and in with another.

At the beginning of this year I decided to dedicate myself very seriously to the task of amassing a proper vocabulary in Chinese. This is something I had long neglected, and it showed – after about six years of learning, I only had an estimated vocabulary of ~5000 words (according to Chinese Text Analyser). I was tired of not really being able to understand anything. I was tired of not being able to comfortably express myself. I was tired of *almost* being able to speak Chinese, but not quite.

Well, I’m happy to say that this year has been a smashing success for me. My vocabulary now stands at an estimated ~15000 words, triple the size of what it was on January 1st of this year. Here’s how I accomplished this:

1. I used Chinese Text Analyser to compare a variety of novels to find a text that was most suitable for my level. I found (disappointingly, but not surprisingly) that even the most basic children’s novels were far beyond my reach for extensive reading, and even intensive reading was a stretch. I decided to opt for an even more intensive approach than you normally hear about.
2. I settled on a translation of Roald Dahl’s *The Witches* (女巫), because of all the novels I ran through Chinese Text Analyser, this one had the fewest unknown words (2000+).
3. I used Chinese Text Analyser to identify each unknown word in the first chapter, and made flashcards for those words in Anki. I set myself a pace of 10 new words per day, and read the first chapter once I had learned all of the words. I worked through the book one chapter at a time, and thanks to my study method was able to read each chapter quite smoothly. After my first 2-3 books, I upped my vocabulary intake to 30 words per day, and I have maintained that pace since then.
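
For anyone who wants to try the same routine, here is a rough Python sketch of the bookkeeping behind the steps above. It is not how Chinese Text Analyser works internally. It assumes the chapter has already been split into words, and the function names and the tiny sample chapter are made up for illustration.

```python
import math

def unknown_words(chapter_words, known_words):
    """Return the chapter's unknown words in order of first appearance, without duplicates."""
    seen = set()
    result = []
    for word in chapter_words:
        if word not in known_words and word not in seen:
            seen.add(word)
            result.append(word)
    return result

def days_to_learn(word_count, words_per_day):
    """Number of study days needed at a fixed daily pace, rounding up."""
    return math.ceil(word_count / words_per_day)

# Hypothetical, already-segmented chapter and known-word set (illustration only).
chapter = ["女巫", "喜欢", "孩子", "女巫", "讨厌", "孩子"]
known = {"喜欢", "孩子"}

new_cards = unknown_words(chapter, known)
print(new_cards)               # ['女巫', '讨厌']
print(days_to_learn(200, 10))  # 20
```

By this arithmetic, a chapter with 200 unknown words takes 20 days of flashcards at 10 words per day before it is ready to read, and about 7 days at 30 words per day.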

I structured my Anki cards as follows:

1. **Front Side:** *Chinese term*
2. **Reverse Side (one of the following, in order of preference):**
    1. *nearest Chinese synonym(s)*
    2. *nearest English synonym(s)*
    3. *short explanation in Chinese*
    4. *short explanation in English*
    5. *picture*
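
The "in order of preference" rule for the reverse side can be sketched as a simple fallback lookup. This is only an illustration of the rule: the field names below are hypothetical labels of my own, not Anki's field names, and Anki itself does not need any code for this.

```python
# Preference order for the reverse side, matching the list above.
# These field names are hypothetical labels, not part of Anki.
BACK_PREFERENCE = [
    "chinese_synonym",
    "english_synonym",
    "chinese_explanation",
    "english_explanation",
    "picture",
]

def choose_back(available):
    """Return (field, value) for the most preferred non-empty field, or None."""
    for field in BACK_PREFERENCE:
        value = available.get(field)
        if value:
            return field, value
    return None

# Example: no good Chinese synonym exists, so the English synonym wins over the picture.
card_options = {"english_synonym": "witch", "picture": "witch.png"}
print(choose_back(card_options))  # ('english_synonym', 'witch')
```

The point of the ordering is to keep as much of the card in Chinese as possible and fall back to English or a picture only when nothing better is available.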

Now, my Anki routine *has* had some unfortunate downsides. I have successfully amassed a truly massive passive vocabulary in a relatively short amount of time, but I have had to pay a price for that. Namely, that my active vocabulary remains fairly weak. I designed my Anki regimen to give me the ability to recognise and understand these new words in context, but for most of them, my ability to recall them and use them correctly on my own is limited or nonexistent. I often do not entirely understand how to use a given word in a sentence (do I use 把? Does it usually appear in the negative? With what words does it usually collocate? etc.), and I usually don’t understand what makes a word different from its synonyms. However, I was happy to accept this trade-off, and I am still happy with that decision. I believe that, over time, as I consume more and more media in Chinese, my understanding of these words will deepen naturally. I also believe that more and more words will cross over into my active vocabulary from my passive vocabulary without overt effort on my part, although of course my active vocabulary will never reach a truly good size without significant speaking practice.

**So, what books did I read this year? I read:**

1. *The Witches*, by Roald Dahl(女巫)
2. *Charlie and the Chocolate Factory*, by Roald Dahl(查理和巧克力工厂)
3. *The Giver*, by Lois Lowry(记忆传授人)
4. *The Lion, the Witch, and the Wardrobe*, by C.S. Lewis(狮子·女巫和魔衣柜)
5. *Prince Caspian*, by C.S. Lewis(凯斯宾王子)
6. *The Voyage of the Dawn Treader*, by C.S. Lewis(黎明踏浪号)
7. *The Magician’s Nephew*, by C.S. Lewis(魔法师的外甥)

I am currently working my way through *The Hunger Games*, by Suzanne Collins.

**What have I actually accomplished this year? What do my skills look like now?**

As I said above, I have added ~10000 words to my passive vocabulary (for a total of ~15000), an unknown number of which have also become part of my active vocabulary.

I watched my first movies with near-complete comprehension. Those movies were 《大鱼海棠》 and 《哪吒闹海》. Neither movie is heavy on dialogue, but it was still a cool accomplishment!

I went from barely being able to browse the web in Chinese at all to regularly browsing articles in Chinese, watching native content on YouKu (优酷), and even reading many articles on Wikipedia. I have read articles about solar flares, the Holocaust, the eight major cuisines of China (八大菜系), Hokkaido, and tetanus. My comprehension varies, but is often (to me, at least) surprisingly good. I no longer feel like I’m drowning when I engage with native media.

It feels like I can understand most song lyrics now.

I am starting to feel more and more comfortable watching e.g. vlogs, casual interviews, and other media featuring native speakers speaking casually in a relaxed environment. Whether or not I can truly understand the content, though, depends heavily on the topic being discussed.

At the beginning of the year, even a single chapter of *The Witches* (a book aimed at ~5-7 year olds) would contain 150-200 unknown words. Now, I can breeze through books of an equivalent level with virtually no difficulty at all. I read through a few chapters of *James and the Giant Peach* at the laundromat the other day and was pleased to find that there were very, very few unknown words, and never enough to impact comprehension of the story.

I also feel much, MUCH more comfortable expressing myself now compared with at the beginning of the year – even though my active vocabulary is not as large as it would have been with regular practice. Also, reading those seven novels over the course of this year has made me a lot more familiar and a lot more comfortable with how to phrase my thoughts in Chinese. When I’m texting my friend in Chinese, I find that I can express much more complicated thoughts now than was the case twelve months ago.

**What can I NOT do with ~15000 words? And what do I still struggle with?**

As far as I can tell, documentaries are still off the table for me. They are fine in the beginning, when they speak in generalities and are introducing the topic at hand, but as soon as they dive into the thick of things, I’m lost.

I can’t read anything more advanced than *Roald Dahl* extensively, at least not yet. The % comprehension milestones highlighted by Imron [here]( and [here]( remain stubbornly out of reach, at least for now.

Even for intensive reading, a *lot* of books remain out of reach for me at my current level. That’s going to be the case for fewer and fewer books over the course of the next 6-12 months, but if you are at a lower level than I am, and you are looking forward to 15000 words, you should know that it won’t be enough for you to comfortably tackle books like *Dune*, *Ender’s Shadow*, or *The Joy Luck Club* even if you study them intensively like I do. Books like *Ender’s Game, Heroes Don’t Cry,* or *The Three Body Problem* will likely be easier, though still iffy.

I can’t smoothly and easily read academic articles (like Wikipedia, informative online articles, etc). The fact that I can pick my way through them doesn’t necessarily mean that it’s a pleasant experience!

I still come across unknown hanzi pretty frequently – although the frequency is much, much lower than it used to be.

I also cannot watch the news.

One area where I unfortunately did not make the progress I wanted to this year was in my productive skills. I never got to a point financially where I felt comfortable paying for regular tutoring sessions – which is what I really want – so I never got the practice speaking that would have allowed me to dramatically expand my active vocabulary. I know there are other avenues that I could have explored, but personally, I really do want to start out with a private tutor and not some kind of language exchange. What this means in practice is that, when texting in Chinese, I relatively frequently have to consult Pleco to remind myself of a word that I’ve forgotten. The key thing here is that I immediately recognize the word when I see it, so it truly is a *reminder*, as opposed to looking up a new word. Nevertheless, this *is* something I am hoping to start tackling sometime next year.

**Some choice statistics!**

I started using Chinese Text Analyser to collect and analyze data pretty exhaustively in March of this year. You can see my spreadsheets and graphs [here]( I’m pretty proud of the progress shown in some of those graphs! Here are some choice numbers from this year:

1. At the beginning of the year, the most advanced book I was keeping track of was *Ender’s Shadow*, by Orson Scott Card, clocking in at 6948 unknown words. That number has now fallen to 3667.
2. The least advanced book that I am still keeping track of is *James and the Giant Peach*, by Roald Dahl. It started the year with 1429 unknown words. That number now rests at 354.
3. The first book I read (*The Witches*, by Roald Dahl) was 31,926 words long. The book I am currently reading (*The Hunger Games*, by Suzanne Collins) is 97,228 words long.
4. In terms of unknown words per page, the worst book that I was keeping track of at the beginning of the year was *Animal Farm*, by George Orwell. At the beginning of the year, it averaged 25 unknown words per page. Currently, it averages 11.5 unknown words per page.
5. Throughout this year, I have generally set a limit of 2000 unknown words, beyond which I will not pick up a book. When I first started keeping track of statistics, there were 3 books in my reading list that fell within that range. Now, there are 23 — and that’s not counting the books I already finished reading.
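
To put some of those numbers side by side, here is a small Python sketch that turns the before-and-after figures from the list above into percentage drops. The helper name `percent_drop` is my own. The input numbers are copied straight from the list.

```python
def percent_drop(before, after):
    """Percentage decrease from `before` to `after`, rounded to one decimal place."""
    return round((before - after) / before * 100, 1)

# (before, after) pairs copied from the list above.
progress = {
    "Ender's Shadow, unknown words": (6948, 3667),
    "James and the Giant Peach, unknown words": (1429, 354),
    "Animal Farm, unknown words per page": (25, 11.5),
}

for label, (before, after) in progress.items():
    print(f"{label}: down {percent_drop(before, after)}%")
```

That works out to roughly 47.2% fewer unknown words in *Ender’s Shadow*, 75.2% fewer in *James and the Giant Peach*, and 54.0% fewer unknown words per page in *Animal Farm*.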

**What are things looking like for 2022?**

I’ve got a *lot* to look forward to in 2022. 2021 was all about children’s novels, because that’s all I could feasibly read, even with intensive study. I was also very limited in my reading material. Now, I’m finally moving beyond that. 2022 is going to be a year of really good, serious literature. Excellent characters, great plotlines, and amazing worldbuilding. Here are some of the books I will have the opportunity to read through next year:

1. *The Wandering Earth*, by Cixin Liu(流浪地球)
2. *1988: I Would Like to Speak With the World*, by Han Han(1988:我想和这个世界聊天)
3. *The Andromeda Strain*, by Michael Crichton(天外病菌)
4. *Animal Farm*, by George Orwell(动物庄园)
5. *The Secret Garden*, by Frances Burnett(秘密花园)
6. *Cat Country*, by Lao She(猫城记)
7. *Howl’s Moving Castle*, by Diana Jones(魔幻城堡)
8. *Never Let Me Go*, by Kazuo Ishiguro(别让我走)
9. The *Harry Potter* Series, by J.K. Rowling(哈利波特)
10. *Heroes Don’t Cry*, by Gu Long(英雄无泪)
11. *Ender’s Game*, by Orson Card(安德的游戏)
12. *The End of Eternity*, by Isaac Asimov(永恒的终结)
13. *Foundation*, by Isaac Asimov(基地)
14. *Rendezvous with Rama*, by Arthur Clarke(与拉玛相会)
15. *The Man in the High Castle*, by Philip Dick(高堡奇人)
16. *Dune,* by Frank Herbert(沙丘)
17. The *His Dark Materials* Series, by Philip Pullman(黑暗物质)
18. *The Mandate of Heaven*, by Qian Lifang(天意)

So yeah, to say I’m excited would be a massive understatement. I’M SO EXCITED Y’ALL. Most of these are books I’ve never read before, ever. I am chomping at the bit to read these. Of course, I won’t be able to read all of them in one year – each book will take me about a month to complete – but I’ll be able to get through a fair few!

**What are my goals for next year?**

In 2022, I want to:

1. Add another 10000 words to my vocabulary
2. Read at least 11 more novels cover to cover
3. Watch more movies in Chinese
4. Start incorporating Chinese into my everyday life. For example, when I look something up, I want to be looking it up in Chinese. And I don’t want it to be because I’m forcing myself to, but rather because it feels natural to. I want Chinese to start becoming a daily habit.

