One of the questions that comes up frequently as you study Japanese is: how many kanji are actually used in a given piece of Japanese?
To try to answer this, we analyzed the Wikipedia and Tatoeba corpora, both of which are freely available for download. We ran the full text of each corpus through our parser, first splitting it into sentences and then identifying the kanji and other characters within each sentence.
We found that the average sentence is 18 Japanese characters long, and that 5 of those 18 characters are kanji (the rest are kana or punctuation).
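To make the method concrete, here is a minimal Python sketch of the two steps described above: splitting text into sentences, then counting kanji within each sentence. This is not our actual parser. The function names, the choice of sentence-ending marks, and the Unicode ranges treated as kanji are all assumptions made for illustration.

```python
import re

# Assumption: a sentence is a run of text ending in one or more of these marks.
SENTENCE = re.compile(r"[^。！？!?]+[。！？!?]*")

# Assumption: "kanji" means CJK Unified Ideographs plus Extension A.
# The iteration mark 々 and rarer extension blocks are not counted.
KANJI = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def split_sentences(text):
    """Return the non-empty sentences found in text."""
    return [s.strip() for s in SENTENCE.findall(text) if s.strip()]


def sentence_stats(sentence):
    """Return (characters excluding whitespace, kanji count) for one sentence."""
    compact = "".join(sentence.split())  # drops ASCII and full-width spaces
    return len(compact), len(KANJI.findall(compact))


def corpus_averages(text):
    """Return (average sentence length, average kanji per sentence)."""
    stats = [sentence_stats(s) for s in split_sentences(text)]
    if not stats:
        return 0.0, 0.0
    total_chars = sum(length for length, _ in stats)
    total_kanji = sum(kanji for _, kanji in stats)
    return total_chars / len(stats), total_kanji / len(stats)
```

For example, corpus_averages("私は学生です。日本語を勉強します。") returns (8.5, 4.0): the two sentences are 7 and 10 characters long, punctuation included, and contain 3 and 5 kanji.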
In practice, other types of reading material will have different averages. For example, analysis of a handful of articles in the Nikkei suggests somewhat longer sentences and much higher kanji counts. Analysis of a few sentences in a contemporary novel suggests much longer sentences, but a similar ratio of kanji to other characters. Since a typical page in a Japanese paperback has space for 500 to 600 characters, a kanji share of about 5 in 18 implies you would encounter over 100 kanji per page on average.
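The per-page estimate can be checked with a few lines of Python. This sketch assumes the corpus averages (5 kanji per 18 characters) carry over to a novel, which the small novel sample only loosely supports. The names below are made up for illustration.

```python
# Corpus averages reported above.
AVG_SENTENCE_LEN = 18
AVG_KANJI_PER_SENTENCE = 5

# Share of characters that are kanji, about 0.28.
KANJI_SHARE = AVG_KANJI_PER_SENTENCE / AVG_SENTENCE_LEN


def kanji_per_page(chars_per_page, kanji_share=KANJI_SHARE):
    """Estimate the kanji on a page holding chars_per_page characters."""
    return round(chars_per_page * kanji_share)


low = kanji_per_page(500)   # 139
high = kanji_per_page(600)  # 167
```

Under that assumption, a full page works out to roughly 140 to 165 kanji, which fits the "over 100" figure and leaves room for pages that are only partly filled.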
We'll update this article as we gather more data on this topic.
