Kanji Core Meanings, Software Lexicons

by Lynne E. Riggs

Launching “SWET on Saturdays,” Jack Halpern told how one student of Japanese wound up creating a Japanese-English character dictionary and massive Chinese, Japanese, and Korean lexical databases now used to aid Internet search engines.

The story begins with an encounter with a Japanese language textbook on an Israeli kibbutz. How one man’s fascination with kanji led to sixteen years of toil to produce the New Japanese-English Character Dictionary (Kenkyusha, 1990), and then to creating lexicons for computers, was far too much material for a two-hour SWET presentation on a Saturday afternoon. But on February 22 we made a start: Jack Halpern explained that his dictionary is aimed at mastering the meaning of kanji and discussed how his years of study of the core meanings of Chinese characters led him to work in the highly specialized area of lexicography for databases and Internet search engines.

Even the busiest translator who dares not dally long in pursuit of kanji readings and meanings would find it difficult to resist the delight Halpern takes in exploring the meanings of characters that form the shared orthography of Chinese, Japanese, and Korean (the CJK sphere). Polyglot Halpern speaks or reads twelve languages and is working on his thirteenth (Arabic), but Japanese is his all-absorbing favorite. After completing the dictionary, which has been published in a variety of editions, he founded the CJK Dictionary Institute, Inc., and now presides over a database of some six million entries.

The challenge of learning kanji inspired Halpern’s twenty-eight-year career as a CJK lexicographer, and he is well versed in the fine lines that distinguish the roughly two hundred readings of the character 生 and the characters used for such homophones as kaeru (帰る, 変える, 替える, 換える, 代える, etc.). He is also a student of the formation of such compounds as keizai (from the phrase keikoku saimin 経国済民 “administering the country and relieving the sufferings of the people”), which testify to the often arduous history of words coined for concepts and inventions arising during the Meiji era (1868–1912). He noted that those efforts went far; many Japanese-coined compounds were exported to China, where they are still in use.

If you think that kanji-English dictionaries exist only for finding words, Halpern’s stout defense of a dictionary focusing on “understanding the components of meaning” is worth lingering over. The New Japanese-English Character Dictionary is a monument to the appreciation of kanji and in-depth character meanings. Unlike Nelson’s Modern Reader’s Japanese-English Character Dictionary, Halpern stresses, his dictionary is meant not to “look up words” but to make learning kanji easier and more logical. Once you understand the core meaning of a character and its elements, he believes, you will find that kanji are easy to learn. How the dictionary works and what it offers, including the SKIP method of locating characters, can best be understood by examining the dictionary itself.

Halpern touched on the work of Professor Hashimoto Mantaro, who conducted a government-funded study of Japanese and a number of other languages and arrived at the perhaps surprising conclusion that “Japanese is easier to read than any other language.” Asked what they thought of this notion, a number in the audience agreed. Halpern explained that the key lies in the heterogeneity of the script, combining hiragana, katakana, and kanji. Studies by Kaiho Hiroyuki at Tsukuba University show how kanji are processed by the brain and demonstrate why Japanese can be called “easy to read.”

Halpern noted that the complexity of kanji can be an aid to learning. Research on early education has shown that children learn even highly complex kanji more quickly than hiragana, many of which look somewhat similar to one another. As an example, Halpern used an eighty-four-stroke character, the name of a Buddhist prelate consisting of a tiny constellation of the characters for “cloud” and “dragon,” and said that we would probably not forget it, even though we had seen it only once.

Japanese texts using a traditional mix of kanji, katakana, and hiragana may be easy to read, but the recent proliferation of katakana words, said Halpern, has made Japanese harder to read, particularly in much-maligned software and other technical manuals. He decried the faddish use of katakana words when perfectly good Japanese words and kanji exist, for example, erubō エルボー for “elbow” instead of hiji 肘 and shorudā ショルダー for “shoulder” instead of kata 肩.

In the final half hour of the meeting, Halpern discussed the work of the CJK Dictionary Institute (Nitchūkan Jiten Kenkyūjo), the name of which is written in characters still common to the three languages, despite many changes in orthography in China, Japan, and Korea. At the institute, he and his colleagues are creating custom databases of words and characters for computer systems and language-processing systems. Among the institute’s clients is Babylon, a provider of online translation tools. CJKI also serves as a consultant to Google, whose Chinese and Japanese search engines are powered by CJKI’s large-scale databases. It also provides databases designed to enhance the internal dictionaries of computational linguistics tools, such as machine translation software and morphological analyzers.

Halpern recently presented his work in China, where he found very receptive audiences on a lecture tour of several universities. He is also involved in work on “lexicon-based orthographic disambiguation,” which seeks to enable search engines to recognize the variety of ways in which the same sentence can be written in Japanese. He showed a chart of twenty-four ways to write the phrase “Kin no tamago o umu niwatori” (the chicken that lays golden eggs) using combinations of syllabic characters and kanji. His current challenge is to develop an orthographic processing system that enables search engines to find what is needed whatever orthography is used.
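
For readers curious about what a lexicon-based approach to this problem might look like in practice, here is a minimal sketch in Python. It is not Halpern's system or CJKI's data. The variant lists, the names VARIANTS, TEMPLATE, spellings, build_index, and normalize are all assumptions made up for illustration, and the number of spellings this toy lexicon produces is not meant to match the chart shown at the talk. The idea is simply that if a lexicon records the alternative spellings of each word, every surface form of a phrase can be mapped back to one canonical form before searching.

```python
from itertools import product

# Hypothetical toy lexicon: each canonical word maps to spellings a writer
# might plausibly use. These lists are illustrative assumptions only.
VARIANTS = {
    "金": ["金", "きん"],
    "卵": ["卵", "玉子", "たまご"],
    "産む": ["産む", "生む", "うむ"],
    "鶏": ["鶏", "にわとり", "ニワトリ"],
}

# The example phrase as canonical words interleaved with particles.
TEMPLATE = ["金", "の", "卵", "を", "産む", "鶏"]


def spellings(template, variants):
    """Yield every surface spelling of the template phrase."""
    # Tokens without an entry in the lexicon (the particles) have one spelling.
    slots = [variants.get(token, [token]) for token in template]
    for combo in product(*slots):
        yield "".join(combo)


def build_index(template, variants):
    """Map each surface spelling to the canonical spelling of the phrase."""
    canonical = "".join(template)
    return {surface: canonical for surface in spellings(template, variants)}


INDEX = build_index(TEMPLATE, VARIANTS)


def normalize(query):
    """Return the canonical form of a known spelling, else the query itself."""
    return INDEX.get(query, query)


if __name__ == "__main__":
    print(len(INDEX), "spellings indexed")
    print(normalize("きんのたまごをうむにわとり"))
```

A search engine built along these lines would normalize both the indexed text and the user's query, so that a query typed entirely in hiragana could still match a page written with kanji. A real system would also have to segment running text into words first, which this sketch sidesteps by starting from a phrase that is already split.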

Publications:

New Japanese-English Character Dictionary (Kenkyusha, 1990).
New Japanese-English Character Dictionary, eBook edition (Nichigai Associates, 1995).
The Kodansha Kanji Learner’s Dictionary (Kodansha, 1999).
Dictionary of Unified CJK Characters (forthcoming, Toho Book Publishing Co., Ltd.).

For more information on the dictionaries, see the Kanji Dictionary Publishing Society (Kanji Jiten Kankōkai) website.

More information on the CJK Dictionary Institute is on their website.

Jack Halpern’s personal website is here.