Chinese Language and Languages of Northern Europe: Discoveries and Researches of Common Origins of Chinese, Uralic and Indo-European Languages.

The title of Gao Jingyi's book, which is written in Chinese, translates as "Chinese Language and Languages of Northern Europe: Discoveries and Researches of Common Origins of Chinese, Uralic and Indo-European Languages" (Beijing 2008). The young Chinese linguist Gao has previously published three research papers in Estonia which can be seen as preludes to this book. The forewords to the book have been written by two professors of Chinese linguistics, Zhengzhang Shangfang and Feng Zheng. The book consists of eight chapters.

Chapter 1 (pp. 1-22) surveys the history of Chinese language affinity studies and introduces some fundamental concepts of language comparison. There have been five routes in Chinese language affinity studies. The four traditional routes are the Sino-Tibetan, the Sino-Austronesian, the Sino-Yeniseian and the Sino-Indo-European (principally Sino-Germanic). The only new route is the Sino-Uralic (substantially Sino-Finnic) route. The author emphasizes that one half of this monograph addresses Sino-Indo-European studies, while the other half treats Sino-Uralic studies.

The author gives an overview of the traditional routes, devoting more space to Sino-Germanic. The overview shows readers that the commonly advertised Sino-Tibetan hypothesis has always been criticized within academic circles. The author believes that in many respects the less explored Sino-Indo-European hypothesis is actually more credible than the Sino-Tibetan hypothesis.

Then the author moves on to explore the Sino-Uralic route. He concludes that, in addition to the sound change of Vowel Apocope, which is already introduced in the Sino-Germanic section, the sound change of Stem Vowel Metathesis is another very useful concept in Sino-Uralic studies. It means that converting a root of the Uralic type C₁V₁C₂V₂ to C₁V₁V₂C₂ yields the general root for the Chinese dialects. The author also points out that the Uralic stem vowel (V₂) is comparable to the rhyme group vowel in Old Chinese rhymed writings. The author suggests that all the roots in Primitive (oldest) Chinese and even in Old (older than 2000 years) Chinese could have been disyllabic, similar to the Uralic types. This suggestion is cited by Feng, the author of the second foreword, as a valuable new concept for solving disputes in Old Chinese phonology.
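The reordering behind Stem Vowel Metathesis can be sketched as a small string operation. This is only an illustration of the segment order described above, not the author's procedure: the function name, the vowel inventory and the placeholder root "tapi" are assumptions made for the example and are not taken from the book.

```python
def stem_vowel_metathesis(root: str, vowels: str = "aeiouyäöü") -> str:
    """Reorder a C1-V1-C2-V2 root into C1-V1-V2-C2.

    Assumption: the root has exactly four segments, one character each.
    The vowel inventory is a hypothetical default for this sketch.
    """
    if len(root) != 4:
        raise ValueError("expected exactly four segments (C1 V1 C2 V2)")
    c1, v1, c2, v2 = root
    matches_pattern = (
        c1 not in vowels
        and v1 in vowels
        and c2 not in vowels
        and v2 in vowels
    )
    if not matches_pattern:
        raise ValueError("root does not match the C1 V1 C2 V2 pattern")
    # Move the stem vowel (V2) in front of the second consonant (C2).
    return c1 + v1 + v2 + c2


# "tapi" is an invented placeholder root, not a word from any language
# discussed in the book.
reordered = stem_vowel_metathesis("tapi")
```

The sketch covers only the segment order; it says nothing about which vowel qualities or consonants actually correspond, which is the substance of the author's comparison.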

Chapter 2 (pp. 23-29) introduces the methods of DOM, or Chinese historical comparative etymology. The methods follow the tradition of applying Chinese characters to Chinese dialects and to the words of Chinese origin in Japanese, Korean and Vietnamese. DOM research is a reversed application of Chinese diachronic phonology. Chinese diachronic phonology works on the old pronunciations of the Chinese characters, whereas DOM research chains dialect pronunciations to the Chinese characters on the basis of those old pronunciations. Generally, a DOM equals a basic Chinese character (not one that has been simplified or wrongly used in the contemporary era) and is a Chinese etymological unit. DOM research is used to demonstrate common etymological units (regardless of whether they are cognates or loanwords) across languages or dialects. The Chinese characters contain attested proto-meanings, which are compared to the meanings of common etymological units in non-Chinese target languages or dialects. The Chinese characters also contain attested pronunciations, which in turn are compared to the pronunciations of common etymological units in non-Chinese target languages or dialects.

Notably, the proposed common etymological units (DOMs) must be re-examined by applying Semantic Chain Extension and Sound Factor Extension in order to rule out coincidences. The Semantic Chain Extension assumes that some other etymological units with related meanings are also detected in the same target languages or dialects. The Sound Factor Extension assumes that DOMs with the same Sound Factor have identical phonological structures in the target languages.

The author observes the attested pronunciations of the equivalents in European languages. In fact, the consistency is well preserved in these languages because they have not lost most of the non-initial consonants that the Chinese dialects have lost. The author proves that the Old Chinese pronunciations are not found in Chinese dialects but can be found in European dialects. The author believes that this is indeed a breakthrough, one that will first come as a positive shock to Chinese linguists with a sufficient knowledge of Chinese diachronic phonology. They can read European pronunciations directly, while fewer Western linguists manage to read long stories and equations expressed in Chinese characters.

The author indicates that the Sound Factor Extension proves that the proposed common etymological units are linked not only in pairs but also in groups of etymologically unrelated but phonologically related pairs. It achieves the same goals as the Sound Laws of the Comparative Method claim to achieve. The DOM Method is even more objective than the Comparative Method because one can always subjectively posit a semantic change or a sound change, yet never a graph change. The graphs are variables in equations. The variables are related to both proto-pronunciations and proto-meanings, but are not influenced by the contemporary ones. One can either chain a word to a graph or unchain it, and then review the consequences at once. It is a 1 or 0 mathematical process. If the consequences fit well together, the solution is simply confirmed as correct.

The author writes that he takes the earlier etymological approaches more seriously than the comparative reconstructions and nodes. This position is very prudent from my point of view. The author sees the target European languages and the Chinese dialects as equal language units. Consequently, the author must chain a word from an attested non-Chinese dialect or language to the DOM, just as the relevant approaches do when targeting Chinese dialects or languages. The author observes the target languages from the grass-roots level through all etymological units including loanwords, since the Chinese method is used to demonstrate common etymological units. The Chinese method disregards the definitions of loanwords. Here an interesting question arises: since the Chinese method disregards the question, and since there is no Proto-Chinese language reconstructed by the Western method, are the mutually unintelligible Chinese dialects or languages not genetically related? This really poses a wider question for all language affinity studies.

The author insists that the Chinese method is more effective and objective in exploring common etymological units. The author is more interested in fundamental etymological studies than in theoretical debates. No matter where further studies will lead us, the author is clearly the first person to have detected common etymological units along these routes. This point is also acknowledged by the authors of the forewords.

In any Chinese linguistic work, a given Chinese character is never just a simple word of one dialect but a general word, an etymological unit for all the dialect points and historical document points. There is no way Chinese linguists could give all the equivalent forms at once. At the same time, there is no case in which a personal, reconstructed "proto-form" can be used to replace the Chinese character.

Chapter 3 (pp. 30-88) contains descriptive studies of the European target languages in the manner of Chinese linguistics. The separate descriptions of Finnish, Estonian, Danish and Swedish are in fact the first short grammars of these languages written in the Chinese manner. The approach of Chinese linguistics is first to analyze the morphemes and then to express them with separate Chinese characters. Lexical and grammatical morphemes are treated equally.

Grammatical suffixes are analyzed as weakened equivalents of the full lexical forms; the same practice is attested in Chinese dialects. E.g., in Standard Mandarin, zi (stressed, with the third Mandarin tone) is a full lexical form, a noun meaning 'son', while -zi (unstressed, with the neutral tone, not occurring independently) is its weakened form, a nominal suffix for derivatives. Zi and -zi belong to the same DOM, and in Chinese characters they are written identically. In Standard Mandarin, liao (stressed, with the third Mandarin tone) is a full lexical form, a noun meaning 'infant', while -la or -le (unstressed, with the neutral tone, not occurring independently) is its weakened form, a verbal suffix for the perfect aspect. They are also written identically in Chinese, as they belong to the same DOM. Full lexical forms are considered primary and suffixes secondary. There are cases in which the full lexical form is not yet known, but even then a suffix may not be treated as an independent unit. Grammatical agreement is analyzed as a Cumbrous Repeat. E.g., the Estonian/Finnish phrase mina tean/minä tiedän 'I know' is analyzed and written in four DOMs altogether: (#1) I _ (#2) KNOW _ (#3) Genitive mark _ (#4) I. The repetition of the same semantic unit, here the first person, is called a Cumbrous Repeat. The author indicates that the repetition is acceptable in Chinese, but for appropriate Chinese usage one of the two occurrences has to be elided. Similarly, the Finnish phrase sinun talosi 'your house' is analyzed and written in five DOMs altogether: (#1) YOU _ (#2) Genitive mark _ (#3) HOUSE _ (#4) YOU _ (#5) Genitive mark. The repetition of #1 and #2 as #4 and #5 forms the Cumbrous Repeat. One of the two pairs has to be deleted in appropriate Chinese usage.
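The two analyses above can be restated as a small sequence operation. The sketch below is only an illustration: the function names are hypothetical, the DOM labels are plain strings copied from the review, and the choice to elide the later occurrence is an assumption, since the text only says that one occurrence (or one pair) has to go.

```python
from typing import List, Tuple


def find_cumbrous_repeat(units: List[str]) -> Tuple[int, int, int]:
    """Locate the longest contiguous run of units that occurs twice.

    Returns (first_start, second_start, length); length is 0 when no
    unit is repeated. The two runs are not allowed to overlap.
    """
    n = len(units)
    for length in range(n // 2, 0, -1):          # prefer longer runs
        for i in range(n - 2 * length + 1):
            for j in range(i + length, n - length + 1):
                if units[i:i + length] == units[j:j + length]:
                    return i, j, length
    return 0, 0, 0


def elide_repeat(units: List[str]) -> List[str]:
    """Drop the later occurrence of the repeated run.

    Assumption: the review does not say which occurrence is elided;
    removing the later one is an arbitrary choice for this sketch.
    """
    _, second_start, length = find_cumbrous_repeat(units)
    return units[:second_start] + units[second_start + length:]


# DOM sequences as given in the review for 'I know' and 'your house'.
i_know = ["I", "KNOW", "Genitive mark", "I"]
your_house = ["YOU", "Genitive mark", "HOUSE", "YOU", "Genitive mark"]

i_know_elided = elide_repeat(i_know)
your_house_elided = elide_repeat(your_house)
```

Searching longer runs first is what lets the second phrase be treated as one repeated pair rather than two separate single-unit repeats.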

Some differences in pronunciations are not represented in Chinese characters, e.g., the vowel apocope in Estonian, the gradation in Estonian and Finnish, the ablaut in Danish and Swedish. These are classified as sound changes within the rooted morphemes. Meanwhile, some Chinese characters for suffixes are not pronounced independently, e.g., the partitive designation in Estonian and Finnish. These are classified as sound changes within the suffixed morphemes. Sound changes within morphemes do not cause any practical confusion. We can see, e.g., that some grammatical cases in Estonian and Finnish can identically be written down in Chinese characters while corresponding to common morphemes.

The whole system reflects the typological cycle of languages suggested earlier by Western linguistics: analytic > agglutinative > inflecting > analytic. From the author's point of view, the analytic system of written Chinese appears as the origin, and the agglutinative system of the Finnic languages follows downstream of it.

Chapter 4 (pp. 89-100) presents a lexical comparison of Morris Swadesh's 100 lexical items in Chinese, Estonian, Finnish, Danish, Swedish, English and German. The exact common ratios (Swadesh items with exactly the same etymological units) between the European languages and Chinese are respectively 35/96, 33/96, 24/96, 23/96, 20/96 and 20/96. Four Swadesh items, #3 'we', #4 'this', #5 'that' and #6 'who', are judged not comparable, since these words are not mono-morphemic.
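For readers who prefer percentages, the ratios above can be converted with a few lines of arithmetic. The counts are those reported in the review, and the denominator of 96 follows from the 100 Swadesh items minus the four excluded ones. The variable and function names are only illustrative.

```python
TOTAL_ITEMS = 100        # Swadesh's 100-item list
NOT_COMPARABLE = 4       # items #3, #4, #5, #6 are excluded
COMPARABLE = TOTAL_ITEMS - NOT_COMPARABLE

# Items sharing exactly the same etymological unit with Chinese,
# in the order the review lists the languages.
shared_with_chinese = {
    "Estonian": 35,
    "Finnish": 33,
    "Danish": 24,
    "Swedish": 23,
    "English": 20,
    "German": 20,
}


def shared_percentage(shared: int, comparable: int = COMPARABLE) -> float:
    """Share of comparable items, as a percentage rounded to one decimal."""
    return round(100 * shared / comparable, 1)


percentages = {
    language: shared_percentage(count)
    for language, count in shared_with_chinese.items()
}

for language, pct in percentages.items():
    print(f"{language}: {shared_with_chinese[language]}/{COMPARABLE} = {pct}%")
```

On these figures the Finnic languages come out at roughly a third of the comparable items and the Germanic languages at a fifth to a quarter.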

Notably, no points are given for non-primary occurrences of the common etymological units. E.g., the etymological unit of Estonian/Finnish koer/koira 'dog' exists in Chinese (e.g. Mandarin gou), but the point goes to Danish, Swedish and German because in written Chinese the primary etymological unit for 'dog' is that of the Germanic hund/Hund 'dog' (cf. Mandarin quan). The Finnic 'dog' is a synonym of it in Chinese.

The author summarizes in a mildly humorous vein on page 100: the Chinese words for feather, wing and peck are found in Finnic languages but are used there for human hair, human arm and human lip instead. This recalls an old Chinese myth that the forefather of a dynasty was the son of a woman and a bird.

Chapter 5 (pp. 101-108) summarizes evidence from neighbouring disciplines that supports the affinities between the Chinese nation and the nations of the target European languages. Clues from molecular biology point to a paternal lineage, N-M231, that is characteristically shared by most Uralic nations and the Chinese nation. The author suggests that the Chinese and Uralic nations are rooted in the paternal lineage N-M231, so that the Chinese and most Uralic nations share a recent common history of paternal lineages. The author also mentions that the Chinese and Finnic nations are very different in their maternal lineages. The structure of the maternal lineages of the Finnic nations is shared with the other Europeans. In conclusion, the author suggests that the Finnic nations were established by male migrants from the Far East together with female aborigines of Europe. The languages are inherited from the paternal lineage, while the anthropological appearance is inherited from the maternal lineage. The author recalls that similar settlement scenarios for the Finnic nations have been suggested by Estonian geneticists.

Chapter 6 (pp. 109-116) consists of final discussions, summaries and suggestions.

The author demonstrates and claims that, of any synonymous pair of etymological units in the Chinese language, one belongs to a Sino-Uralic corpus, while the other belongs to a Sino-Indo-European corpus. Where there is a third synonym, its etymology is expected to be solved in future language affinity studies between Chinese and some other languages.

The author summarizes three major origins of the Chinese language as three layers. The root layer is Sino-Uralic; it results in the Sino-Uralic corpus of common etymological units. The second layer, possibly loaded since the Chalcolithic Age, is Indo-European; it results in the Sino-Indo-European corpus of common etymological units. The third layer, loaded since the Bronze Age, could be Yeniseian; it could result in a Sino-Yeniseian corpus of common etymological units. The author remarks that Primitive Chinese and Finnic could separately have received loanwords from Germanic languages, resulting in a corpus of common Germanic loanwords in Chinese and the Finnic languages. The author observes that a few etymological units are natively present in both the Sino-Uralic corpus and the Sino-Indo-European corpus. These could be seen as traces of a remote common origin.

In the final suggestions, the author calls for attention to other Indo-European language groups, first of all to Tocharian and Baltic. It is important to find out whether there is a potential inventory of the Sino-Indo-European corpus that is not present in Germanic. Similarly, the author suggests further etymological studies between Chinese and other Uralic languages, first of all Saamic, Samoyedic and Ob-Ugric. It is important to know whether there is a potential inventory of the Sino-Uralic corpus that is not present in Finnic.

Chapter 7 (pp. 119-126) consists of mapping charts of non-initial consonants between Old Chinese and target European languages.

Chapter 8 (pp. 127-241) is an extensive DOM list of equivalents between Chinese and the target European languages. A statistical summary is given at the end. In the DOM list, there are 581 common etymological units between Chinese and Estonian, of which 202 have been defined as loanwords into Estonian by Western linguists. There are 603 common etymological units between Chinese and Finnish, of which 203 have been defined as loanwords into Finnish by Western linguists. There are 691 common etymological units between Chinese and Danish, of which 167 have been defined as loanwords into Danish by Western linguists. There are 685 common etymological units between Chinese and Swedish, of which 158 have been defined as loanwords into Swedish by Western linguists. There are 363 common etymological units in the Sino-Uralic corpus, of which 122 are found only in Finnic, 50 in Proto-Finno-Volgaic, 28 in Proto-Finno-Permic, 91 in Proto-Finno-Ugric and 72 in Proto-Uralic. There are about 750 common etymological units in the Sino-Indo-European corpus, of which 29 are found only in Scandinavian languages, 458 in Proto-Germanic and 482 in Proto-Indo-European (there are some overlaps with Proto-Germanic).
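The figures in this summary can be cross-checked with simple arithmetic. The sketch below uses only the counts quoted above. The loanword percentages and the overlap estimate are derived here and are not reported in the book, and the overlap is approximate because the Sino-Indo-European total is given only as "about 750".

```python
# (common etymological units with Chinese, of which defined as loanwords)
dom_counts = {
    "Estonian": (581, 202),
    "Finnish": (603, 203),
    "Danish": (691, 167),
    "Swedish": (685, 158),
}

# Share of the common units that Western linguists classify as loanwords.
loanword_share = {
    language: round(100 * loans / total, 1)
    for language, (total, loans) in dom_counts.items()
}

# Sino-Uralic corpus: the five sub-counts should add up to the total.
sino_uralic_total = 363
sino_uralic_layers = {
    "Finnic only": 122,
    "Proto-Finno-Volgaic": 50,
    "Proto-Finno-Permic": 28,
    "Proto-Finno-Ugric": 91,
    "Proto-Uralic": 72,
}
sino_uralic_sum = sum(sino_uralic_layers.values())

# Sino-Indo-European corpus: the sub-counts exceed the total because some
# units are counted under both Proto-Germanic and Proto-Indo-European.
sino_ie_total_approx = 750   # "about 750" in the review
sino_ie_layers = {
    "Scandinavian only": 29,
    "Proto-Germanic": 458,
    "Proto-Indo-European": 482,
}
implied_overlap_approx = sum(sino_ie_layers.values()) - sino_ie_total_approx

print(loanword_share)
print(sino_uralic_sum == sino_uralic_total)
print(implied_overlap_approx)
```

The Sino-Uralic sub-counts add up exactly to 363, which suggests those five categories are mutually exclusive. The Sino-Indo-European sub-counts exceed the stated total by roughly 219, consistent with the overlap the review mentions.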

At the end of the book, there are appendixes (pp. 242-264), a bibliography (pp. 265-281) and the author's note (pp. 282-283), which includes acknowledgements to those who in various ways helped the author during the preparation of this book.

Gao's book is of great interest from the viewpoint of its scientific approach and its bold scientific innovations. Hopefully, the findings of his research in the field under discussion can also be applied to languages that are more familiar to us.


This publication is supported by the Estonian Science Foundation, grant No. 7724.


Ago Künnap

University of Tartu



Gao, J. 2004, Finnic-Sinic Comparison.--FU 26, 54-109.

--2005a, Comparison of Finnic Sinic Distinctive Suprasegmental Features.--FU 27, 56-68.

--2005b, Comparison of Swadesh 100 Words in Finnic, Hungarian, Sinic and Tibetan: Introduction to Finno-Sinic Languages, Tallinn.
