VOTIC AND INGRIAN CORE LEXICON IN THE FINNIC CONTEXT: SWADESH LISTS OF FIVE RELATED VARIETIES.
The core idea of lexicostatistics is the analysis and comparison of wordlists compiled from the most stable part of the lexicon. The compilation of such lists is not a trivial task that can be solved through simple searching of translation equivalents in a dictionary. Synonymy, dialectal variation, and other factors significantly influence the composition of the list and correspondingly the final result of the research.
In this article, we present the 111-word Swadesh lists for five Finnic idioms. The core of the research consists of the wordlists for three minor varieties: a dialect of the Votic language and two dialects of the Ingrian language. These are analysed and compared with the wordlists for two major Finnic languages: standard Estonian and standard Finnish.
The research has the following goals:
(1) to compile wordlists of five Finnic varieties applying the same methodology;
(2) to analyse and compare the materials from the minor varieties with the data from the major languages;
(3) to provide comments on particular words in order to make the content of the lists and the differences between the analysed varieties more transparent;
(4) to draw a lexicostatistic picture of the minor varieties in the context of the major Finnic languages;
(5) to make some other preliminary observations based on the compiled wordlists.
The existing lexicostatistical research on the Uralic languages rarely uses explicit Swadesh lists. In most cases the compiled list is not accessible to a reader (see, for example, Taagepera 1994; Syrjanen, Honkola, Korhonen, Lehtinen, Vesakoski, Wahlberg 2013 (1)). The paper Hofirkova, Blazek 2012 is an exception as it gives the wordlists for many languages including Finnish, Estonian and Votic. However, the method of compilation of a wordlist as well as sources of the data are not always transparent there. For example, the list of sources does not contain any dictionary of the Votic language, which leads us to conclude that secondary sources (such as etymological dictionaries) were used to obtain data. Many flaws both in transcription (2) and the choice of words (3) reinforce this impression. Another piece of recent research that uses Swadesh lists is Tillinger 2014. Tillinger analyses Saami languages and, among other things, gives the Swadesh lists of several European languages including Finnish and Estonian. In the Appendix, we comment on the differences between Tillinger's and our Swadesh lists.
In the current article we present both the explicit wordlists and the transparent methodology of their compilation.
The article consists of four sections. Section 1 provides the basic information: (a) the main facts about the Votic and the Ingrian languages, (b) a description of data and methods of the research, (c) transcription conventions. Section 2 presents the annotated wordlists. In Section 3 (Discussion), we formulate our preliminary observations of the wordlists. Section 4 (Conclusions) contains a short summary of the results.
Votic and Ingrian are minor Finnic languages on the verge of extinction. Votic belongs to the southern branch of the Finnic languages and is the closest relative of Estonian; Ingrian belongs to the northern branch and is the closest relative of Finnish and Karelian.
The last generation of Votic and Ingrian fluent speakers was born in the early 1930s. Their deportation to Finland during the Second World War, a ban on living in their native settlements after the war and the negative attitude of Russian people towards speakers of minority languages led to the rapid extinction of both Votic and Ingrian (see more details in [phrase omitted] 2013). At the moment, the most optimistic calculations give no more than five Votic and twenty Ingrian speakers (representing two dialects).
Most Votic dialects are extinct. Krevin--the language of the Votic population relocated to Latvia in the 15th century--died out by the middle of the 19th century. The last speaker of Eastern Votic died in 1976 (Ernits 2005 : 87), and the last recordings of Central Votic were made in the 1970s. Most probably there are no fluent speakers of the mixed Votic-Ingrian Kukkuzi variety (Suhonen 1985; Markus, Rozhanskiy 2012), though there were a few in the mid-2000s. The last speakers of Votic represent the Western dialect (Vaipoli Votic), which shows some contact-induced Ingrian influence (Rozhanskiy, Markus 2015).
The last speakers of Ingrian represent the Soikkola and Lower Luga dialects. Two other traditionally distinguished Ingrian dialects are already extinct. Oredezi Ingrian died out in the second half of the 20th century (Laanest ([phrase omitted] 1993 : 62) considered it already moribund in the 1960s); Hevaha Ingrian became extinct around the turn of the millennium.
In the current research, we use Votic data from the Western dialect and Ingrian data from both the Soikkola and Lower Luga dialects. The analysis of two Ingrian dialects is not redundant but is one of the key goals of the research. According to the hypothesis formulated in Rozhanskiy, Markus 2014, the Lower Luga dialect (traditionally described as the most specific Ingrian dialect (4)) is in fact a very specific convergent variety based on Ingrian and Votic but also influenced by Ingrian Finnish and Estonian. Many Votes shifted completely to this variety and changed their identity.
The contact situation for the analysed minor languages can be briefly described as follows. Votic has had intensive contact with Russian during the last millennium. The Western Votic variety discussed in this article was influenced by Ingrian (in fact, all Votic villages in the Lower Luga area had a mixed Votic-Ingrian population in the 20th century, and there were plenty of mixed Votic-Ingrian families). Central Votic was in close contact with Ingrian Finnish; however, due to the difference in religion, mixed marriages were not typical.
Ingrians also had contact with the Russian population but it seems that Soikkola Ingrian was less influenced by Russian than Votic. Also, there are no evident traces of Votic or some other Finnic influence on Soikkola Ingrian.
On the contrary, Lower Luga Ingrian had intensive contact not only with Russian but also with Ingrian Finnish, Votic, and (in the southern part of the area) with Estonian.
As most of the Finnic population from Western Ingria was deported to Finland during the Second World War, most of the speakers had some experience in the Finnish language.
1.2. Data and methods
The last decades have witnessed a renewal of interest in lexicostatistics and glottochronology. Different scholars use different mathematical algorithms: some work with the classical Swadesh method or its modified versions (Hofirkova, Blazek 2012), some use methods borrowed from evolutionary biology, such as maximum parsimony or Bayesian phylogenetic inference (Chang, Cathcart, Hall, Garrett 2015; Honkola 2016). All these studies have one thing in common: they use lists of basic lexemes with fixed meanings. Different authors use different wordlists, for example, the original Swadesh 200-word list (Swadesh 1952), the Swadesh 100-word list (Swadesh 1971) and various modifications thereof (Kassian, Starostin, Dybo, Chernov 2010), the ASJP 40-word list (Holman, Wichmann, Brown, Velupillai, Muller, Bakker 2008), the Leipzig-Jakarta list (Tadmor 2009), and others. A useful catalogue of such lists can be found on the Concepticon site (http://concepticon.clld.org).
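As a rough illustration of the classical count that underlies Swadesh-style lexicostatistics, the following Python sketch computes the share of list items for which two varieties use etymologically related words. The cognate codes are an assumption of this sketch: they are read off the etymological comments to items 1-6 of the wordlists in Section 2 (for example, Fin. kaarna vs the reflexes of *koori for 'bark'), and the function name is hypothetical. Six items are far too few to say anything about the actual distances between the varieties.

```python
from itertools import combinations

VARIETIES = ["Est", "Vot", "L-L", "Soi", "Fin"]

# Illustrative cognate codes, one letter per variety in the order above.
# Same letter = same etymon according to the comments on items 1-6.
COGNATE_CODES = {
    "all":   "AAAAA",
    "ashes": "AAAAA",
    "bark":  "AAAAB",  # Fin. kaarna vs reflexes of *koori
    "belly": "BAAAA",  # Est. koht (*koktu) vs reflexes of *vacca
    "big":   "AAAAB",  # Fin. iso vs reflexes of *suuri
    "bird":  "AAAAA",
}

def shared_share(a, b, codes=COGNATE_CODES):
    """Return the fraction of items on which varieties a and b share a cognate."""
    i, j = VARIETIES.index(a), VARIETIES.index(b)
    same = sum(1 for row in codes.values() if row[i] == row[j])
    return same / len(codes)

# Print the pairwise shares for all ten pairs of varieties.
for a, b in combinations(VARIETIES, 2):
    print(f"{a:>3} ~ {b:<3} {shared_share(a, b):.2f}")
```

On this toy subset Estonian and Finnish share three items out of six, which mainly shows how strongly the result depends on which word is selected for each meaning.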
We are convinced that the key problem with lexicostatistics lies not so much in the mathematics as in the lexicography. Whatever algorithm we choose to apply, if our initial data are not sufficiently accurate, the well-known maxim "garbage in, garbage out" will aptly describe the result. The core problem in compiling the wordlists is synonymy. For every meaning on the list in every language included in the comparison we must use the most neutral basic word representing this meaning. However, the standard meanings in the various lists of basic vocabulary are usually represented by English words (or words of some other natural language). It is clear that English words need not have one-to-one semantic equivalents in other human languages. For example, English 'hand' may be translated into Russian as '[phrase omitted]' or '[phrase omitted]', depending on the context. So, the compilation of a reliable basic vocabulary list requires a semantic specification of items on this list. Such a specification can be done in several ways:
(1) using more than one language for the list of basic meanings counting on more specific meanings in at least one of the languages;
(2) giving additional comments and explanations to narrow the meaning of the basic word;
(3) choosing a specific context to narrow down the meaning of an item on the list.
We chose the Swadesh wordlist and its particular modification because it is one of a few, if not the only, basic lexicon list for which a detailed semantic specification is available (Kassian, Starostin, Dybo, Chernov 2010). This standard was already used in hundreds of wordlists compiled for the Global Lexicostatistical Database project (GLD 2018), as well as in publications not affiliated with this project (Gruntov, Mazo 2015).
The selected standard allows us to use all the mentioned methods of resolving the choice between synonymic variants. First, the list of basic words is given in both English and Russian. Second, the comments specifying the meaning of the basic words are given. Third, every basic word has several contexts that narrow down the meaning. Thus, it becomes possible with minimal exceptions to choose the most neutral word that is not too general or too specific, is not stylistically marked, and is not too bookish or too colloquial.
In this article, we present the 111-word modified Swadesh lists for five Finnic idioms, compiled on the basis of the following methodology. The compilation of lists had two stages. During the first (preliminary) stage, the lists were compiled with the help of dictionaries (5) and/or the authors' competence. During this stage, some items had several variants in case there were no evident reasons to select the most suitable word. During the second stage, the lists were checked with the help of native speakers (see Acknowledgements section) and (for the minor languages) corpora of elicitations and narratives were also used. (6) The native speakers annotated the meaning and usage of words and translated the sentences with the contexts. The final decision of which word should be added to the list was made exclusively by F. Rozhanskiy.
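The two-stage procedure can be pictured as simple bookkeeping: stage one stores the candidate words for a meaning, stage two stores which candidates a consultant accepts in each test context. The Python sketch below is only an illustration. The class and field names are hypothetical, the context labels and judgments are placeholders rather than recorded consultant data, and ranking candidates by the number of accepting contexts is an assumption of the sketch; in the study itself the final choice was an expert decision, not an automatic count.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    """One meaning on the list, with its candidates and context judgments."""
    concept: str
    candidates: list                              # stage 1: words found in dictionaries
    accepted: dict = field(default_factory=dict)  # stage 2: context -> accepted words

    def record(self, context, words):
        """Store the words a consultant accepted in one test context."""
        self.accepted[context] = set(words)

    def ranking(self):
        """Return (word, number of accepting contexts), best candidate first."""
        counts = [
            (w, sum(1 for ws in self.accepted.values() if w in ws))
            for w in self.candidates
        ]
        # sorted() is stable, so ties keep the stage-1 order.
        return sorted(counts, key=lambda pair: -pair[1])

# Hypothetical judgments for Finnish 'earth'; the context labels are placeholders.
item = Item("earth", ["maa", "multa"])
item.record("context_1", ["maa"])
item.record("context_2", ["maa"])
item.record("context_3", ["multa"])
print(item.ranking())
```

Keeping the per-context judgments alongside the chosen word makes it possible to document cases where one test sentence prefers a different word from the one finally selected.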
Etymological comments are based on standard etymological dictionaries (SSA; EES; UEW; LAGLOS) and other sources on Finnic and Uralic etymology. The final decisions on etymology were made exclusively by M. Zhivlov.
1.3. Transcription conventions
Two of the analysed languages, Estonian and Finnish, have a literary tradition; Ingrian had a literary tradition only for a short period in the 1930s; and Votic has always been an unwritten language. (7) In this paper, we use the following transcription conventions.
For Estonian and Finnish, the standard orthography is used.
Our Votic transcription is similar to the one used by Tsvetkov (1995) but has some minor differences. First, we use j instead of the traditional Finnic i as the second part of diphthongs, e.g. kejg 'all' not keig (see the discussion in [phrase omitted], [phrase omitted] 2017 : 351-352). Second, the final reduced vowels are spelled as [??] (back vowel) and [??] (front vowel) instead of [??] and E respectively, e.g. rint[??] 'breast', tsulm[??] 'cold'. The long vowels are transcribed with double letters for comparability with other languages.
The Soikkola Ingrian transcription is close to Nirvi 1971 but the short geminates are transcribed with double letters and a breve in all phonetic contexts, e.g. valkkia 'white', not valkia. The long mid-high vowels of the first syllable (that in some idiolects merged with the long high vowels uu, uu, ii) are marked with a circumflex accent below: koori 'bark', oo 'night', seemen 'seed'. The sibilant fricatives are s and z (instead of s and z), e.g. suur 'big', meez 'man'.
There are no authoritative sources for the transcription of Lower Luga Ingrian, which also exhibits very significant phonetic variation between different varieties. We represent long mid vowels in the first syllable as diphthongs, e.g. kuorI 'bark', uo 'night', siemen 'seed', although their diphthongization is usually much weaker than in Finnish. The final reduced vowels that can be realized as short, voiceless, or dropped are marked with small capital letters: savvU 'smoke' [savvu ~ savvu ~ savv], hantA 'tail' [hanta ~ hanta ~ hant].
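The convention for final reduced vowels can be made concrete with a small Python sketch that expands a form such as savvU into its three possible realizations. Two things are assumptions of this sketch: plain capital letters stand in for the small capitals, and the voiceless realization is marked with a combining ring below (U+0325). The function name is hypothetical.

```python
VOICELESS = "\u0325"  # combining ring below; assumed marker of a voiceless vowel

def realizations(form):
    """Expand a form whose final capital letter marks a reduced vowel.

    Returns the short, voiceless and dropped realizations, in that order.
    Forms without a final capital are returned unchanged.
    """
    last = form[-1]
    if not last.isupper():
        return [form]
    stem, vowel = form[:-1], last.lower()
    return [stem + vowel, stem + vowel + VOICELESS, stem]

print(" ~ ".join(realizations("savvU")))
print(" ~ ".join(realizations("hantA")))
```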
Proto-Finnic and Proto-Uralic reconstructions are written in a system based on the UPA. In this system affricates are written as single symbols.
2. The wordlists
1. all [все] Est. koik Vot. kejg L-L. kai Soi. kaig Fin. kaikki
This word exists in most Finnic languages. It goes back to Proto-Finnic *kaikki 'all', possibly of Baltic origin (cf. SSA 1 : 275; EES 199).
2. ashes [[phrase omitted]] Est. tuhk Vot. tuhk[??] L-L. tuhkA Soi. tuhka Fin. tuhka
This word exists in most Finnic languages. It goes back to Proto-Finnic *tuhka 'ashes', borrowed from Germanic (cf. SSA 3 : 319; LAGLOS III : 307).
3. bark [[phrase omitted]] Est. koor Vot. koori L-L. kuorI Soi. koori Fin. kaarna
Proto-Finnic *koori 'bark' goes back to Proto-Uralic *kari 'surface, crust, skin, bark' (Aikio 2015 : 52), which is certainly not the main word for 'bark' in Proto-Uralic (the meaning 'bark' is represented only in Finnic). Fin. kaarna exists in some Finnic languages and possibly has a Baltic origin (cf. SKES 1987 : 135; SSA 1 : 265-266). The word kuori also exists in Finnish but has a more general meaning, and the word kaarna looks more natural in the test contexts. The word kaarna is known in Ingrian with the meaning 'cork; fir or pine bark' (Nirvi 1971 : 148).
4. belly [[phrase omitted]] Est. koht Vot. vatts[??] L-L. vatsA Soi. vatsa Fin. vatsa
Votic, Ingrian and Finnish preserve Proto-Finnic *vacca 'belly'. Pace Redei (UEW 547), this word has no acceptable etymology: the proposed Mansi cognate has irregular vocalism and is restricted to North Mansi. Estonian koht goes back to Proto-Finnic *koktu 'belly' (cf. EES 199); vats exists as a dialectal variant. In colloquial Finnish, the word maha looks more natural in the test sentences. The difference between *vacca and *koktu in Proto-Finnic may have been that of '(external) belly' vs '(internal) belly/stomach'.
5. big, large [[phrase omitted]] Est. suur Vot. suur(i) L-L. suur Soi. suur Fin. iso
This word exists in most Finnic languages and goes back to Proto-Finnic *suuri 'big', borrowed from Germanic (cf. SSA 3 : 224-225; EES 491; LAGLOS III 253-254). In Finnish, the semantically close word suuri also exists. However, our Finnish consultant considered iso to be the main word. The Finnish word may be an archaism, replaced in other Finnic languages by a Germanic loanword. Proto-Finnic *iso 'big', derived from *isa 'father', has a striking parallel in Moksha ocu 'big', derived from oca 'paternal uncle' (cf. SSA 1 : 228; UEW 78), cf. also Finnish eno '(maternal) uncle' and enemman 'more'.
6. bird [[phrase omitted]] Est. lind Vot. lintu L-L. lintU Soi. lindu Fin. lintu
Proto-Finnic *lintu 'bird' is either an isolated word or an irregular reflex of Proto-Uralic *lunta 'bird, goose' (cf. SSA 2 : 80; EES 242; UEW 254). Livonian and dialectal Estonian data show that Proto-Finnic *lintu was originally polysemous 'bird, flying insect, wild animal'. The polysemous word 'bird / wild animal' is found also in Samoyed and Ob-Ugric, although Finnic, Ob-Ugric, and Samoyed words with these meanings are not related.
7. to bite Est. hammustada [hammustama] Vot. purr[??] L-L. purrA [[phrase omitted]] Soi. purra Fin. purra
Proto-Finnic *pure- 'to bite' goes back to Proto-Uralic *puri- 'to gnaw, bite' (cf. SSA 2 : 438; EES 393; UEW 405-406). The verb pureda exists in Estonian but hammustada (derived from hammas 'tooth') is considered a more neutral word.
8. black [[phrase omitted]] Est. must Vot. muss[??] L-L. mustA Soi. musta Fin. musta
Proto-Finnic *musta 'black' has no plausible etymology (cf. SSA 2 : 183; EES 289; LAGLOS II 276).
9. blood [[phrase omitted]] Est. veri Vot. veri L-L. veri Soi. veri Fin. veri
Proto-Finnic *veri 'blood' goes back to Proto-Uralic *weri 'blood' (cf. SSA 3 : 427; EES 598-599; UEW 576).
10. bone [[phrase omitted]] Est. luu Vot. [??]uu L-L. luu Soi. luu Fin. luu
Proto-Finnic *luu 'bone' goes back to Proto-Uralic *liwi 'bone' (cf. SSA 2 : 114; EES 256; UEW 254-255). In Estonian, there is also a word kont (of Finnic origin, SSA 1 : 398; EES 175) that possibly broadened its meaning from 'shin' to 'bone'. This word was considered more colloquial and less neutral.
11. breast [[phrase omitted]] Est. rind Vot. rint[??] L-L. rintA Soi. rinda Fin. rinta
There are several hypotheses for the origin of Proto-Finnic *rinta 'breast' (cf. SSA 3 : 80; EES 429). It is improbable that it is a borrowing from Germanic (LAGLOS III 158-159). Koivulehto (2008 : 315-317) has suggested a Slavic origin. Proto-Saami *rente 'breast' is a Finnic loanword. In Finnish, there is also a word povi of Uralic origin (cf. SSA 2 : 408; UEW 395) that can be used at least for the second context (His breast (chest) was decorated with ornaments). However it is rarely used and should not be considered the main word. There is no special word for 'woman's breast' but there is a Finnic word for 'teat' that in some idioms has an extended meaning 'woman's breast': Est. nann, Vot. nann[??], L-L. nannA, Soi. nanna, Fin. nanni. This word came from child language but it is probably rather old (SSA 2 : 252).
12. to burn (trans.) Est. poletada [poletama] Vot. pe[??]etta L-L. poltta [[phrase omitted]] Soi. polttaa Fin. polttaa
Proto-Finnic *poltta- 'to burn (trans.)' is an irregular causative derivative from Proto-Finnic *pala- 'to burn (intrans.)' (Est. poleda, Vot. peless[??], L-L. palla, Soi. pallaa, Fin. palaa) (cf. SSA 2 : 392; EES 399). This pair of verbs goes back to Proto West Uralic *pala- 'to burn (intrans.)' ~ *poltta- 'to burn (trans.)' (cf. UEW 352). In Estonian and Votic the reflex of *poltta- was replaced by a more regular causative from the same root.
13. cloud [[phrase omitted]] Est. pilv Vot. pilvi L-L. pilvI Soi. pilvi Fin. pilvi
Proto-Finnic *pilvi 'cloud' goes back to Proto-Uralic *pilwi 'cloud' (cf. SSA 2 : 367; EES 370; UEW 381).
14. cold [[phrase omitted]] Est. kulm Vot. tsulm[??] L-L. kulmA Soi. kulma Fin. kylma
Proto-Finnic *kulma 'cold' goes back to Proto-Uralic *kulma 'cold', attested in Finnic, Saami, Mordvin, Mari and Permic (cf. UEW 663). The wide distribution of this word and completely regular sound correspondences make the hypothesis of its borrowing from Baltic (Koivulehto 1983; SSA 1 : 462; EES 213) quite improbable.
15. to come Est. tulla [tulema] Vot. tu[??] L-L. tullA [[phrase omitted]] Soi. tulla Fin. tulla
Proto-Finnic *tule- 'to come' goes back to Proto-Uralic *tuli- 'to come' (cf. SSA 3 : 324; EES 552-553; UEW 535).
16. to die Est. surra [surema] Vot. koo[??] L-L. kuollA [[phrase omitted]] Soi. koolla Fin. kuolla
Proto-Finnic *koole- 'to die' goes back to Proto-Uralic *kali- 'to die' (cf. SSA 1 : 440; UEW 173). In Estonian, it is observed only in dialects (EES 176). Estonian surra goes back to Proto-Finnic *sure- < Proto-Uralic *suri- 'to die' (cf. EES 489; UEW 489)--certainly not the main synonym for this meaning in Proto-Uralic.
17. dog Est. koer Vot. kojr[??] L-L. koirA [[phrase omitted]] Soi. koira Fin. koira
Proto-Finnic *koira 'dog' goes back to Proto-Uralic *kojra 'male' (cf. SSA 1 : 385; EES 168; UEW 168-169). The meaning 'male' is preserved in the Finnic derivative *koiras. The original Finnic word for 'dog' was rather Proto-Finnic *peni 'dog' (< Proto-Uralic *peni 'dog'), replaced as the main word for this meaning everywhere except Livonian and South Estonian (cf. SSA 2 : 335-336; EES 361; UEW 371).
18. to drink Est. juua [jooma] Vot. juuvv[??] L-L. juovvA [[phrase omitted]] Soi. joovva Fin. juoda
Proto-Finnic *joo- 'to drink' goes back to Proto-Uralic *jixi- 'to drink' (cf. SSA 1 : 249; EES 98; UEW 103).
19. dry [[phrase omitted]] Est. kuiv Vot. kujv[??] L-L. kuivA Soi. kuiva Fin. kuiva
Proto-Finnic *kuiva 'dry' lacks an acceptable etymology (cf. SSA 1 : 426; EES 187). The hypothesis of a Germanic origin is implausible (LAGLOS II 114), and the comparison with Proto-Khanty *kuj[??]m- 'to fall, sink (of water)' is dubious (cf. UEW 196-197).
20. ear [ухо] Est. korv Vot. k[??]rv[??] L-L. korvA Soi. korva Fin. korva
Proto-Finnic *korva 'ear' is cognate with Proto-Saami *koarve 'oarlock'. Further etymological connections of this word are unclear (cf. SSA 1 : 408; EES 202-203; UEW 187-188). It is a replacement of Proto-Uralic *pelja 'ear' (cf. UEW 370).
21. earth Est. muld Vot. maa L-L. maa [[phrase omitted]] Soi. maa Fin. maa
Proto-Finnic *maa 'earth' (cf. SSA 2 : 133; EES 268) goes back to Proto-Uralic *mixi 'earth'. In Estonian, earth as a physical substance (i.e. earth vs sand, handful of earth, etc. (see Kassian, Starostin, Dybo, Chernov 2010)) is expressed by the word muld, which is a Germanic borrowing (EES 286; LAGLOS II 270). In other idioms the word multa also exists but is more peripheral than in Estonian. However, there are deviations. For example, in Finnish, the test sentence "I don't know whether that site contains sand or earth" requires the word multa, since maa is a general term for both 'sand' and 'earth'.
22. to eat Est. suua [sooma] Vot. suuvv[??] L-L. suovvA [[phrase omitted]] Soi. soovva Fin. syoda
Proto-Finnic *soo- 'to eat' goes back to Proto-Uralic *sewi- 'to eat' (cf. SSA 3 : 235; EES 500-501; UEW 440).
23. egg Est. muna Vot. muna L-L. muna [[phrase omitted]] Soi. muna Fin. muna
Proto-Finnic *muna 'egg' goes back to Proto-Uralic *muna 'egg' (cf. SSA 2: 178; EES 287; UEW 285-286).
24. eye Est. silm Vot. silm[??] L-L. silmA [[phrase omitted]] Soi. silma Fin. silma
Proto-Finnic *silma 'eye' goes back to Proto-Uralic *silma 'eye' (cf. SSA 3 : 181; EES 472-473; UEW 479).
25. fat Est. rasv Vot. razv[??] L-L. razvA [[phrase omitted]] Soi. razva Fin. rasva
Proto-Finnic *rasva 'fat' is possibly an early Germanic borrowing (SSA 3: 53; EES 420; LAGLOS III 132). The word replaces Proto-Uralic *waji 'fat', whose Finnic reflex *voi means 'butter' (cf. UEW 578-579). Cf. also Proto-Uralic *koja 'fat, tallow', whose Finnic reflex *kuu 'tallow' is preserved only in Finnish and Karelian (cf. UEW 195-196).
26. feather Est. sulg Vot. su[??]k[??] L-L. sulkA [перо] Soi. sulga Fin. sulka
Proto-Finnic *sulka 'feather' is possibly an irregular reflex of Proto-Uralic *tulka 'feather' (cf. SSA 3 : 211; EES 487; UEW 535-536). Livonian turg[??]z 'feather' may be another irregular reflex of the same Uralic word. As suggested by Kirill Reshetnikov (p.c.), the Livonian word may instead be cognate with Finnish turkki 'fur, hair (of animals)', Ingrian turkki 'chicken feather'. However, we would expect Livonian -rk- as a reflex of Proto-Finnic *-rkk-, so the etymology of Livonian turg[??]z remains moot.
27. fire Est. tuli Vot. tuli L-L. tuli [[phrase omitted]] Soi. tuli Fin. tuli
Proto-Finnic *tuli 'fire' goes back to Proto-Uralic *tuli 'fire' (cf. SSA 3 : 211; EES 553; UEW 535).
28. fish Est. kala Vot. kala L-L. kala [[phrase omitted]] Soi. kala Fin. kala
Proto-Finnic *kala 'fish' goes back to Proto-Uralic *kala 'fish' (SSA 1 : 282; EES 120; UEW 119).
29. to fly Est. lennata [lendama] Vot. lenta L-L. lentta [[phrase omitted]] Soi. lenttaa Fin. lentaa
Proto-Finnic *lenta- 'to fly' has no plausible etymology (cf. SSA 2 : 64; EES 236).
30. foot Est. jalg Vot. ja[??]k[??] L-L. jalkA [нога] Soi. jalga Fin. jalka
Proto-Finnic *jalka 'foot' goes back to Proto-Uralic *jalka or *jilka 'foot' (cf. SSA 1: 234; EES 96-97; UEW 88-89).
31. full Est. tais Vot. taun[??] L-L. taun [[phrase omitted]] Soi. taun Fin. taysi/taynna
Proto-Finnic *tauci 'full' goes back to Proto-Uralic *taw[??]i 'full' (cf. SSA 3: 358; EES 566; UEW 518). Koivulehto's hypothesis of a Germanic borrowing does not withstand scrutiny (LAGLOS III 331-332; Aikio 2002 : 31-34). In some languages, the lexicalized essive form *taun-na is used in the test contexts.
32. to give Est. anda [andma] Vot. anta L-L. anta [[phrase omitted]] Soi. anttaa Fin. antaa
Proto-Finnic *anta- 'to give' goes back to Proto-Uralic *amta- or *imta- 'to give' (cf. SSA 1 : 77; EES 50; UEW 8)--one of two or three Proto-Uralic verbs of giving.
33. to go Est. minna [minema] Vot. menn[??] L-L. mannA [[phrase omitted]] Soi. manna Fin. menna
Proto-Finnic *mene- 'to go' goes back to Proto-Uralic *meni- 'to go' (cf. SSA 2 : 159; EES 282; UEW 272). In Estonian, this word is combined in one suppletive paradigm with the verb lahe- < Proto-Finnic *lakte- < Proto-Uralic *lakti- 'to go out, to go away' (cf. EES 262; UEW 239-240).
34. good Est. hea Vot. uva L-L. huva [[phrase omitted]] Soi. huva Fin. hyva
Proto-Finnic *huva 'good' has cognates in Saami and Mordvin languages (cf. SSA 1 : 201; EES 86; UEW 499). In Saami the word means 'to heal (of wound)', the Mordvin word means 'good', but it is not the main synonym for 'good' in Mordvin. Comparisons with words in other Uralic languages are hypothetical. The Estonian word is somewhat aberrant phonetically; still, it is cognate with words in other Finnic languages. The non-aberrant form huva exists in Estonian dialects and in colloquial speech.
35. green Est. roheline Vot. rohojn L-L. rohoin [[phrase omitted]] Soi. rohhoin Fin. vihrea
Finnish preserves the reflex of Proto-Finnic *vihera 'green'. Together with a morphological variant *vihanta, this word goes back to a late dialectal Uralic protoform *wisa 'green; poison', borrowed from Indo-Iranian (cf. SSA 3 : 438; UEW 823-824). In four idioms, the adjective 'green' is derived from Proto-Finnic *rooho 'grass', possibly of Germanic origin (EES 432-433; LAGLOS III 180).
36. hair Est. juus Vot. ivuz L-L. hiuz [[phrase omitted]] Soi. hiuz Fin. hius
Proto-Finnic *hi[beta]us 'hair' is a derivative based on a root borrowed from Germanic (cf. SSA 1 : 168; EES 102; LAGLOS I 107-108). This word is a replacement of Proto-Uralic *ipti 'hair' (cf. UEW 14-15).
37. hand [рука] Est. kasi Vot. tsasi L-L. kasi Soi. kazi Fin. kasi
Proto-Finnic *kaci 'hand' goes back to Proto-Uralic *kati 'hand' (cf. SSA 1 : 479; EES 209; UEW 140).
38. head Est. pea Vot. paa L-L. paa [[phrase omitted]] Soi. paa Fin. paa
Proto-Finnic *paa 'head' goes back to Proto-Uralic *pa[??]i 'head'--one of the two Uralic words for 'head' (cf. SSA 2 : 462; EES 357; UEW 365-366).
39. to hear Est. kuulda [kuulma] Vot. kuu[??] L-L. kuullA [[phrase omitted]] Soi. kuulla Fin. kuulla
Proto-Finnic *kuule- 'to hear' goes back to Proto-Uralic *kuwli- 'to hear' (cf. SSA 1 : 456; EES 197; UEW 197-198).
40. heart Est. suda Vot. sua L-L. suan [[phrase omitted]] Soi. suan Fin. sydan
Proto-Finnic *su[??]an 'heart' goes back to Proto-Uralic *sa[??]'am 'heart' (cf. SSA 3 : 228; EES 501; UEW 477).
41. horn [рог] Est. sarv Vot. sarvi L-L. sarvI Soi. sarvi Fin. sarvi
Proto-Finnic *sarvi 'horn' goes back to the dialectal Uralic protoform *sarwi 'horn', borrowed from Indo-Iranian (cf. SSA 3 : 159; EES 461-462; UEW 486-487).
42. I [я] Est. mina (~ ma) Vot. mia L-L. mia Soi. mia Fin. mina (~ ma)
Proto-Finnic *mina 'I' goes back to Proto-Uralic *min 'I' (cf. SSA 2 : 168; EES 281-282; UEW 294). In Estonian and Finnish, there is variation between a long and a short form.
43. to kill Est. tappa [tapma] Vot. tappa L-L. tappa [[phrase omitted]] Soi. tappaa Fin. tappaa
Proto-Finnic *tappa- 'to kill' goes back to Proto West Uralic *tappa-, whose reflex in Mordvin languages means 'to break' (cf. SSA 3 : 269-270; EES 514-515; UEW 509-510).
44. knee Est. polv Vot. pe[??]vi L-L. polvI [[phrase omitted]] Soi. polvi Fin. polvi
Proto-Finnic *polvi 'knee' goes back to the Proto-Uralic word for 'knee', whose exact reconstruction is doubtful. Apparently it was a compound of two roots: *puxi or *puwi 'knee' (> Proto-Samoyed *pu[??] 'knee') and *liwi 'bone' (cf. SSA 2 : 393; EES 400; UEW 393).
45. to know Est. teada [teadma] Vot. taata L-L. tiita [[phrase omitted]] Soi. tiittaa Fin. tietaa
Proto-Finnic *teeta- 'to know' is derived from *tee 'road, path' (SSA 3: 289). The hypothesis of a Germanic origin (EES 519) is unacceptable (LAGLOS III 292-293). This word replaces Proto-Uralic *tumti- 'to know', whose Finnic reflex *tunte- means rather 'to feel; to recognize' (cf. SSA 3 : 327; UEW 536-537).
46. leaf Est. leht Vot. lehto L-L. lehtI [[phrase omitted]] Soi. lehti Fin. lehti
Proto-Finnic *lehti 'leaf' goes back to late dialectal Uralic (West Uralic and Mari) *lesti 'leaf', apparently of Balto-Slavic origin (cf. SSA 2 : 58-59; EES 234; UEW 689).
47. to lie Est. lamada [lamama] Vot. lezia L-L. lezze [[phrase omitted]] Soi. lessia Fin. maata
The Finnic languages usually use the verb 'to be' to denote the position of an object and do not express the difference between 'to lie' and 'to stand'. Therefore, the Proto-Finnic word for 'to lie' cannot be convincingly reconstructed. The Estonian word is derived from the Proto-Finnic noun/adjective *lama 'lying', borrowed from Germanic (cf. SSA 2 : 42; EES 225-226; LAGLOS II 165); Votic and Ingrian borrowed this verb from Russian; and Finnish uses the reflex of Proto-Finnic *maka- 'to sleep' (q.v.). In Estonian, there is another word for 'to lie', lebada [lebama], that is less general than lamada [lamama]. It goes back to Proto-Finnic *lepa-, for which two mutually contradictory and phonetically problematic Germanic etymologies were proposed (cf. SSA 2 : 67-68; EES 232; LAGLOS II 198-199). The Proto-Uralic root *kuji- 'to lie' has no reflexes in West Uralic (cf. UEW 197). The Votic verbs lammoa and lamota 'lie about, to rest lying' are peripheral: they are very restricted dialectally (VKS 574) and are not known to contemporary speakers.
48. liver Est. maks Vot. mahs[??] L-L. maksA [[phrase omitted]] Soi. leiba-liha Fin. maksa ~ petsonka
Proto-Finnic *maksa 'liver' goes back to Proto-Uralic *miksa 'liver' (cf. SSA 2 : 142; EES 273; UEW 264). This is one of the most stable words in the Uralic basic lexicon. However, in Soikkola Ingrian the word maksa means 'fish liver' or (in plural) 'internal apparatus'. The meaning 'liver' is expressed either by a descriptive compound leiba-liha (literally: bread meat) or by a Russian borrowing.
49. long Est. pikk Vot. pitts[??] ~ pittsi L-L. pitkA [[phrase omitted]] Soi. pitka Fin. pitka
Proto-Finnic *pitka 'long' goes back to Proto-Uralic *pi[??]-ka 'long', from the root *pi[??]i (cf. SSA 2 : 377; EES 368; UEW 377-378).
50. louse Est. tai Vot. taj L-L. tai [[phrase omitted]] Soi. tai Fin. tai
Proto-Finnic *tai 'louse' goes back to Proto-Uralic *taji 'louse' (cf. SSA 3 : 353; EES 565; UEW 515).
51. man (male) Est. mees Vot. meez L-L. mies [[phrase omitted]] Soi. meez Fin. mies
Proto-Finnic *mees 'man' has no acceptable etymology (SSA 2 : 166). The hypothesis of a Germanic origin is not likely (EES 279; LAGLOS II 263). The shape CVVC is anomalous from the point of view of Finnic phonotactics.
52. man (person) Est. inimene Vot. inimin L-L. ihmin [[phrase omitted]] Soi. ihmiin ~ ilmihin ~ inemin Fin. ihminen
The phonetic reconstruction of Proto-Finnic *inehminen 'person' is tentative (cf. SSA 1 : 221; EES 92-93). Forms like Finnish ihminen are probably due to contamination with Proto-Finnic *imeh 'miracle'. The word has no acceptable etymology; attempts to derive it from various Indo-European sources are unconvincing. At the same time, comparison with the Mordvin word for 'guest' (UEW 627-628) faces multiple irregularities. In Soikkola Ingrian, there are variants of this word; the choice depends on the particular idiolect.
53. many, a lot of Est. palju Vot. pallo L-L. paljo [[phrase omitted]] Soi. paljo Fin. paljon ~ monta
The etymology of Proto-Finnic *paljo 'many' remains disputed (cf. SSA 2 : 301; EES 350). Potential Uralic comparisons are dubious (cf. UEW 350-351). The Germanic origin is not accepted in LAGLOS III 22. Saarikivi (2009 : 146-147) suggests a Slavic etymology. In Finnish, there is also the word monta (partitive of moni), which may be viewed as an archaism. It goes back to Proto-Finnic *moni 'many' < Proto-Uralic *moni, the reflex of which is also preserved in Permic. The Germanic origin of this word cannot be accepted (LAGLOS II 265-266). It is not clear which word is more general in Finnish (both words sound natural in the test sentences). In Estonian, Votic and Ingrian, the reflex of *moni either has a different meaning or is not the main word for 'many'.
54. meat Est. liha Vot. liha L-L. liha [[phrase omitted]] Soi. liha Fin. liha
Proto-Finnic *osa 'meat', cognate with Proto-Saami *oance 'meat', is preserved only in Livonian. In other languages, this word is replaced by Proto-Finnic *liha (cf. SSA 2 : 72; EES 238-239), whose Livonian reflex preserved the original meaning 'body; (human) flesh'.
55. moon Est. kuu Vot. kuu L-L. kuu [[phrase omitted]] Soi. kuu Fin. kuu
Proto-Finnic *kuu 'moon' goes back to Proto-Uralic *kiwi or *ki[??]i 'moon' (cf. SSA 1 : 455-456; EES 196-197; UEW 211-212).
56. mountain Est. magi Vot. matsi L-L. maki [[phrase omitted]] Soi. magi Fin. vuori
Proto-Finnic *voori 'mountain', going back to Proto-Uralic *wari 'hill, mountain' (cf. SSA 3 : 475; UEW 571), is preserved only in Finnish, where it is opposed to maki 'hill'. Other languages have lost the inherited word for 'mountain' and replaced it with the word for 'hill'. Proto-Finnic *maki 'hill' goes back to Proto-Uralic *maki, also preserved in Khanty, where its reflex means 'tussock' (cf. SSA 2 : 191; EES 294; UEW 266).
57. mouth Est. suu Vot. suu L-L. suu [[phrase omitted]] Soi. suu Fin. suu
Proto-Finnic *suu 'mouth' goes back to Proto-Uralic *suwi 'throat, mouth' (cf. SSA 3 : 223-224; EES 491; UEW 492-493).
58. nail Est. kuus Vot. tsunsi L-L. kunsI [[phrase omitted]] Soi. kunz Fin. kynsi
Proto-Finnic *kunci 'claw, nail' goes back to Proto-Uralic *kunci 'claw, nail' (cf. SSA 1 : 464; EES 216; UEW 157).
59. name Est. nimi Vot. nimi ~ imi L-L. nimi [[phrase omitted]] Soi. imi Fin. nimi
Proto-Finnic *nimi 'name' goes back to Proto-Uralic *nimi 'name' (cf. SSA 2 : 222; EES 313; UEW 305). Votic and Ingrian show variation between nimi and the variant imi, whose origin is not obvious (possibly it results from a contamination of nimi and Russian [phrase omitted] 'name'). In Soikkola Ingrian, the variant imi is the most prevalent form; in Luuditsa Votic both variants are used; for Lower Luga Ingrian the variant nimi looks more typical.
60. neck Est. kael Vot. kag[??] L-L. kaglA [[phrase omitted]] Soi. kagla Fin. kaula
Proto-Finnic *kakla 'neck' is borrowed from Baltic (SSA 1 : 331; EES 113). This word replaces Proto-Uralic *sepa 'neck, collar', preserved in Finnic with the meanings 'collar, front part of sledge, etc.' (cf. SSA 3 : 169-170; UEW 473-474).
61. new Est. uus Vot. uus(i) L-L. uusI [[phrase omitted]] Soi. uuz Fin. uusi
Proto-Finnic *uuci 'new' goes back to Proto-Uralic *wu[??]i 'new' (cf. SSA 3 : 381; EES 581; UEW 587).
62. night Est. oo Vot. uu L-L. uo [[phrase omitted]] Soi. oo Fin. yo
Proto-Finnic *oo 'night' goes back to Proto-Uralic *uji or *eji 'night' (cf. SSA 3 : 493; EES 633; UEW 72).
63. nose [HOC] Est. nina Vot. nena L-L. nena Soi. nena Fin. nena
Proto-Finnic *nena ~ *nena ~ *nana 'nose' is related to Proto-Saami *nuone 'nose' (cf. SSA 2 : 213; EES 313-314).
64. not [He] Est. ei Vot. eb L-L. ei Soi. ei Fin. ei
Proto-Finnic negative verb *e- goes back to the Proto-Uralic negative verb *e- (cf. SSA 1 : 99; EES 59; UEW 68-70).
65. one Est. uks Vot. uhs(i) L-L. uks [[phrase omitted]] Soi. uks Fin. yksi
Proto-Finnic *ukci 'one' goes back to the Proto-Uralic word for 'one', attested from Finnic to Mansi (cf. SSA 3 : 489; EES 635; UEW 81). However, the exact phonetic reconstruction of the Proto-Uralic form is difficult.
66. rain Est. vihm Vot. vihm[??] L-L. vihmA [[phrase omitted]] Soi. vihma Fin. sade
Proto-Finnic *vihma 'rain' is related to Proto-Saami *vesme 'light snow' (cf. SSA 3 : 438; EES 601). In Finnish, vihma means 'drizzle' and a derivative from Proto-Finnic *sata- 'to rain, to snow' (< Proto-Uralic *sa[??]a- 'to rain') is used as the main word for 'rain' instead (cf. SSA 3 : 141, 160, EES 455-456).
67. red Est. punane Vot. kauniz L-L. punnain [[phrase omitted]] Soi. punnain Fin. punainen
Proto-Finnic *punain??n 'red' is derived from Proto-Finnic *puna 'red colour', a reflex of Proto-Uralic *puna 'hair, fur' (cf. SSA 2 : 426-427; EES 137; UEW 402). The semantic development may look strange, but is actually understandable. The words for 'hair' in Eurasia frequently have an additional meaning 'colour'. An intermediate meaning 'hair colour (of animals)' is actually attested for reflexes of PU *puna in Hill Mari and South Khanty. The following path of semantic development can be supposed in this case: 'hair, fur' > '(hair) colour' > 'red colour'. In Votic, the main word for 'red' is kauniz, going back to Proto-Finnic *kaunis 'beautiful', borrowed from Germanic (LAGLOS II 62). The semantic shift 'beautiful' > 'red' occurred under the influence of Russian [phrase omitted] 'red/beautiful'. (8)
68. road Est. tee Vot. tee L-L. tie [[phrase omitted]] Soi. tee Fin. tie
Proto-Finnic *tee 'road' is apparently related to Komi tuj 'road', although the reconstruction of a common protoform is difficult (cf. SSA 3 : 288; EES 520; UEW 794).
69. root Est. juur Vot. juuri L-L. juurI [[phrase omitted]] Soi. juuri Fin. juuri
Proto-Finnic *juuri 'root' goes back to Proto West Uralic *juwri 'root', attested also in Mordvin (cf. SSA 1 : 253; EES 102; UEW 639). This word replaces Proto-Uralic *wanca 'root' (cf. UEW 548-549).
70. round Est. ummargune Vot. ummerkajn L-L. ummerkain [[phrase omitted]] Soi. umberlain Fin. pyorea
Proto-Finnic *poore[??]a 'round', reflected in Finnish, has cognates with the same meaning in Ob-Ugric languages and goes back to Proto-Uralic *pe[??]ira 'round' (cf. SSA 2 : 455; EES 406; UEW 372-373). In the other languages in our sample, the word for 'round' is derived from Proto-Finnic *umpara, a Germanic loanword with a Finnic suffix *-ra (cf. SSA 3 : 491; EES 636-637; LAGLOS III 426-427). There is no difference between '3D round' and '2D round'.
71. sand Est. liiv Vot. liiv[??] L-L. liivA [[phrase omitted]] Soi. liiva Fin. hiekka
Proto-Finnic *liiva 'sand' may be a Baltic or Germanic loan (SSA 2 : 205; EES 240; LAGLOS II 207). Although now Ingrian is the only North Finnic language that has this word for 'sand', the Proto-Finnic status of the word is confirmed by the fact that it was borrowed from a lost North Finnic idiom into Permic languages: Komi lia, Udmurt luo 'sand' (Saarikivi 2006 : 36). In Finnish, a specific word hiekka is used instead (SSA 1 : 160).
72. to say Est. utelda ~ oelda [utlema] Vot. jut[??] L-L. sanno [[phrase omitted]] Soi. sannoa Fin. sanoa
Proto-Finnic *sano- ~ *seno- 'to say' is derived from *sana ~ *sena 'word' (cf. SSA 3 : 155; EES 494). In Estonian and Votic, this word is replaced by the reflexes of Proto-Finnic *jutta- 'to talk; to tell, narrate', going back to Proto-Uralic *jupta- 'to tell, narrate' (cf. SSA 1 : 252; EES 102, 627; UEW 104; Aikio 2002 : 48). The Estonian verb demonstrates an irregular change *ju- > u-. The original anlaut is preserved in the Estonian noun jutt 'story; talk'. The Estonian reflex of *seno- has a clearly secondary meaning 'to scold'.
73. to see Est. naha [nagema] Vot. nahh[??] L-L. naha [[phrase omitted]] Soi. nah(h)a Fin. nahda
Proto-Finnic *nake- 'to see' goes back to Proto-Uralic *naki- 'to see' (cf. SSA 2 : 249; EES 326-327; UEW 302).
74. seed Est. seeme Vot. seemene L-L. siemen [[phrase omitted]] Soi. seemen Fin. siemen
Proto-Finnic *seemen 'seed' is a Baltic borrowing (SSA 3 : 173; EES 464).
75. to sit Est. istuda [istuma] Vot. issua L-L. isto [[phrase omitted]] Soi. istua Fin. istua
Proto-Finnic *istu- 'to sit' goes back to Proto West Uralic *isa- 'to sit', which may be an Indo-European borrowing (cf. SSA 1 : 229; EES 94; UEW 629).
76. skin Est. nahk Vot. nahk[??] L-L. nahkA [[phrase omitted]] Soi. nahka Fin. iho
Proto-Finnic *iho 'skin' goes back to Proto-Uralic *isa 'skin, surface' and is preserved in Finnish (cf. SSA 1 : 222; EES 89; UEW 636-637). Other idioms use Proto-Finnic *nahka 'skin, hide', borrowed from Germanic (SSA 2 : 202; EES 306; LAGLOS II 287-288). In Finnish, there is a word nahka but it has a more specific meaning (mainly it is 'a skin of an animal, fur' but in colloquial speech it can be easily used in the test contexts).
77. to sleep Est. magada [magama] Vot. magat[??] L-L. maatA [[phrase omitted]] Soi. maada Fin. nukkua
The Germanic etymology of Proto-Finnic *maka- 'to sleep', pace LAGLOS, does not seem convincing to us (SSA 2 : 136; EES 270; LAGLOS II 237-238). In Finnish, this word means 'to lie' (see above) and the meaning 'to sleep' is expressed by the reflex of Proto-Finnic *nukku- 'to doze, to drowse', cognate with Proto-Saami *nokku- 'to doze, to drowse' (SSA 2 : 237). The Proto-Uralic word for 'to sleep' was *a[??]i- (cf. UEW 334; Aikio 2015 : 51).
78. small, little Est. vaike Vot. peen(i) L-L. pienI [[phrase omitted]] Soi. pikkarain Fin. pieni (~ pikkarain)
Proto-Finnic *peeni 'small' is preserved in Votic and Finnish (cf. SSA 2 : 348; EES 358). The Germanic etymology of this word is not convincing (LAGLOS III 55). This word exists in Estonian but there it rather means 'thin, fine'. In Soikkola Ingrian, the word pikkarain (which also exists in Finnish) predominates, but in Lower Luga Ingrian it is not the most prevalent variant. In Votic, pikkerajn is less common than peen(i). According to SSA 2 : 361, this word is a hypocoristic byform of *peeni. In Estonian, the main word for 'small' is derived from Proto-Finnic *vaha 'small', which may go back to Proto West Uralic (cf. SSA 3 : 478; EES 618-619; UEW 818-819). The Germanic etymologies proposed for this word are dubious (LAGLOS III 420). The semantic difference between *peeni and *vaha on the Proto-Finnic level remains elusive.
79. smoke Est. suits Vot. savvu L-L. savvU [[phrase omitted]] Soi. savvu Fin. savu
Proto-Finnic *savu 'smoke' goes back to Proto West Uralic *siwi 'smoke' (cf. SSA 3 : 163; UEW 754). The Estonian word is a reflex of Proto-Finnic *suiccu 'smoke', with potential cognates in Saami meaning 'to rise' (cf. SSA 3 : 208; EES 486). This word is also attested in Finnish dialects. Since reflexes of *savu are the main words for 'smoke' in Livonian and South Estonian, there can be no doubt that the main Proto-Finnic word for 'smoke' was *savu.
80. to stand Est. seista [seisma] Vot. sejss[??] L-L. seissa [[phrase omitted]] Soi. seissa Fin. seisoa
Proto-Finnic *saisa- 'to stand' goes back to Proto-Uralic *sa[??]sa- 'to stand' (cf. SSA 3 : 164-165; EES 466; UEW 431-432).
81. star Est. taht Vot. tahti L-L. tahtI [[phrase omitted]] Soi. tahti Fin. tahti
Proto-Finnic *tahti 'star' is related to Saami and Mordvin words for 'star' and the Mari word for 'sign' (cf. SSA 3 : 353; EES 565; UEW 793-794). However, irregular sound correspondences between these forms suggest that the word was borrowed from an unknown substrate separately in already differentiated branches of West Uralic (Aikio 2015 : 43-47). This word replaced Proto-Uralic *kunsi 'star' (cf. UEW 210-211).
82. stone Est. kivi Vot. tsivi L-L. kivi [[phrase omitted]] Soi. kivi Fin. kivi
Proto-Finnic *kivi 'stone' goes back to Proto-Uralic *kiwi 'stone' (cf. SSA 1 : 378; EES 163-164; UEW 163-164).
83. sun Est. paike Vot. pajvud L-L. paivukkain [[phrase omitted]] Soi. paivud Fin. aurinko
Proto-Finnic *paiva 'sun, day' goes back to Proto-Uralic *pajwa, whose reflexes mean 'sun, day' in Saami and 'heat, warm' in Samoyed (cf. SSA 2 : 456; EES 403; UEW 360). Finnish paiva means 'day' only. In the meaning 'sun', the word is replaced by aurinko, which has no acceptable etymology (SSA 1 : 90).
84. to swim Est. ujuda [ujuma] Vot. ujjua L-L. ujjo [[phrase omitted], [phrase omitted]] Soi. ujjua Fin. uida
Proto-Finnic *ui- 'to swim' goes back to Proto-Uralic *uji- 'to swim' (cf. SSA 3 : 368; EES 576-577; UEW 542).
85. tail Est. saba Vot. ant[??] L-L. handA [[phrase omitted]] Soi. handa Fin. hanta
Proto-Finnic *hanta 'tail' has no acceptable etymology: supposed cognates in other branches of Uralic show irregular correspondences (cf. SSA 1 : 208; EES 85; UEW 56). In Estonian, it also exists but the main word for 'tail' was borrowed from the Baltic languages (EES 455). The Proto-Uralic word for 'tail' was *ponci.
86. that [TOT] Est. see Vot. see (9) L-L. see Soi. see Fin. tuo
Both Proto-Finnic *se 'that' (SSA 3 : 163; EES 463-464; UEW 33-34) and Proto-Finnic *too 'that' (cf. SSA 3 : 327-328; EES 538; UEW 526-528) have Uralic pedigree. However, it is difficult to reconstruct the Proto-Finnic demonstrative system. Finnic dialects have different systems: monopartite, bipartite or tripartite. Standard Estonian has a formally bipartite system see ~ too but it functions rather as a monopartite system where see means 'this/that' and in the contrastive contexts the word teine 'other' is usually used. Finnish has a tripartite system tama ~ tuo ~ se, and in the test contexts tuo is preferable. Votic and Ingrian have bipartite systems but see is often used in the contexts for 'this'.
87. this [[??]TOT] Est. see Vot. kase L-L. tama Soi. tama Fin. tama
According to Laanest (1982 : 196), Votic kase results from the merging of some interjection with se. Tama is a Uralic word (SSA 3 : 355; UEW 513-515). Estonian tema and Votic tama are 3Sg personal pronouns but not demonstrative pronouns. Since the typical path of diachronic development leads from demonstrative pronouns to personal pronouns, but not vice versa, we can suppose that Proto-Finnic *tama 'this' was a demonstrative (see comments on the previous word).
88. tongue Est. keel Vot. tseeli L-L. kielI [[phrase omitted]] Soi. keeli Fin. kieli
Proto-Finnic *keeli 'tongue' goes back to Proto-Uralic *kali 'tongue' (cf. SSA 1 : 353; EES 140; UEW 144-145).
89. tooth Est. hammas Vot. ammez L-L. hammaz [[phrase omitted]] Soi. hammaz Fin. hammas
Proto-Finnic *hambas 'tooth' is a Baltic loanword (SSA 1 : 136; EES 68-69). This word replaced Proto-Uralic *pi[??]i 'tooth', whose Finnic reflex *pii means 'tooth in a saw, rake etc.' (cf. SSA 2 : 352; UEW 382).
90. tree Est. puu Vot. puu L-L. puu [[phrase omitted]] Soi. puu Fin. puu
Proto-Finnic *puu 'tree' goes back to Proto-Uralic *pawi 'tree' (cf. SSA 2 : 443-444; EES 396-397; UEW 410-411).
91. two Est. kaks Vot. kahs(i) L-L. kaks [[phrase omitted]] Soi. kaks Fin. kaksi
Proto-Finnic *kakci 'two' goes back to the Proto-Uralic numeral 'two', whose exact phonetic shape is hard to reconstruct (cf. SSA 1 : 282; EES 120; UEW 118-119).
92. warm Est. soe Vot. sooj[??] L-L. soojA [[phrase omitted]] Soi. lammaa Fin. lammin
Proto-Finnic *lambin 'warm' goes back to Proto-Uralic *lampi 'warm' (cf. SSA 2 : 124; EES 263; UEW 685; Aikio 2002 : 13). The word lammi exists in Estonian dialects (EES 263), and the same root is known in Votic (mostly through the word lammitta(a) 'to stoke', VKS 657). Other idioms use the root *sooja 'shelter; warm', borrowed from an Iranian word for 'shade' (cf. SSA 3 : 214; EES 478; UEW 748-749). In Finnish, there is a word suoja but it is not the main word for 'warm' (it is used when speaking about above-zero weather). In Ingrian, the same root is observed only in the Lower Luga dialect (Nirvi 1971 : 542).
93. water [water] Est. vesi Vot. vesi L-L. vesi Soi. vezi Fin. vesi
Proto-Finnic *veci 'water' goes back to Proto-Uralic *weti 'water' (cf. SSA 3 : 429; EES 599; UEW 570).
94. we Est. meie ~ me Vot. muu L-L. muo [[phrase omitted]] Soi. moo Fin. me
Proto-Finnic *me(k) 'we' goes back to Proto-Uralic *me(-) 'we' (cf. SSA 2 : 156; EES 279; UEW 294-295). In Estonian, there is variation between a long and a short form.
95. what Est. mis Vot. mika L-L. mika [[phrase omitted]] Soi. miga Fin. mika
Proto-Finnic *mi(ka) 'what' goes back to Proto-Uralic *mi ~ *mi 'what' (cf. SSA 2 : 164; UEW 296). In Estonian, the formative -s originates from a demonstrative pronoun see (EES 282-283).
96. white Est. valge Vot. va[??]ka L-L. valke [[phrase omitted]] Soi. valkkia Fin. valkoinen
Proto-Finnic *valke[??]da 'white' goes back to Proto-Uralic *wilki 'light' (cf. SSA 3 : 399-400; EES 588; UEW 554-555; Aikio 2015 : 59). In Finnish, the derivate with the adjectival suffix valkoinen looks more natural in the test contexts than valkea 'white'.
97. who [KTO] Est. kes Vot. tsen L-L. ken Soi. ken Fin. kuka
Proto-Finnic *ken 'who' goes back to Proto-Uralic *ke(-) 'who' (cf. SSA 1 : 342-343; EES 145-146; UEW 140-141). The Estonian word has the formative -s, which is absent from the three minor idioms; however, the variant ken is observed in the Estonian dialects (EES 145-146). In Finnish, the main word for 'who' is kuka < Proto-Uralic interrogative stem *ku(-), used in words for 'where', 'which', etc. (SSA 1 : 423-424; UEW 191-192), but the word ken also exists as a poetic variant.
98. woman Est. naine Vot. najn L-L. nain [[phrase omitted]] Soi. nain Fin. nainen
Proto-Finnic *nainen 'woman' (cf. SSA 2 : 202; EES 306) is derived from a root *naa-, seen also in naaras 'female' (SSA 2 : 200-201). This root goes back to Proto-Uralic *naxi 'woman' (Janhunen 1981 : 245-246).
99. yellow Est. kollane Vot. ke[??]tejn L-L. keltain [[phrase omitted]] Soi. kelttain Fin. keltainen
Proto-Finnic *keltainen 'yellow' consists of the root borrowed from Baltic, and an adjectival suffix (SSA 1 : 342; EES 172-173).
100. you (thou) Est. sina (~ sa) Vot. sia L-L. sia [[phrase omitted]] Soi. sia Fin. sina (~ sa)
Proto-Finnic *cina 'thou' goes back to Proto-Uralic *tin 'thou' (cf. SSA 3 : 184; EES 473-474; UEW 539). In Estonian and Finnish, there is variation between a long and a short form.
101. far Est. kaugel Vot. kauka[??] L-L. kaukall [[phrase omitted]] Soi. ettaa/ettal Fin. kaukana
Proto-Finnic *kauka- 'far' is a Germanic loanword (Aikio 2000; EES 137). Supposed cognates in Mordvin and Khanty (SSA 1 : 330-331; UEW 132) are phonetically incompatible with the Finnic word. Soikkola Ingrian uses a reflex of Proto-Finnic *eta- 'far', going back to Proto West Uralic *eca- 'far' (cf. SSA 1 : 109-110; UEW 624). This word is the main word for 'far' also in Veps. It is hard to say which of these two words was the main Proto-Finnic word for 'far'.
102. heavy Est. raske Vot. rankk[??] L-L. rankkA [[phrase omitted]] Soi. raskaz Fin. raskas
There are two different words: Proto-Finnic *rankka 'heavy', apparently borrowed from Germanic (cf. SSA 3 : 47; EES 419, 445; LAGLOS III 124-125), and Proto-Finnic *raskas 'heavy' (cf. SSA 3 : 52; EES 419-420). The former became dominant in Votic and Lower Luga Ingrian, the latter in three other idioms. Estonian rank is more bookish than raske. It is difficult to tell which word was the main word for 'heavy' in Proto-Finnic.
103. near Est. lahedal Vot. litsi L-L. liki [[phrase omitted]] Soi. ligi Fin. lahella
Proto-Finnic *lahe- 'near', going back to Proto-Uralic *lasi 'near', is preserved in Estonian and Finnish (cf. SSA 2 : 122; EES 262; UEW 687; Aikio 2002 : 48). Votic and Ingrian use another root, Proto-Finnic *liki 'near', cognate with Proto-Saami *leke 'near' (cf. SSA 2 : 76; EES 238). Estonian ligidal and Finnish liki ~ likella are synonymic forms but are less general or neutral. The original semantic difference between *lahe- and *liki in Proto-Finnic is not clear.
104. salt Est. sool Vot. soo[??] L-L. suolA [[phrase omitted]] Soi. soola Fin. suola
Proto-Finnic *soola 'salt' is borrowed from an Indo-European language, most probably from Baltic (cf. SSA 3 : 214-215; EES 480). Similar loanwords exist in other Uralic languages (UEW 750-751), but the phonetic shape of the Finnic word (long vowel in an a-stem) shows that it was borrowed independently.
105. short Est. luhike Vot. luhud L-L. luhud [[phrase omitted]] Soi. luhud Fin. lyhyt
Proto-Finnic *luhut 'short' has no acceptable etymology (cf. SSA 2 : 117; EES 266).
106. snake Est. uss Vot. mato L-L. mato [[phrase omitted]] Soi. mado Fin. kaarme
Although Proto-Finnic *kuu 'viper, snake', going back to Proto-Uralic *kuji 'snake', retains the meaning 'snake' in Karelian, Veps, and Livonian dialects (cf. SSA 1 : 467; UEW 154-155), these are hardly the main words for 'snake' in the respective idioms. Proto-Finnic *mato 'snake, worm' was perhaps the main word for 'snake' already in the proto-language. It is possibly a Germanic borrowing. According to an alternative etymology, *mato is cognate with Proto-Saami *muoce 'moth' (cf. SSA 2 : 154; EES 270; LAGLOS II 255). In Finnish, this word means 'worm' (see below), and another word, ultimately borrowed from Baltic, is used for 'snake' (SSA 1 : 484). In Estonian, the word madu means 'snake', but a more neutral word is uss 'snake, worm'. The etymology of uss is not clear but it is possibly a Russian borrowing (EES 580).
107a. thin (2D) Est. ohuke Vot. hojkk[??] L-L. hoikkA [[phrase omitted]] Soi. hoikka ~ hoikkain Fin. ohut
Two words can be reconstructed: Proto-Finnic *ohut 'thin' (cf. SSA 2 : 260; EES 625) and Proto-Finnic *hoikka 'thin' (cf. SSA 1 : 169). The former goes back to Proto-Uralic *woksi 'thin' ([phrase omitted] 2011 : 110; Luobbal Sammol Sammol Ante (Aikio) 2014 : 10-11), the latter has no known etymology. It is hard to reconstruct the semantic difference between these words on the Proto-Finnic level. The word hoikka also exists in Finnish but is not predominant there, while the reflexes of *ohut are not predominant in Votic and Ingrian. In Soikkola Ingrian, there is a variant with an adjectival suffix.
107b. thin (1D) Est. peenike Vot. hojkk[??] L-L. hoikkA [[phrase omitted]] Soi. hoikka Fin. ohut
In Votic, Finnish and Lower Luga Ingrian there is no difference between '2D thin' and '1D thin'. In Soikkola Ingrian, the variant with the suffix is not typical in the test contexts. In Estonian, the derivate from peen 'small' (see above) is more typical in the test contexts (the form peen without a suffix is also possible in the test contexts but peenike looks more neutral).
108. wind Est. tuul Vot. tuuli L-L. tuulI [[phrase omitted]] Soi. tuuli Fin. tuuli
Proto-Finnic *tuuli 'wind' goes back to Proto-Uralic *tiwli 'wind' (cf. SSA 3 : 340; EES 558-559; UEW 800).
109. worm Est. uss Vot. matokkejn ~ mato L-L. matokkain ~ mato [[phrase omitted]] Soi. madokkain ~ mado Fin. mato
The distinction snake vs worm is not typical for Finnic languages. Among the five analysed idioms only Finnish distinguishes these two notions, while in the other languages this distinction is not relevant. Thus, we can reconstruct Proto-Finnic *mato 'snake, worm'. In Votic and Ingrian, a derivate with the diminutive suffix can be used to stress that it is a worm but not a (big) snake. (See comments to the word for 'snake', #106.)
110. year Est. aasta Vot. voosi L-L. vuosI ~ aastaikA [[phrase omitted]] Soi. vooz Fin. vuosi
Proto-Finnic *vooci 'year' goes back to Proto-Uralic *i[??]i 'year' (cf. SSA 3 : 476; EES 612-613; UEW 335-336). In Estonian, this word means 'harvest' and another word is used for 'year' (etymologically a compound *ai[gamma]astaaika built from the forms of *aika 'time', see EES 42). In Lower Luga Ingrian, both words are used; the choice depends on the particular idiolect.
In the current section we formulate some observations on the compiled wordlists. These are preliminary observations that do not purport to be a comprehensive analysis of the data.
3.1. The analysed set of five languages is rather homogeneous. Among 111 items, 77 (69%) have the same root in all five varieties. There are no items where all five idioms use different roots, nor are there items with four different roots. There are only three items in the list where three roots appear: #47 'to lie' Est. lamada vs Fin. maata vs Vot. lezia, L-L. lezze, Soi. lessia; #106 'snake' Est. uss vs Fin. kaarme vs Vot. mato, L-L. mato, Soi. mado; #107b 'thin' Est. peenike vs Fin. ohut vs Vot. hojkk[??], L-L. hoikkA, Soi. hoikka. In all three items the opposition is organized in the same way: Estonian and Finnish each have a root of their own, while the three minor varieties share a third root.
In all other cases, either one language has a root that is different from the other languages (24 items) or two languages differ from the other three (7 items).
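The classification used in 3.1 can be illustrated with a minimal sketch. It assumes that every form has already been assigned a root label by hand; the labels and the four sample items below are illustrative choices taken from Section 2, not the full dataset, and the helper name `pattern` is hypothetical.

```python
from collections import Counter

VARIETIES = ("Est", "Vot", "L-L", "Soi", "Fin")

# Hand-assigned root labels (an assumption of this sketch) for four items.
SAMPLE = {
    "#47 to lie":   {"Est": "lama", "Vot": "lezi", "L-L": "lezi", "Soi": "lezi", "Fin": "maka"},
    "#54 meat":     {"Est": "liha", "Vot": "liha", "L-L": "liha", "Soi": "liha", "Fin": "liha"},
    "#56 mountain": {"Est": "maki", "Vot": "maki", "L-L": "maki", "Soi": "maki", "Fin": "voori"},
    "#92 warm":     {"Est": "sooja", "Vot": "sooja", "L-L": "sooja", "Soi": "lambin", "Fin": "lambin"},
}


def pattern(roots):
    """Return the sorted sizes of the groups of varieties sharing a root.

    (5,) means one root everywhere, (1, 4) one variety against four,
    (2, 3) two against three, (1, 1, 3) three different roots.
    """
    return tuple(sorted(Counter(roots[v] for v in VARIETIES).values()))


for item, roots in SAMPLE.items():
    print(item, pattern(roots))

# Totals as reported in 3.1: uniform items, three-root items,
# one-against-four items, two-against-three items.
uniform, three_roots, one_vs_four, two_vs_three = 77, 3, 24, 7
total = uniform + three_roots + one_vs_four + two_vs_three
share_uniform = round(100 * uniform / total)
print(total, share_uniform)
```

The four reported counts add up to the 111 items of the list, and 77 of 111 rounds to the 69% given above.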
3.2. The three minor varieties are rather uniform; the two major languages are often different from the minor ones.
The three minor Finnic varieties do not demonstrate significant diversity. Only in 8 cases (i.e. 7%) were the roots not the same. Soikkola Ingrian is opposed to all other varieties in #48 'liver' and #101 'far'; Votic is different from all other varieties in #67 'red' (and this difference would not hold if we took other Votic varieties into account); in two cases Votic is uniform only with Estonian (#72 'say' and #87 'this'); in one case Votic and Lower Luga Ingrian are different from the other varieties including Soikkola Ingrian (#102 'heavy'), in another Soikkola Ingrian and Finnish differ from the other varieties (#92 'warm'), and there is also a specific Estonian word which exists as one of the two variants in Lower Luga Ingrian (#110 'year'). Summing up, the number of cases where a minor variety does not have the same root as the two other minor varieties is the following: Votic: 3 items, Soikkola Ingrian: 4 items, Lower Luga Ingrian: 1 item.
However, the situation with the major languages is quite different. There are 11 cases where Estonian has a root that differs from all other languages (#4 'belly', #7 'to bite', #16 'to die', #21 'earth', #47 'to lie', #78 'small', #79 'smoke', #85 'tail', #106 'snake', #107b 'thin (1D)' and #109 'worm') and 5 cases where the Estonian root is found in one other variety and these two are opposed to the other three varieties (#72 'to say', #87 'this', #103 'near', #107a 'thin (2D)', and #110 'year'). In Finnish, a root is opposed to all other varieties in 16 cases (#3 'bark', #5 'big, large', #35 'green', #47 'to lie', #53 'many, a lot of', #56 'mountain', #66 'rain', #70 'round (3D)', #71 'sand', #76 'skin', #77 'to sleep', #83 'sun', #86 'that', #97 'who', #106 'snake' and #107b 'thin (1D)'), and there are 3 cases where a Finnish root is the same as in one of the other varieties but different from all others (#92 'warm', #103 'near', #107a 'thin (2D)').
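The tally for the minor varieties can be reproduced with a short sketch. It covers only the eight divergent items named above; the root labels are hand-assigned for illustration (an assumption of this sketch), variants are stored as sets so that an item with two variants, such as #110 in Lower Luga Ingrian, counts as differing from a variety with only one of them, and `odd_one_out` is a hypothetical helper.

```python
from collections import Counter

MINOR = ("Vot", "L-L", "Soi")

# Root labels for the eight items where the minor varieties diverge
# (labels are illustrative, assigned by hand from Section 2).
ITEMS = {
    48:  {"Vot": {"maksa"},  "L-L": {"maksa"},          "Soi": {"leiba-liha", "petsonka"}},
    67:  {"Vot": {"kaunis"}, "L-L": {"puna"},           "Soi": {"puna"}},
    72:  {"Vot": {"jutta"},  "L-L": {"sano"},           "Soi": {"sano"}},
    87:  {"Vot": {"kase"},   "L-L": {"tama"},           "Soi": {"tama"}},
    92:  {"Vot": {"sooja"},  "L-L": {"sooja"},          "Soi": {"lambin"}},
    101: {"Vot": {"kauka"},  "L-L": {"kauka"},          "Soi": {"eta"}},
    102: {"Vot": {"rankka"}, "L-L": {"rankka"},         "Soi": {"raskas"}},
    110: {"Vot": {"vooci"},  "L-L": {"vooci", "aasta"}, "Soi": {"vooci"}},
}


def odd_one_out(roots):
    """Return the minor variety that differs from the two others, if any."""
    for variety in MINOR:
        others = [roots[o] for o in MINOR if o != variety]
        if others[0] == others[1] and roots[variety] != others[0]:
            return variety
    return None


tally = Counter(odd_one_out(roots) for roots in ITEMS.values())
share = round(100 * len(ITEMS) / 111)
print(tally, share)
```

Under these assumptions the sketch yields 3 items for Votic, 4 for Soikkola Ingrian and 1 for Lower Luga Ingrian, and 8 of 111 items rounds to 7%.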
3.3. Since none of the five varieties is isolated from all others, the compiled lists should be considered also from the point of view of language contact. The most typical directions of borrowing for these varieties are the following:
(a) Votic borrowed many words from Ingrian. Usually it is difficult to define whether it was a borrowing from Soikkola Ingrian adapted to Votic phonetics or a borrowing from Lower Luga Ingrian. Also there are two types of borrowings: regular borrowings (e.g. Vot. huu 'they', karkku 'cone') and recent "double-layer" borrowings where the Ingrian pronunciation of a word replaced the original Votic variant (e.g. auki 'pike', hiili 'coal', haapezikko 'aspen forest' cf. proper Votic autsi, iili, aapezikko).
In the compiled Swadesh lists, we did not notice obvious borrowings from Ingrian into Votic. (10) If a word which has specific phonetic differences between Votic and Ingrian is borrowed from Ingrian into Votic, it usually keeps the Ingrian phonetic shape (e.g. the initial [h], or [k] before a front vowel). However, for all such pairs which appear in our Swadesh lists, Votic has its original phonetic shape, so we cannot assume that these words were borrowed: cf. #14 'cold' Soi. kulma, Vot. tsulm[??], #34 'good' Soi. huva, Vot. uva, #36 'hair' Soi. hiuz, Vot. ivuz, #37 'hand' Soi. kazi, Vot. tsasi, #49 'long' Soi. pitka, Vot. pitts[??]~pittsi, #56 'mountain' Soi. magi, Vot. matsi, #58 'nail' Soi. kunz, Vot. tsunsi, #82 'stone' Soi. kivi, Vot. tsivi, #85 'tail' Soi. handa, Vot. ant[??], #88 'tongue' Soi. keeli, Vot. tseeli, #89 'tooth' Soi. hammaz, Vot. ammez, #97 'who' Soi. ken, Vot. tsen, #99 'yellow' Soi. kelttain, Vot. ke[??]tejn, #103 'near' Soi. ligi, Vot. litsi.
Based on this, we can state that the Swadesh list is stable from the point of view of new borrowings.
(b) As Lower Luga Ingrian is a convergent language on the basis of Votic and Ingrian, it could have taken many words from Votic. However, among 111 words of the core lexicon, there is only one possible candidate for such a borrowing: the word #102 rankkA 'heavy' (Vot. rankk[??]). In the three other varieties, another root is observed. We do not have solid evidence that this word came from Votic and was not some dialectal variant in Ingrian.
(c) One can also expect some borrowings from Finnish via the Ingrian Finnish dialect into Votic or into Lower Luga Ingrian. However, we did not notice such candidates in the compiled lists. The same concerns the borrowings from Estonian into Lower Luga Ingrian: usually, they are not from the core lexicon (e.g. kleit < Est. kleit 'dress').
3.4. Diversity in the core lexicon is explained by different reasons. Among the 34 items where the five varieties were not uniform, several groups of words are distinguished.
a. The biggest group appeared because of quasi-synonymic words that existed in Proto-Finnic. (11) It happened (usually without obvious reason) that one word became predominant in one language and its synonym became predominant in another language. This situation is observed with the following items. Estonian: #4 'belly' Est. koht vs Fin. vatsa (12), #78 'small, little' Est. vaike vs Fin. pieni; Estonian and Finnish: #103 'near' Est. lahedal, Fin. lahella vs Vot. litsi, #107a 'thin (2D)' Est. ohuke, Fin. ohut vs Vot. hojkk[??]; Finnish: #5 'big, large' Fin. iso vs Est. suur, #53 'many, a lot of' Fin. monta vs Est. palju (as well as the alternative Finnish variant paljon), #70 'round (3D)' Fin. pyorea vs Est. ummargune, #76 'skin [[phrase omitted]]' Fin. iho vs Est. nahk, #86 'that' Fin. tuo vs Est. see, #107b 'thin (1D)' Fin. ohut vs Soi. hoikka; Finnish and Soikkola Ingrian: #92 'warm' Fin. lammin, Soi. lammaa vs Est. soe; Soikkola Ingrian: #101 'far' Soi. ettaa/ettal vs Fin. kaukana; Votic and Lower Luga Ingrian: #102 'heavy' Vot. rankk[??], L-L. rankkA vs Fin. raskas.
b. Some words appeared in the list because of a semantic shift. (13) They already existed in Proto-Finnic but in some language(s) they changed their meaning and became predominant for the corresponding item in the Swadesh list. In some cases, the semantic shift happened in a majority of the varieties, so that only one language preserves the original Proto-Finnic root while the others use another root for the item in the list. This is the case, for example, with #56 'mountain' where only Finnish retains the original Finnic root for 'mountain'.
The words that have a different root due to a semantic shift specific to Estonian are #16 'to die' surra, #21 'earth' muld, #79 'smoke' suits, and #107b 'thin (1D)' peenike. Specific to Estonian and Votic is the word #72 'to say' Est. utelda/oelda, Vot. jute[??]. In Votic, the word #67 'red' kauniz shifted its meaning from 'beautiful' to 'red'. The aforementioned word #56 'mountain' underwent a semantic shift in all varieties except Finnish: Est. magi, Vot. matsi, L-L. maki, Soi. magi. Specific to Finnish are the words #3 'bark' kaarna, #47 'to lie' maata, #77 'to sleep [[phrase omitted]]' nukkua, #97 'who' kuka, and #106 'snake' kaarme. (14)
c. In rare cases a new derivative from the old root traced to Proto-Finnic or earlier becomes a predominant word in a language. In Estonian, such words are #47 'to lie' lamada and #110 'year' aasta. The latter word also appears in Lower Luga Ingrian: aastaikA is one of the variants for 'year' (see Section 2). In all varieties except Finnish, the word #35 'green' is an adjective derived from the noun with the meaning 'grass': Est. roheline, Vot. rohojn, L-L. rohoin, Soi. rohhoin. In Finnish, the noun #66 'rain' sade is derived from the original verb. Possibly, a Soikkola Ingrian compound #48 'liver' leiba-liha built from two Finnic roots should be placed in this group too.
d. In spite of the fact that the core lexicon is relatively stable, new (post-Proto-Finnic) loan words can replace the original words. In Estonian, the word saba (#85 'tail') was borrowed from the Baltic languages, and the word uss (both #106 'snake' and #109 'worm') was possibly borrowed from Russian. In all three minor varieties, the word for #47 'to lie' was borrowed from Russian: Vot. lezia, L-L. lezze, Soi. lessia. In Soikkola Ingrian, one of the variants for #48 'liver' is also a Russian loanword: petsonka.
e. In addition to the described groups, there are two Finnish words with unclear etymology: #71 'sand' hiekka and #83 'sun' aurinko. Also, the word imi (#59 'name'), which is predominant in Soikkola Ingrian and is present in Votic as one of two variants, does not belong unambiguously to one of the proposed groups: it could be either a borrowing from Russian or a contamination (see Section 2).
The distribution of the divergent part of the core lexicon among the discussed groups and varieties is summarized in Table 1. (15)
3.5. The distribution of words in the core lexicon does not correlate with borders between Finnic sub-groups.
One might expect that many of the analysed words would oppose southern Finnic languages (Estonian and Votic) and northern Finnic languages (Finnish and the two dialects of Ingrian). In fact, only two items demonstrate such an opposition: #72 'to say' and #87 'this' (the latter case is not pure since Votic uses a more complicated morphological form than Estonian: kase vs se). Even if we take into account the fact that Lower Luga Ingrian was heavily influenced by Votic and possibly should not be unambiguously considered a northern Finnic language, the situation would not change: only one word opposes Finnish and Soikkola Ingrian to the other varieties: #92 'warm'. This fact has two theoretically possible interpretations: (a) the difference between the two Finnic branches is not considerable enough to be reflected in the core lexicon represented in the Swadesh list; (b) in a contact zone between closely related languages, convergent processes can play a part (e.g. one of the existing basic words becomes predominant under the influence of the neighbouring idiom). Both interpretations can only be confirmed through a thorough analysis of individual words, and this task is beyond the scope of the current paper.
Table 2 presents pairwise comparisons of the Swadesh lists. In the upper-right part of the table, the percentage of the common roots is given. In the lower-left part of the table, the number of words that have different roots is indicated. Rare cases where a language has two roots for the same item (e.g. Finnish paljon ~ monta 'many, a lot of' or Lower Luga Ingrian voosI ~ aastaikA) but the second language in the pair only has one of these roots were counted as 0.5 instead of 1.
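The counting rule described above can be sketched in Python as follows. This is only a minimal illustration, not the procedure actually used for Table 2: the root labels (r1, r2, ...) and the three toy items are hypothetical, and treating any partial overlap of root sets as 0.5 is an assumption that generalises the single case the text describes (two roots in one variety, one of them in the other).

```python
def item_difference(roots_a, roots_b):
    """Difference score for one Swadesh item in a pair of varieties."""
    a, b = set(roots_a), set(roots_b)
    if a == b:
        return 0.0  # the same root(s) in both varieties
    if a & b:
        return 0.5  # partial overlap (assumed generalisation of the 0.5 rule)
    return 1.0      # no shared root


def count_differences(list_a, list_b):
    """Sum the per-item scores over the items present in both lists."""
    shared_items = list_a.keys() & list_b.keys()
    return sum(item_difference(list_a[i], list_b[i]) for i in shared_items)


# Hypothetical toy data: item -> set of root labels (not real etymologies).
variety_x = {"year": {"r1", "r2"}, "rain": {"r3"}, "tail": {"r4"}}
variety_y = {"year": {"r2"}, "rain": {"r3"}, "tail": {"r5"}}

print(count_differences(variety_x, variety_y))  # 0.5 + 0 + 1 = 1.5
```

Storing the roots of each item as a set keeps the comparison independent of the order in which variants are listed.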
The closest varieties are Votic and Lower Luga Ingrian, which formally belong to different Finnic branches. In general, the distances among the three minor varieties are small. The major languages demonstrate greater diversity, and the largest distance is between Estonian and Finnish. The distances between the analysed varieties thus do not clearly correlate with their genetic affiliation. We may conclude that a lexicostatistical analysis of minimal depth (i.e. made for closely related languages) should not be expected to show a linear correlation with genetic distance. Changes in the core lexicon happen for different reasons, including convergent processes that are not always transparent. Although the analysed Finnic varieties do not have obvious borrowings from each other, the three minor varieties located in a compact area in Western Ingria are evidently less diverse than the geographically peripheral major languages.
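As a rough arithmetic check, the percentages in Table 2 can be recovered from the counts of differing words, assuming that the denominator is the 111 items of the list and that the result is rounded to the nearest whole percent. The sketch below is an illustration under that assumption only; the names N_ITEMS, DIFFERING and shared_percent are introduced here and do not come from the article.

```python
N_ITEMS = 111  # assumed denominator: the 111-item list used in the article

# Lower-left part of Table 2: number of items with different roots.
DIFFERING = {
    ("Est.", "Vot."): 16, ("Est.", "L-L."): 16, ("Est.", "Soi."): 19,
    ("Est.", "Fin."): 27.5, ("Vot.", "L-L."): 3.5, ("Vot.", "Soi."): 7,
    ("Vot.", "Fin."): 22.5, ("L-L.", "Soi."): 4.5, ("L-L.", "Fin."): 20,
    ("Soi.", "Fin."): 19.5,
}


def shared_percent(differing, n_items=N_ITEMS):
    """Share of common roots as a whole percentage."""
    return round(100 * (n_items - differing) / n_items)


# List the pairs from the closest to the most distant.
for pair, differing in sorted(DIFFERING.items(), key=lambda kv: kv[1]):
    print(pair, differing, f"{shared_percent(differing)}%")
```

Under this assumption, 3.5 differing items for Votic and Lower Luga Ingrian give 97%, and 27.5 differing items for Estonian and Finnish give 75%, in line with the upper-right part of Table 2.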
The Swadesh lists for five Finnic varieties were compiled following an elaborated methodology that makes them transparent and discussable.
The difference between minor languages (Votic and two Ingrian dialects) is small: 94% or more of their core lexicon coincides. The major languages (Estonian and Finnish) demonstrate a greater difference both from minor languages (80-86%) and from each other (75%).
Lexical diversity between languages increases for various reasons, including semantic shifts, the existence of synonymic pairs in the proto-language, new borrowings, and new derivatives.
The lexicostatistic difference between closely related languages does not have a strong correlation with their genetic distance.
We are very grateful to our colleagues and to the native speakers of Finnic languages whom we consulted in the course of our work on the wordlists, in particular Alevtina Fedotova and Galina Samsonova on Soikkola Ingrian, Nikolai Poder on Lower Luga Ingrian, Zinaida Saveljeva on Votic, Terhi Honkola on Finnish, and Partel Lippus and Ellen Niit on Estonian.
We would like to thank the anonymous reviewer and Kirill Reshetnikov for the many valuable comments on the article.
The research of F. Rozhanskiy has been supported by the University of Tartu, grant PHVEE18904.
It is obvious that Swadesh lists compiled by different researchers on the basis of different methods cannot be identical. However, the degree of the diversity is not evident a priori. For this reason, we give a short comment on the differences between the Swadesh lists for Estonian and Finnish compiled in the current article and those presented in Tillinger (2014). Tillinger's lists were chosen because they do not give synonyms and return exactly one word for each item (unlike the lists in Hofirkova, Blazek 2012, and Syrjanen, Honkola, Korhonen, Lehtinen, Vesakoski, Wahlberg 2013).
For both Estonian and Finnish, we found four cases in which we propose a word different from Tillinger's (2014), see Table 3.
The reasons behind these differences are straightforward: either our variant corresponds better to the context ('bark' and 'earth'), or it was chosen as more general and/or more neutral by a consultant ('big', 'skin', 'bone' and 'snake'). In the case of 'many, a lot of' we were not able to choose a single variant (but monta and moni have the same root). Estonian too 'that' looks more formal and is peculiar to the written language, so see 'this, that' was chosen as a more neutral variant.
In two cases, Tillinger (2014) does not have an exact correspondence to the words from our list. These are the items #12 'burn' (we use a transitive verb, whereas Tillinger lists an intransitive verb) and #107 'thin', which is not mentioned by Tillinger.
Item #92 'warm' does not have an exact correspondence in Tillinger's Swadesh list but can be found in another wordlist (Tillinger 2014 : 183).
We conclude that in spite of the different methods of compiling the Swadesh lists, the differences between the versions do not look dramatic.
University of Tartu
Institute for Linguistic Studies of the Russian Academy of Sciences
Russian State University for the Humanities
National Research University Higher School of Economics
EVS--Eesti-vene sonaraamat I--V, Tallinn 1997-2009; GLD--http://starling.rinet.ru/new100/main.htm; LAGLOS I--A. D. Kylstra, S.-L. Hahmo, T. Hofstra, O. Nikkila, Lexikon der alteren germanischen Lehnworter in den ostseefinnischen Sprachen. Bd. I: A--J, Amsterdam--Atlanta 1991; LAGLOS II--A. D. Kylstra, S.-L. Hahmo, T. Hofstra, O. Nikkila, Lexikon der alteren germanischen Lehnworter in den ostseefinnischen Sprachen. Bd. II: K--O, Amsterdam--Atlanta 1996; LAGLOS III--Kylstra, A. D., Hahmo, S.-L., Hofstra, T., Nikkila, O., Lexikon der alteren germanischen Lehnworter in den ostseefinnischen Sprachen. Bd. III: P-A, Amsterdam--New York 2012; VKS--Vadja keele sonaraamat. Toimetanud S. Grunberg, Tallinn 2013; [phrase omitted] 2007; [phrase omitted] 1999.
Aikio, A. 2000, Suomen kauka.--Vir. 104, 612-614.
--2002, New and Old Samoyed Etymologies.--FUF 57, 9-57.
--2015, The Finnic 'secondary e-stems' and Proto-Uralic Vocalism.--JSFOu 95, 25-66.
Chang, W., Cathcart, C., Hall, D., Garrett, A. 2015, Ancestry-Constrained Phylogenetic Analysis Supports the Indo-European Steppe Hypothesis.--Language 91 (1), 194-244.
Ernits, E. 2005, Vadja keele varasemast murdeliigendusest ja hilisemast haabumisest. --Piirikultuuriq ja -keeleq. Konvorents Kurgjarvel, 21.-23. rehekuu 2004, Voro, 76-90.
Heinsoo, H. 2015, VadIdIa sonakopittoja, Tartu--Helsinki.
--2018, Suuri paive, Tartu.
Hofirkova, L., Blazek, V. 2012, Ke klasifikaci ugrofinskych jazyku. --Linguistica Brunensia 60 (1/2), 87-126.
Holman, E. W., Wichmann, S., Brown, C. H., Velupillai, V., Muller, A., Bakker, D. 2008, Explorations in Automated Language Classification.--Folia Linguistica 42, 331-354.
Honkola, T. 2016, Macro- and Microevolution of Languages: Exploring Linguistic Divergence with Approaches from Evolutionary Biology, Turku (Turun Yliopiston Julkaisuja--Annales Universitatis Turkuensis. Ser. A II osa--tom. 311. Biologica--Geographica--Geologica).
Janhunen, J. 1981, Uralilaisen kantakielen sanastosta.--JSFOu 77, 219-274.
Kassian, A., Starostin, G., Dybo, A., Chernov, V. 2010, The Swadesh Wordlist. An Attempt at Semantic Specification.--Journal of Language Relationship 4, 46-89.
Koivulehto, J. 1983, Suomalaisten maahanmuutto indoeurooppalaisten lainasanojen valossa.--JSFOu 78, 107-132.
--2008, Fruhe slawisch-finnische Kontakte.--Evidence and Counter-Evidence. Essays in Honour of Frederik Kortlandt. Vol. 1. Balto-Slavic and Indo-European Linguistics, Amsterdam--New York, 309-321.
Laanest, A. 1982, Einfuhrung in die ostseefinnischen Sprachen, Hamburg.
Luobbal Sammol Sammol Ante (Aikio, A.) 2014, Studies in Uralic Etymology II: Finnic Etymologies.--LU L, 1-19.
Markus, E., Rozhanskiy, F. 2012, Votic or Ingrian. New Evidence on the Kukkuzi Variety.--Finnisch-Ugrische Mitteilungen 35, 77-95.
Nirvi, R. E. 1971, Inkeroismurteiden sanakirja, Helsinki (LSFU XVIII).
Rozhanskiy, F., Markus, E. 2014, Lower Luga Ingrian as a Convergent Language.--On the Border of Language and Dialect. FINKA Symposium. University of Eastern Finland, Joensuu, 4-6 June, 2014, Joensuu, 36-37.
--2015, Dialectal Variation in Votic: Jogopera vs. Luuditsa.--ESUKA 6 (1), 23-39.
Saarikivi, J. 2006, Substrata Uralica. Studies on Finno-Ugrian Substrate in Northern Russian Dialects, Tartu.
--2009, Itamerensuomalais-slaavilaisten kontaktien tutkimuksen nykytilasta. --The Quasquicentennial of the Finno-Ugrian Society, Helsinki (MSFOu 258), 109-160.
Suhonen, S. 1985, Wotisch oder Ingrisch?--Dialectologia Uralica. Materialien des ersten Internationalen Symposions zur Dialektologie der uralischen Sprachen 4-7. September 1984 in Hamburg, Wiesbaden, 139-148.
Swadesh, M. 1952, Lexicostatistic Dating of Prehistoric Ethnic Contacts.--Proceedings of the American Philosophical Society 96, 452-463.
--1971, The Origin and Diversification of Language, Chicago.
Syrjanen, K., Honkola, T., Korhonen, K., Lehtinen, J., Vesakoski, O., Wahlberg, N. 2013, Shedding More Light on Language Classification Using Basic Vocabularies and Phylogenetic Methods. --Diachronica 30 (3), 323-352.
Taagepera, R. 1994, The Linguistic Distances between Uralic Languages.--LU XXX, 161-167.
Tadmor, U. 2009, Loanwords in the World's Languages: Findings and Results. --Loanwords in the World's Languages. A Comparative Handbook, Berlin, 55-75.
Tillinger, G. 2014, Samiska ord for ord. Att mata lexikalt avstand mellan sprak, Uppsala (Studia Uralica Upsaliensia 39).
Tsvetkov, D. 1995, Vatjan kielen Joenperan murteen sanasto, Helsinki (LSFU XXV).
[phrase omitted] 2015, [phrase omitted] --Journal of Language Relationship 13 (3), 205-255.
[phrase omitted] 1966, [phrase omitted].
--1993, [phrase omitted], 55-63.
[phrase omitted] 2017, [phrase omitted].
[phrase omitted] 2003, Vadda kaazgad. [phrase omitted].
[phrase omitted] 2011, [phrase omitted] 2 (5), 109-112.
[phrase omitted] 2013, [phrase omitted]--Acta Linguistica Petropolitana. Transactions of the Institute for Linguistic Studies IX (3), St. Petersburg, 261-298.
FEDOR ROZHANSKIY (Tartu--St. Petersburg), MIKHAIL ZHIVLOV (Moscow)
(1) When our article was already submitted to the journal it became known that the dataset used by Syrjanen, Honkola, Korhonen, Lehtinen, Vesakoski, Wahlberg (2013) is now open for online access, see https://www.bedlan.net/data. The lexical lists from this dataset did not try to solve the synonymy problem: several words are often given for the same definition.
(2) For example, muno 'egg' instead of muna, polottaa 'burn (tr)' instead of polottaa, anna 'give' instead of antaa (anna is the 2Sg imperative form; but the infinitive is given for other verbs in the list), polvi 'knee' instead of polvi, tund[??] 'know' instead of tunta, ju[??]a 'say' instead of jute[??]a, seemee 'seed' instead of seemene, kelten 'yellow' instead of keltein. NB! Here we give Votic forms in the spelling for Kattila and the neighbouring varieties of Votic. Most publications on Votic including Hofirkova, Blazek 2012 follow this system of spelling.
(3) For example, 'green' is rather rohoin than viher as viher means 'unripe'; 'to lie' is lezia but not magata as magata means 'to sleep'; sato is a rare and dialectally restricted word for 'precipitation' and a common word for 'rain' is vihma; the main word for 'this' is kase and se means 'this' or 'that' depending on a context (see comments to this item in the wordlist below); 'to hear' is kuulla but kuullua is 'to be heard; to listen to smb'.
(4) See, for example, [phrase omitted] 1966 : 146, 161: "As we can see the largest number of differences is between the Lower Luga dialect and three other dialects", "At present, the problem of origin of the Lower Luga dialect cannot be finally solved".
(5) The following dictionaries were used: Tsvetkov 1995 and VKS 2013 for Votic, Nirvi 1971 for Ingrian, EVS for Estonian, [phrase omitted] and [phrase omitted] for Finnish.
(6) Our corpora were collected during fieldtrips organized by Fedor Rozhanskiy and Elena Markus in 2003-2018. The Soikkola Ingrian corpus contains about 650 hours of recordings; the Votic and Lower Luga Ingrian corpora contain about 250 hours of recordings each.
(7) We mean that Votic has never been taught in school or had a written standard that was regularly used by native speakers to communicate and to read printed materials. However, besides various texts transcribed by linguists as speech samples there were a number of texts in Votic published for native speakers or other people studying Votic (e.g. [phrase omitted] 2003; Heinsoo 2015; 2018).
(8) Kauniz did not preserve the original meaning 'beautiful' in Luuditsa Votic, but this meaning was observed in some Central Votic varieties (VKS 408).
(9) The spelling of this word in Votic and Ingrian is approximate as there is significant variation in the length of this vowel. We spell it with long ee.
(10) The examples given in the previous paragraph are not from our Swadesh lists.
(11) Of course, in such cases one of the quasi-synonyms must have been "basic" in Proto-Finnic. Additional research is needed to determine the precise semantic difference between such quasi-synonyms at the Proto-Finnic level.
(12) In cases where several languages have the same root, we give examples only from one of these languages. In Section 2 one can find words with this root in other varieties.
(13) By "semantic shift" we mean not only a proper change of meaning but also finer modifications, e.g. stylistic changes.
(14) It is unlikely that kaarme is a new borrowing, because the Northern Finnic languages have not had contact with the Baltic languages since the Proto-Finnic period.
(15) Note that a word of Finnic origin that did not change its meaning and was not a derivative was counted only in group "a" and only in cases where this word was not predominant for most of the varieties under discussion. In general, this table analyses only the words where these varieties demonstrate diversity, while changes (e.g. semantic shifts) that happened in all five varieties are not studied here.
Table 1
Causes of lexical innovations

                     Est.   Vot.      L-L.      Soi.      Fin.       Overall
a. Synonyms          4      1         1         2         8 + (1)    16 + (1)
b. Semantic shift    6      3         1         1         5          16
c. New derivatives   3      1         1 + (1)   1 + (1)   1          7 + (2)
d. New borrowings    3      1         1         1 + (1)   -          6 + (1)
e. Other             -      (1)       -         1         2          3 + (1)
Overall              16     6 + (1)   4 + (1)   6 + (2)   16 + (1)   -

Table 2
Lexicostatistical distances

        Est.    Vot.    L-L.    Soi.    Fin.
Est.    -       86%     86%     83%     75%
Vot.    16      -       97%     94%     80%
L-L.    16      3.5     -       96%     82%
Soi.    19      7       4.5     -       82%
Fin.    27.5    22.5    20      19.5    -

Table 3
Differences in versions of Swadesh lists for Estonian and Finnish

N      Meaning           Tillinger   Rozhanskiy, Zhivlov
Finnish
3      bark              kuori       kaarna
5      big, large        suuri       iso
53     many, a lot of    moni        paljon ~ monta
76     skin              nahka       iho
Estonian
10     bone              kont        luu
21     earth             maa         muld
86     that              too         see
106    snake             madu        uss
[Please note: Some non-Latin characters were omitted from this article].