1. Introduction

The core idea of lexicostatistics is the analysis and comparison of wordlists compiled from the most stable part of the lexicon. Compiling such lists is not a trivial task that can be solved by simply looking up translation equivalents in a dictionary. Synonymy, dialectal variation, and other factors significantly influence the composition of the list and correspondingly the final result of the research.

In this article, we present the 111-word Swadesh lists for five Finnic idioms. The core of the research is formed by the wordlists for three minor varieties: a dialect of the Votic language and two dialects of the Ingrian language. These are analysed and compared with the wordlists for two major Finnic languages: standard Estonian and standard Finnish.

The research has the following goals:

(1) to compile wordlists of five Finnic varieties applying the same methodology;

(2) to analyse and compare the materials from the minor varieties with the data from the major languages;

(3) to provide comments on particular words in order to make the content of the lists and the differences between the analysed varieties more transparent;

(4) to draw a lexicostatistic picture of the minor varieties in the context of the major Finnic languages;

(5) to make some other preliminary observations based on the compiled wordlists.

The existing lexicostatistical research on the Uralic languages rarely uses explicit Swadesh lists. In most cases the compiled list is not accessible to the reader (see, for example, Taagepera 1994; Syrjanen, Honkola, Korhonen, Lehtinen, Vesakoski, Wahlberg 2013 (1)). The paper Hofirkova, Blazek 2012 is an exception as it gives the wordlists for many languages including Finnish, Estonian and Votic. However, the method of compilation of a wordlist as well as sources of the data are not always transparent there. For example, the list of sources does not contain any dictionary of the Votic language, which leads us to conclude that secondary sources (such as etymological dictionaries) were used to obtain data. Many flaws both in transcription (2) and the choice of words (3) reinforce this impression. Another piece of recent research that uses Swadesh lists is Tillinger 2014. Tillinger analyses Saami languages and, among other things, gives the Swadesh lists of several European languages including Finnish and Estonian. In the Appendix, we comment on the differences between Tillinger's and our Swadesh lists.

In the current article we present both the explicit wordlists and the transparent methodology of their compilation.

The article consists of four sections. Section 1 provides the basic information: (a) the main facts about the Votic and the Ingrian languages, (b) a description of data and methods of the research, (c) transcription conventions. Section 2 presents the annotated wordlists. In Section 3 (Discussion), we formulate our preliminary observations of the wordlists. Section 4 (Conclusions) contains a short summary of the results.

1.1. Languages

Votic and Ingrian are minor Finnic languages on the verge of extinction. Votic belongs to the southern branch of the Finnic languages and is the closest relative of Estonian; Ingrian belongs to the northern branch and is the closest relative of Finnish and Karelian.

The last generation of Votic and Ingrian fluent speakers was born in the early 1930s. Their deportation to Finland during the Second World War, a ban on living in their native settlements after the war and the negative attitude of Russian people towards speakers of minority languages led to the rapid extinction of both Votic and Ingrian (see more details in [phrase omitted] 2013). At the moment, the most optimistic calculations give no more than five Votic and twenty Ingrian speakers (representing two dialects).

Most Votic dialects are extinct. Krevin--the language of the Votic population relocated to Latvia in the 15th century--died out by the middle of the 19th century. The last speaker of Eastern Votic died in 1976 (Ernits 2005 : 87), and the last recordings of Central Votic were made in the 1970s. Most probably there are no fluent speakers of the mixed Votic-Ingrian Kukkuzi variety (Suhonen 1985; Markus, Rozhanskiy 2012), though there were a few in the mid-2000s. The last speakers of Votic represent the Western dialect (Vaipoli Votic), which shows some contact-induced Ingrian influence (Rozhanskiy, Markus 2015).

The last speakers of Ingrian represent the Soikkola and Lower Luga dialects. Two other traditionally distinguished Ingrian dialects are already extinct. Oredezi Ingrian died out in the second half of the 20th century (Laanest ([phrase omitted] 1993 : 62) considered it already moribund in the 1960s); Hevaha Ingrian became extinct around the turn of the millennium.

In the current research, we use Votic data from the Western dialect and Ingrian data from both the Soikkola and Lower Luga dialects. The analysis of two Ingrian dialects is not redundant but is one of the key goals of the research. According to the hypothesis formulated in Rozhanskiy, Markus 2014, the Lower Luga dialect (traditionally described as the most specific Ingrian dialect (4)) is in fact a very specific convergent variety based on Ingrian and Votic but also influenced by Ingrian Finnish and Estonian. Many Votes shifted completely to this variety and changed their identity.

The contact situation for the analysed minor languages can be briefly described as follows. Votic has had intensive contact with Russian during the last millennium. The Western Votic variety discussed in this article was influenced by Ingrian (in fact, all Votic villages in the Lower Luga area had a mixed Votic-Ingrian population in the 20th century, and there were plenty of mixed Votic-Ingrian families). Central Votic was in close contact with Ingrian Finnish; however, due to the difference in religion, mixed marriages were not typical.

Ingrians also had contact with the Russian population but it seems that Soikkola Ingrian was less influenced by Russian than Votic. Also, there are no evident traces of Votic or some other Finnic influence on Soikkola Ingrian.

By contrast, Lower Luga Ingrian had intensive contact not only with Russian but also with Ingrian Finnish, Votic, and (in the southern part of the area) with Estonian.

As most of the Finnic population from Western Ingria was deported to Finland during the Second World War, most of the speakers had some experience in the Finnish language.

1.2. Data and methods

The last decades have witnessed a renewal of interest in lexicostatistics and glottochronology. Different scholars use different mathematical algorithms: some work with the classical Swadesh method or its modified versions (Hofirkova, Blazek 2012), some use methods borrowed from evolutionary biology, such as maximum parsimony or Bayesian phylogenetic inference (Chang, Cathcart, Hall, Garrett 2015; Honkola 2016). All these studies have one thing in common: they use lists of basic lexemes with fixed meanings. Different authors use different wordlists, for example, the original Swadesh 200-word list (Swadesh 1952), the Swadesh 100-word list (Swadesh 1971) and various modifications thereof (Kassian, Starostin, Dybo, Chernov 2010), the ASJP 40-word list (Holman, Wichmann, Brown, Velupillai, Muller, Bakker 2008), the Leipzig-Jakarta list (Tadmor 2009), and others. A useful catalogue of such lists can be found on the Concepticon site.

We are convinced that the key problem with lexicostatistics lies not so much in the mathematics as in the lexicography. Whatever algorithm we choose to apply, if our initial data are not sufficiently accurate, the well-known maxim "garbage in, garbage out" will aptly describe the result. The core problem in compiling the wordlists is synonymy. For every meaning on the list in every language included in the comparison we must use the most neutral basic word representing this meaning. However, the standard meanings in the various lists of basic vocabulary are usually represented by English words (or words of some other natural language). It is clear that English words need not have one-to-one semantic equivalents in other human languages. For example, English 'hand' may be translated into Russian as '[phrase omitted]' or '[phrase omitted]', depending on the context. So, the compilation of a reliable basic vocabulary list requires a semantic specification of items on this list. Such a specification can be done in several ways:

(1) using more than one language for the list of basic meanings counting on more specific meanings in at least one of the languages;

(2) giving additional comments and explanations to narrow the meaning of the basic word;

(3) choosing a specific context to narrow down the meaning of an item on the list.

We chose the Swadesh wordlist and its particular modification because it is one of the few basic lexicon lists, if not the only one, for which a detailed semantic specification is available (Kassian, Starostin, Dybo, Chernov 2010). This standard has already been used in hundreds of wordlists compiled for the Global Lexicostatistical Database project (GLD 2018), as well as in publications not affiliated with this project (Gruntov, Mazo 2015).

The selected standard allows us to use all the mentioned methods of resolving the choice between synonymic variants. First, the list of basic words is given in both English and Russian. Second, the comments specifying the meaning of the basic words are given. Third, every basic word has several contexts that narrow down the meaning. Thus, it becomes possible with minimal exceptions to choose the most neutral word that is not too general or too specific, is not stylistically marked, and is not too bookish or too colloquial.

In this article, we present the 111-word modified Swadesh lists for five Finnic idioms, compiled on the basis of the following methodology. The compilation of lists had two stages. During the first (preliminary) stage, the lists were compiled with the help of dictionaries (5) and/or the authors' competence. During this stage, some items had several variants in case there were no evident reasons to select the most suitable word. During the second stage, the lists were checked with the help of native speakers (see Acknowledgements section) and (for the minor languages) corpora of elicitations and narratives were also used. (6) The native speakers annotated the meaning and usage of words and translated the sentences with the contexts. The final decision of which word should be added to the list was made exclusively by F. Rozhanskiy.
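The two-stage procedure can be pictured with a small data structure. The following Python sketch is purely illustrative and is not the authors' working format: the class name Entry, its fields, and the function unresolved are hypothetical, and the stage-1 candidate sets are assumptions read off the comments in Section 2 (Finnish kuori vs kaarna, Finnish suuri vs iso, Estonian pureda vs hammustada, Finnish paljon vs monta).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    """One meaning in one variety (hypothetical structure for illustration)."""
    meaning: str
    variety: str
    # Stage 1: variants collected from dictionaries and/or the authors' competence.
    candidates: List[str]
    # Stage 2: variants still standing after the check with native speakers.
    selected: List[str] = field(default_factory=list)

    def is_resolved(self) -> bool:
        # An item is resolved when exactly one variant survives stage 2.
        return len(self.selected) == 1


# Candidate sets are assumptions based on the comments in Section 2.
entries = [
    Entry("bark", "Fin.", ["kuori", "kaarna"], ["kaarna"]),
    Entry("big", "Fin.", ["suuri", "iso"], ["iso"]),
    Entry("to bite", "Est.", ["pureda", "hammustada"], ["hammustada"]),
    Entry("many", "Fin.", ["paljon", "monta"], ["paljon", "monta"]),
]


def unresolved(items: List[Entry]) -> List[tuple]:
    """Return (variety, meaning) pairs where more than one variant remains."""
    return [(e.variety, e.meaning) for e in items if not e.is_resolved()]


print(unresolved(entries))
```

In this toy sample only Finnish 'many' stays unresolved, which mirrors the comment on item 53 that both words sound good in the test sentences.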

Etymological comments are based on standard etymological dictionaries (SSA; EES; UEW; LAGLOS) and other sources on Finnic and Uralic etymology. The final decisions on etymology were made exclusively by M. Zhivlov.
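Once every item has an etymological decision, the classical lexicostatistical measure is the share of meanings for which two varieties use cognate words. The Python sketch below shows this calculation on five items only (3, 4, 5, 7 and 9 of the wordlists); the cognate classes follow the comments in Section 2, while the names VARIETIES, row, COGNATES and shared_percentage are hypothetical. It is a minimal illustration of the measure, not a reproduction of the calculations behind the article, and percentages from five items say nothing about the real distances between the varieties.

```python
from itertools import combinations

VARIETIES = ["Est.", "Vot.", "L-L.", "Soi.", "Fin."]


def row(default, exceptions=None):
    """Give every variety the default cognate class, then apply exceptions."""
    labels = {v: default for v in VARIETIES}
    labels.update(exceptions or {})
    return labels


# Cognate classes for five sample meanings, read off the Section 2 comments.
COGNATES = {
    "bark": row("*koori", {"Fin.": "kaarna"}),
    "belly": row("*vacca", {"Est.": "*koktu"}),
    "big": row("*suuri", {"Fin.": "*iso"}),
    "to bite": row("*pure-", {"Est.": "hammustada"}),
    "blood": row("*veri"),
}


def shared_percentage(a, b, data=COGNATES):
    """Percentage of meanings for which varieties a and b share a cognate class."""
    shared = sum(1 for labels in data.values() if labels[a] == labels[b])
    return 100.0 * shared / len(data)


for a, b in combinations(VARIETIES, 2):
    print(f"{a:5} {b:5} {shared_percentage(a, b):5.1f}%")
```

The sketch treats each meaning as having exactly one word per variety; items with two accepted variants (such as Finnish paljon ~ monta) would need an explicit convention that is not assumed here.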

1.3. Transcription conventions

Two of the analysed languages, Estonian and Finnish, have a literary tradition; Ingrian had a literary tradition only for a short period in the 1930s; and Votic has always been an unwritten language. (7) In this paper, we use the following transcription conventions.

For Estonian and Finnish, the standard orthography is used.

Our Votic transcription is similar to the one used by Tsvetkov (1995) but has some minor differences. First, we use j instead of the traditional Finnic i as the second part of diphthongs, e.g. kejg 'all' not keig (see the discussion in [phrase omitted], [phrase omitted] 2017 : 351-352). Second, the final reduced vowels are spelled as [??] (back vowel) and [??] (front vowel) instead of [??] and E respectively, e.g. rint[??] 'breast', tsulm[??] 'cold'. The long vowels are transcribed with double letters for comparability with other languages.

The Soikkola Ingrian transcription is close to Nirvi 1971 but the short geminates are transcribed with double letters and a breve in all phonetic contexts, e.g. valkkia 'white', not valkia. The long mid-high vowels of the first syllable (that in some idiolects merged with the long high vowels uu, uu, ii) are marked with a circumflex accent below: koori 'bark', oo 'night', seemen 'seed'. The sibilant fricatives are s and z (instead of s and z), e.g. suur 'big', meez 'man'.

There are no authoritative sources for the transcription of Lower Luga Ingrian, which also exhibits very significant phonetic variation between different varieties. We represent long mid vowels in the first syllable as diphthongs, e.g. kuorI 'bark', uo 'night', siemen 'seed', although their diphthongization is usually much weaker than in Finnish. The final reduced vowels that can be realized as short, voiceless, or dropped are marked with small capital letters: savvU 'smoke' [savvu ~ savvu ~ savv], hantA 'tail' [hanta ~ hanta ~ hant].
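The convention for final reduced vowels can be spelled out mechanically. The Python sketch below expands a Lower Luga form into the three realizations named above; the function name realizations is hypothetical, an ordinary capital letter stands in for the small capital, and the realizations are labelled in words because no particular diacritic for the voiceless vowel is assumed.

```python
def realizations(form):
    """Expand a form whose final capital letter marks a reduced vowel.

    Returns (string, label) pairs. A plain capital stands in for the
    small capital of the transcription (an assumption of this sketch).
    """
    if not form or not form[-1].isupper():
        # No final reduced vowel: the form has a single realization.
        return [(form, "full")]
    stem, vowel = form[:-1], form[-1].lower()
    return [
        (stem + vowel, "short"),      # vowel pronounced as a short vowel
        (stem + vowel, "voiceless"),  # same segment, devoiced
        (stem, "dropped"),            # vowel not pronounced at all
    ]


for word in ["savvU", "hantA", "maa"]:
    print(word, realizations(word))
```

For savvU this yields the segmental strings savvu, savvu and savv, matching the bracketed variants given in the text.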

Proto-Finnic and Proto-Uralic reconstructions are written in a system based on the UPA. In this system affricates are written as single symbols.

2. The wordlists
1. all [BCe]  Est. koik  Vot. kejg    L-L. kai
              Soi. kaig  Fin. kaikki

This word exists in most Finnic languages. It goes back to Proto-Finnic *kaikki 'all', possibly of Baltic origin (cf. SSA 1 : 275; EES 199).
2. ashes [[phrase omitted]]  Est. tuhk   Vot. tuhk[??]  L-L. tuhkA
                             Soi. tuhka  Fin. tuhka

This word exists in most Finnic languages. It goes back to Proto-Finnic *tuhka 'ashes', borrowed from Germanic (cf. SSA 3 : 319; LAGLOS III : 307).
3. bark [[phrase omitted]]  Est. koor   Vot. koori   L-L. kuorI
                            Soi. koori  Fin. kaarna

Proto-Finnic *koori 'bark' goes back to Proto-Uralic *kari 'surface, crust, skin, bark' (Aikio 2015 : 52), which is certainly not the main word for 'bark' in Proto-Uralic (the meaning 'bark' is represented only in Finnic). Fin. kaarna exists in some Finnic languages and possibly has a Baltic origin (cf. SKES 1987 : 135; SSA 1 : 265-266). The word kuori also exists in Finnish but has a more general meaning, and the word kaarna looks more natural in the test contexts. The word kaarna is known in Ingrian with the meaning 'cork; fir or pine bark' (Nirvi 1971 : 148).
4. belly [[phrase omitted]]  Est. koht   Vot. vatts[??]  L-L. vatsA
                             Soi. vatsa  Fin. vatsa

Votic, Ingrian and Finnish preserve Proto-Finnic *vacca 'belly'. Pace Redei (UEW 547), this word has no acceptable etymology: the proposed Mansi cognate has irregular vocalism and is restricted to North Mansi. Estonian koht goes back to Proto-Finnic *koktu 'belly' (cf. EES 199); vats exists as a dialectal variant. In colloquial Finnish, the word maha looks more natural in the test sentences. The difference between *vacca and *koktu in Proto-Finnic may have been that of '(external) belly' vs '(internal) belly/stomach'.
5. big, large [[phrase omitted]]  Est. suur  Vot. suur(i)  L-L. suur
                                  Soi. suur  Fin. iso

This word exists in most Finnic languages and goes back to Proto-Finnic *suuri 'big', borrowed from Germanic (cf. SSA 3 : 224-225; EES 491; LAGLOS III 253-254). In Finnish, the semantically close word suuri also exists. However, our Finnish consultant considered iso to be the main word. The Finnish word may be an archaism, replaced in other Finnic languages by a Germanic loanword. Proto-Finnic *iso 'big', derived from *isa 'father', has a striking parallel in Moksha ocu 'big', derived from oca 'paternal uncle' (cf. SSA 1 : 228; UEW 78), cf. also Finnish eno '(maternal) uncle' and enemman 'more'.
6. bird [[phrase omitted]]  Est. lind   Vot. lintu  L-L. lintU
                            Soi. lindu  Fin. lintu

Proto-Finnic *lintu 'bird' is either an isolated word or an irregular reflex of Proto-Uralic *lunta 'bird, goose' (cf. SSA 2 : 80; EES 242; UEW 254). Livonian and dialectal Estonian data show that Proto-Finnic *lintu was originally polysemous 'bird, flying insect, wild animal'. The polysemous word 'bird / wild animal' is found also in Samoyed and Ob-Ugric, although Finnic, Ob-Ugric, and Samoyed words with these meanings are not related.
7. to bite          Est. hammustada [hammustama]  Vot. purr[??]  L-L. purrA
[[phrase omitted]]  Soi. purra                    Fin. purra

Proto-Finnic *pure- 'to bite' goes back to Proto-Uralic *puri- 'to gnaw, bite' (cf. SSA 2 : 438; EES 393; UEW 405-406). The verb pureda exists in Estonian but hammustada (derived from hammas 'tooth') is considered a more neutral word.
8. black [[phrase omitted]]  Est. must   Vot. muss[??]  L-L. mustA
                             Soi. musta  Fin. musta

Proto-Finnic *musta 'black' has no plausible etymology (cf. SSA 2 : 183; EES 289; LAGLOS II 276).
9. blood [[phrase omitted]]  Est. veri  Vot. veri  L-L. veri
                             Soi. veri  Fin. veri

Proto-Finnic *veri 'blood' goes back to Proto-Uralic *weri 'blood' (cf. SSA 3 : 427; EES 598-599; UEW 576).
10. bone [[phrase omitted]]  Est. luu  Vot. [??]uu  L-L. luu
                             Soi. luu  Fin. luu

Proto-Finnic *luu 'bone' goes back to Proto-Uralic *liwi 'bone' (cf. SSA 2 : 114; EES 256; UEW 254-255). In Estonian, there is also a word kont (of Finnic origin, SSA 1 : 398; EES 175) that possibly broadened its meaning from 'shin' to 'bone'. This word was considered more colloquial and less neutral.
11. breast [[phrase omitted]]  Est. rind   Vot. rint[??]  L-L. rintA
                               Soi. rinda  Fin. rinta

There are several hypotheses for the origin of Proto-Finnic *rinta 'breast' (cf. SSA 3 : 80; EES 429). It is improbable that it is a borrowing from Germanic (LAGLOS III 158-159). Koivulehto (2008 : 315-317) has suggested a Slavic origin. Proto-Saami *rente 'breast' is a Finnic loanword. In Finnish, there is also a word povi of Uralic origin (cf. SSA 2 : 408; UEW 395) that can be used at least for the second context (His breast (chest) was decorated with ornaments). However, it is rarely used and should not be considered the main word. There is no special word for 'woman's breast' but there is a Finnic word for 'teat' that in some idioms has an extended meaning 'woman's breast': Est. nann, Vot. nann[??], L-L. nannA, Soi. nanna, Fin. nanni. This word came from child language but it is probably rather old (SSA 2 : 252).
12. to burn (trans.)  Est. poletada [poletama]  Vot. pe[??]etta  L-L. poltta
[[phrase omitted]]    Soi. polttaa              Fin. polttaa

Proto-Finnic *poltta- 'to burn (trans.)' is an irregular causative derivative from Proto-Finnic *pala- 'to burn (intrans.)' (Est. poleda, Vot. peless[??], L-L. palla, Soi. pallaa, Fin. palaa) (cf. SSA 2 : 392; EES 399). This pair of verbs goes back to Proto West Uralic *pala- 'to burn (intrans.)' ~ *poltta- 'to burn (trans.)' (cf. UEW 352). In Estonian and Votic the reflex of *poltta- was replaced by a more regular causative from the same root.
13. cloud [[phrase omitted]]  Est. pilv   Vot. pilvi  L-L. pilvI
                              Soi. pilvi  Fin. pilvi

Proto-Finnic *pilvi 'cloud' goes back to Proto-Uralic *pilwi 'cloud' (cf. SSA 2 : 367; EES 370; UEW 381).
14. cold [[phrase omitted]]  Est. kulm   Vot. tsulm[??]  L-L. kulmA
                             Soi. kulma  Fin. kylma

Proto-Finnic *kulma 'cold' goes back to Proto-Uralic *kulma 'cold', attested in Finnic, Saami, Mordvin, Mari and Permic (cf. UEW 663). The wide distribution of this word and completely regular sound correspondences make the hypothesis of its borrowing from Baltic (Koivulehto 1983; SSA 1 : 462; EES 213) quite improbable.
15. to come         Est. tulla [tulema]  Vot. tu[??]  L-L. tullA
[[phrase omitted]]  Soi. tulla           Fin. tulla

Proto-Finnic *tule- 'to come' goes back to Proto-Uralic *tuli- 'to come' (cf. SSA 3 : 324; EES 552-553; UEW 535).
16. to die          Est. surra [surema]  Vot. koo[??]  L-L. kuollA
[[phrase omitted]]  Soi. koolla          Fin. kuolla

Proto-Finnic *koole- 'to die' goes back to Proto-Uralic *kali- 'to die' (cf. SSA 1 : 440; UEW 173). In Estonian, it is observed only in dialects (EES 176). Estonian surra goes back to Proto-Finnic *sure- < Proto-Uralic *suri- 'to die' (cf. EES 489; UEW 489)--certainly not the main synonym for this meaning in Proto-Uralic.
17. dog             Est. koer   Vot. kojr[??]  L-L. koirA
[[phrase omitted]]  Soi. koira  Fin. koira

Proto-Finnic *koira 'dog' goes back to Proto-Uralic *kojra 'male' (cf. SSA 1 : 385; EES 168; UEW 168-169). The meaning 'male' is preserved in the Finnic derivative *koiras. The original Finnic word for 'dog' was rather Proto-Finnic *peni 'dog' (< Proto-Uralic *peni 'dog'), replaced as the main word for this meaning everywhere except Livonian and South Estonian (cf. SSA 2 : 335-336; EES 361; UEW 371).
18. to drink        Est. juua [jooma]  Vot. juuvv[??]  L-L. juovvA
[[phrase omitted]]  Soi. joovva        Fin. juoda

Proto-Finnic *joo- 'to drink' goes back to Proto-Uralic *jixi- 'to drink' (cf. SSA 1 : 249; EES 98; UEW 103).
19. dry [[phrase omitted]]  Est. kuiv   Vot. kujv[??]  L-L. kuivA
                            Soi. kuiva  Fin. kuiva

Proto-Finnic *kuiva 'dry' lacks an acceptable etymology (cf. SSA 1 : 426; EES 187). The hypothesis of a Germanic origin is implausible (LAGLOS II 114), and the comparison with Proto-Khanty *kuj[??]m- 'to fall, sink (of water)' is dubious (cf. UEW 196-197).
20. ear [yxo]  Est. korv   Vot. k[??]rv[??]  L-L. korvA
               Soi. korva  Fin. korva

Proto-Finnic *korva 'ear' is cognate with Proto-Saami *koarve 'oarlock'. Further etymological connections of this word are unclear (cf. SSA 1 : 408; EES 202-203; UEW 187-188). It is a replacement of Proto-Uralic *pelja 'ear' (cf. UEW 370).
21. earth           Est. muld  Vot. maa  L-L. maa
[[phrase omitted]]  Soi. maa   Fin. maa

Proto-Finnic *maa 'earth' (cf. SSA 2 : 133; EES 268) goes back to Proto-Uralic *mixi 'earth'. In Estonian, earth as a physical substance (i.e. earth vs sand, handful of earth, etc. (see Kassian, Starostin, Dybo, Chernov 2010)) is expressed by the word muld, which is a Germanic borrowing (EES 286; LAGLOS II 270). In other idioms the word multa also exists but is more peripheral than in Estonian. However, there are deviations. For example, in Finnish, the test sentence "I don't know whether that site contains sand or earth" requires the word multa, since maa is a general term for both 'sand' and 'earth'.
22. to eat          Est. suua [sooma]  Vot. suuvv[??]  L-L. suovvA
[[phrase omitted]]  Soi. soovva        Fin. syoda

Proto-Finnic *soo- 'to eat' goes back to Proto-Uralic *sewi- 'to eat' (cf. SSA 3 : 235; EES 500-501; UEW 440).
23. egg             Est. muna  Vot. muna  L-L. muna
[[phrase omitted]]  Soi. muna  Fin. muna

Proto-Finnic *muna 'egg' goes back to Proto-Uralic *muna 'egg' (cf. SSA 2: 178; EES 287; UEW 285-286).
24. eye             Est. silm   Vot. silm[??]  L-L. silmA
[[phrase omitted]]  Soi. silma  Fin. silma

Proto-Finnic *silma 'eye' goes back to Proto-Uralic *silma 'eye' (cf. SSA 3 : 181; EES 472-473; UEW 479).
25. fat             Est. rasv   Vot. razv[??]  L-L. razvA
[[phrase omitted]]  Soi. razva  Fin. rasva

Proto-Finnic *rasva 'fat' is possibly an early Germanic borrowing (SSA 3: 53; EES 420; LAGLOS III 132). The word replaces Proto-Uralic *waji 'fat', whose Finnic reflex *voi means 'butter' (cf. UEW 578-579). Cf. also Proto-Uralic *koja 'fat, tallow', whose Finnic reflex *kuu 'tallow' is preserved only in Finnish and Karelian (cf. UEW 195-196).
26. feather  Est. sulg   Vot. su[??]k[??]  L-L. sulkA
[[??]epo]    Soi. sulga  Fin. sulka

Proto-Finnic *sulka 'feather' is possibly an irregular reflex of Proto-Uralic *tulka 'feather' (cf. SSA 3 : 211; EES 487; UEW 535-536). Livonian turg[??]z 'feather' may be another irregular reflex of the same Uralic word. As suggested by Kirill Reshetnikov (p.c.), the Livonian word may instead be cognate with Finnish turkki 'fur, hair (of animals)', Ingrian turkki 'chicken feather'. However, we would expect Livonian -rk- as a reflex of Proto-Finnic *-rkk-, so the etymology of Livonian turg[??]z remains moot.
27. fire            Est. tuli  Vot. tuli  L-L. tuli
[[phrase omitted]]  Soi. tuli  Fin. tuli

Proto-Finnic *tuli 'fire' goes back to Proto-Uralic *tuli 'fire' (cf. SSA 3 : 211; EES 553; UEW 535).
28. fish            Est. kala  Vot. kala  L-L. kala
[[phrase omitted]]  Soi. kala  Fin. kala

Proto-Finnic *kala 'fish' goes back to Proto-Uralic *kala 'fish' (SSA 1 : 282; EES 120; UEW 119).
29. to fly          Est. lennata [lendama]  Vot. lenta   L-L. lentta
[[phrase omitted]]  Soi. lenttaa            Fin. lentaa

Proto-Finnic *lenta- 'to fly' has no plausible etymology (cf. SSA 2 : 64; EES 236).
30. foot   Est. jalg   Vot. ja[??]k[??]  L-L. jalkA
[HO[??]a]  Soi. jalga  Fin. jalka

Proto-Finnic *jalka 'foot' goes back to Proto-Uralic *jalka or *jilka 'foot' (cf. SSA 1: 234; EES 96-97; UEW 88-89).
31. full            Est. tais  Vot. taun[??]      L-L. taun
[[phrase omitted]]  Soi. taun  Fin. taysi/taynna

Proto-Finnic *tauci 'full' goes back to Proto-Uralic *taw[??]i 'full' (cf. SSA 3: 358; EES 566; UEW 518). Koivulehto's hypothesis of a Germanic borrowing does not withstand scrutiny (LAGLOS III 331-332; Aikio 2002 : 31-34). In some languages, the lexicalized essive form *taun-na is used in the test contexts.
32. to give         Est. anda [andma]  Vot. anta   L-L. anta
[[phrase omitted]]  Soi. anttaa        Fin. antaa

Proto-Finnic *anta- 'to give' goes back to Proto-Uralic *amta- or *imta- 'to give' (cf. SSA 1 : 77; EES 50; UEW 8)--one of two or three Proto-Uralic verbs of giving.
33. to go           Est. minna [minema]  Vot. menn[??]  L-L. mannA
[[phrase omitted]]  Soi. manna           Fin. menna

Proto-Finnic *mene- 'to go' goes back to Proto-Uralic *meni- 'to go' (cf. SSA 2 : 159; EES 282; UEW 272). In Estonian, this word is combined in one suppletive paradigm with the verb lahe- < Proto-Finnic *lakte- < Proto-Uralic *lakti- 'to go out, to go away' (cf. EES 262; UEW 239-240).
34. good            Est. hea   Vot. uva   L-L. huva
[[phrase omitted]]  Soi. huva  Fin. hyva

Proto-Finnic *huva 'good' has cognates in Saami and Mordvin languages (cf. SSA 1 : 201; EES 86; UEW 499). In Saami the word means 'to heal (of wound)', the Mordvin word means 'good', but it is not the main synonym for 'good' in Mordvin. Comparisons with words in other Uralic languages are hypothetical. The Estonian word is somewhat aberrant phonetically; still, it is cognate with words in other Finnic languages. The non-aberrant form huva exists in Estonian dialects and in colloquial speech.
35. green           Est. roheline  Vot. rohojn  L-L. rohoin
[[phrase omitted]]  Soi. rohhoin   Fin. vihrea

Finnish preserves the reflex of Proto-Finnic *vihera 'green'. Together with a morphological variant *vihanta, this word goes back to a late dialectal Uralic protoform *wisa 'green; poison', borrowed from Indo-Iranian (cf. SSA 3 : 438; UEW 823-824). In four idioms, the adjective 'green' is derived from Proto-Finnic *rooho 'grass', possibly of Germanic origin (EES 432-433; LAGLOS III 180).
36. hair            Est. juus  Vot. ivuz  L-L. hiuz
[[phrase omitted]]  Soi. hiuz  Fin. hius

Proto-Finnic *hi[beta]us 'hair' is a derivative based on a root borrowed from Germanic (cf. SSA 1 : 168; EES 102; LAGLOS I 107-108). This word is a replacement of Proto-Uralic *ipti 'hair' (cf. UEW 14-15).
37. hand [pyKa]  Est. kasi  Vot. tsasi  L-L. kasi
                 Soi. kazi  Fin. kasi

Proto-Finnic *kaci 'hand' goes back to Proto-Uralic *kati 'hand' (cf. SSA 1 : 479; EES 209; UEW 140).
38. head            Est. pea  Vot. paa  L-L. paa
[[phrase omitted]]  Soi. paa  Fin. paa

Proto-Finnic *paa 'head' goes back to Proto-Uralic *pa[??]i 'head'--one of the two Uralic words for 'head' (cf. SSA 2 : 462; EES 357; UEW 365-366).
39. to hear         Est. kuulda [kuulma]  Vot. kuu[??]  L-L. kuullA
[[phrase omitted]]  Soi. kuulla           Fin. kuulla

Proto-Finnic *kuule- 'to hear' goes back to Proto-Uralic *kuwli- 'to hear' (cf. SSA 1 : 456; EES 197; UEW 197-198).
40. heart           Est. suda  Vot. sua    L-L. suan
[[phrase omitted]]  Soi. suan  Fin. sydan

Proto-Finnic *su[??]an 'heart' goes back to Proto-Uralic *sa[??]'am 'heart' (cf. SSA 3 : 228; EES 501; UEW 477).
41. horn [po[??]]  Est. sarv   Vot. sarvi  L-L. sarvI
                   Soi. sarvi  Fin. sarvi

Proto-Finnic *sarvi 'horn' goes back to the dialectal Uralic protoform *sarwi 'horn', borrowed from Indo-Iranian (cf. SSA 3 : 159; EES 461-462; UEW 486-487).
42. I [[??]]  Est. mina (~ ma)  Vot. mia          L-L. mia
              Soi. mia          Fin. mina (~ ma)

Proto-Finnic *mina 'I' goes back to Proto-Uralic *min 'I' (cf. SSA 2 : 168; EES 281-282; UEW 294). In Estonian and Finnish, there is variation between a long and a short form.
43. to kill         Est. tappa [tapma]  Vot. tappa   L-L. tappa
[[phrase omitted]]  Soi. tappaa         Fin. tappaa

Proto-Finnic *tappa- 'to kill' goes back to Proto West Uralic *tappa-, whose reflex in Mordvin languages means 'to break' (cf. SSA 3 : 269-270; EES 514-515; UEW 509-510).
44. knee            Est. polv   Vot. pe[??]vi  L-L. polvI
[[phrase omitted]]  Soi. polvi  Fin. polvi

Proto-Finnic *polvi 'knee' goes back to the Proto-Uralic word for 'knee', whose exact reconstruction is doubtful. Apparently it was a compound of two roots: *puxi or *puwi 'knee' (> Proto-Samoyed *pu[??] 'knee') and *liwi 'bone' (cf. SSA 2 : 393; EES 400; UEW 393).
45. to know         Est. teada [teadma]  Vot. taata   L-L. tiita
[[phrase omitted]]  Soi. tiittaa         Fin. tietaa

Proto-Finnic *teeta- 'to know' is derived from *tee 'road, path' (SSA 3: 289). The hypothesis of a Germanic origin (EES 519) is unacceptable (LAGLOS III 292-293). This word replaces Proto-Uralic *tumti- 'to know', whose Finnic reflex *tunte- means rather 'to feel; to recognize' (cf. SSA 3 : 327; UEW 536-537).
46. leaf            Est. leht   Vot. lehto  L-L. lehtI
[[phrase omitted]]  Soi. lehti  Fin. lehti

Proto-Finnic *lehti 'leaf' goes back to late dialectal Uralic (West Uralic and Mari) *lesti 'leaf', apparently of Balto-Slavic origin (cf. SSA 2 : 58-59; EES 234; UEW 689).
47. to lie          Est. lamada [lamama]  Vot. lezia  L-L. lezze
[[phrase omitted]]  Soi. lessia           Fin. maata

The Finnic languages usually use the verb 'to be' to denote the position of an object and do not express the difference between 'to lie' and 'to stand'. Therefore, the Proto-Finnic word for 'to lie' cannot be convincingly reconstructed. The Estonian word is derived from the Proto-Finnic noun/adjective *lama 'lying', borrowed from Germanic (cf. SSA 2 : 42; EES 225-226; LAGLOS II 165), Votic and Ingrian borrowed this verb from Russian, and Finnish uses the reflex of Proto-Finnic *maka- 'to sleep' (q.v.). In Estonian, there is another word for 'to lie', lebada [lebama], that is less general than lamada [lamama]. It goes back to Proto-Finnic *lepa-, for which two mutually contradictory and phonetically problematic Germanic etymologies were proposed (cf. SSA 2 : 67-68; EES 232; LAGLOS II 198-199). Proto-Uralic root *kuji- 'to lie' has no reflexes in West Uralic (cf. UEW 197). Votic verbs lammoa and lamota 'lie about, to rest lying' are peripheral: they are very restricted dialectally (VKS 574) and are not known to contemporary speakers.
48. liver           Est. maks                   Vot. mahs[??]  L-L. maksA
[[phrase omitted]]  Soi. leiba-liha ~ petsonka  Fin. maksa

Proto-Finnic *maksa 'liver' goes back to Proto-Uralic *miksa 'liver' (cf. SSA 2 : 142; EES 273; UEW 264). This is one of the most stable words in the Uralic basic lexicon. However, in Soikkola Ingrian the word maksa means 'fish liver' or (in plural) 'internal apparatus'. The meaning 'liver' is expressed either by a descriptive compound leiba-liha (literally: bread meat) or by a Russian borrowing.
49. long            Est. pikk   Vot. pitts[??] ~ pittsi  L-L. pitkA
[[phrase omitted]]  Soi. pitka  Fin. pitka

Proto-Finnic *pitka 'long' goes back to Proto-Uralic *pi[??]-ka 'long', from the root *pi[??]i (cf. SSA 2 : 377; EES 368; UEW 377-378).
50. louse           Est. tai  Vot. taj  L-L. tai
[[phrase omitted]]  Soi. tai  Fin. tai

Proto-Finnic *tai 'louse' goes back to Proto-Uralic *taji 'louse' (cf. SSA 3 : 353; EES 565; UEW 515).
51. man (male)      Est. mees  Vot. meez  L-L. mies
[[phrase omitted]]  Soi. meez  Fin. mies

Proto-Finnic *mees 'man' has no acceptable etymology (SSA 2 : 166). The hypothesis of a Germanic origin is not likely (EES 279; LAGLOS II 263). The shape CVVC is anomalous from the point of view of Finnic phonotactics.
52. man (person)    Est. inimene           Vot. inimin  L-L. ihmin
[[phrase omitted]]  Soi. ihmiin ~ ilmihin   ~ inemin    Fin. ihminen

The phonetic reconstruction of Proto-Finnic *inehminen 'person' is tentative (cf. SSA 1 : 221; EES 92-93). Forms like Finnish ihminen are probably due to contamination with Proto-Finnic *imeh 'miracle'. The word has no acceptable etymology; attempts to derive it from various Indo-European sources are unconvincing. At the same time, comparison with the Mordvin word for 'guest' (UEW 627-628) faces multiple irregularities. In Soikkola Ingrian, there are variants of this word; the choice depends on the particular idiolect.
53. many, a lot of  Est. palju  Vot. pallo           L-L. paljo
[[phrase omitted]]  Soi. paljo  Fin. paljon ~ monta

The etymology of Proto-Finnic *paljo 'many' remains disputed (cf. SSA 2 : 301; EES 350). Potential Uralic comparisons are dubious (cf. UEW 350-351). The Germanic origin is not accepted in LAGLOS III 22. Saarikivi (2009 : 146-147) suggests a Slavic etymology. In Finnish there is also a word monta (partitive of moni), that may be viewed as an archaism. It goes back to Proto-Finnic *moni 'many' < Proto-Uralic *moni, the reflex of which is preserved also in Permic. The Germanic origin of this word cannot be accepted (LAGLOS II 265-266). It is not clear which word is more general in Finnish (both words sound good in the test sentences). In Estonian, Votic and Ingrian, the reflex of *moni either has a different meaning or is not the main word for 'many'.
54. meat            Est. liha  Vot. liha  L-L. liha
[[phrase omitted]]  Soi. liha  Fin. liha

Proto-Finnic *osa 'meat', cognate with Proto-Saami *oance 'meat', is preserved only in Livonian. In other languages, this word is replaced by Proto-Finnic *liha (cf. SSA 2 : 72; EES 238-239), whose Livonian reflex preserved the original meaning 'body; (human) flesh'.
55. moon            Est. kuu  Vot. kuu  L-L. kuu
[[phrase omitted]]  Soi. kuu  Fin. kuu

Proto-Finnic *kuu 'moon' goes back to Proto-Uralic *kiwi or *ki[??]i 'moon' (cf. SSA 1 : 455-456; EES 196-197; UEW 211-212).
56. mountain        Est. magi  Vot. matsi  L-L. maki
[[phrase omitted]]  Soi. magi  Fin. vuori

Proto-Finnic *voori 'mountain', going back to Proto-Uralic *wari 'hill, mountain' (cf. SSA 3 : 475; UEW 571), is preserved only in Finnish, where it is opposed to maki 'hill'. Other languages have lost the inherited word for 'mountain' and replaced it with the word for 'hill'. Proto-Finnic *maki 'hill' goes back to Proto-Uralic *maki, also preserved in Khanty, where its reflex means 'tussock' (cf. SSA 2 : 191; EES 294; UEW 266).
57. mouth           Est. suu  Vot. suu  L-L. suu
[[phrase omitted]]  Soi. suu  Fin. suu

Proto-Finnic *suu 'mouth' goes back to Proto-Uralic *suwi 'throat, mouth' (cf. SSA 3 : 223-224; EES 491; UEW 492-493).
58. nail            Est. kuus  Vot. tsunsi  L-L. kunsI
[[phrase omitted]]  Soi. kunz  Fin. kynsi

Proto-Finnic *kunci 'claw, nail' goes back to Proto-Uralic *kunci 'claw, nail' (cf. SSA 1 : 464; EES 216; UEW 157).
59. name            Est. nimi  Vot. nimi ~ imi  L-L. nimi
[[phrase omitted]]  Soi. imi   Fin. nimi

Proto-Finnic *nimi 'name' goes back to Proto-Uralic *nimi 'name' (cf. SSA 2 : 222; EES 313; UEW 305). Votic and Ingrian show variation between nimi and the variant imi, whose origin is not obvious (possibly it results from a contamination of nimi and Russian [phrase omitted] 'name'). In Soikkola Ingrian, the variant imi is the most prevalent form; in Luuditsa Votic both variants are used; for Lower Luga Ingrian the variant nimi looks more typical.
60. neck            Est. kael   Vot. kag[??]  L-L. kaglA
[[phrase omitted]]  Soi. kagla  Fin. kaula

Proto-Finnic *kakla 'neck' is borrowed from Baltic (SSA 1 : 331; EES 113). This word replaces Proto-Uralic *sepa 'neck, collar', preserved in Finnic with the meanings 'collar, front part of sledge, etc.' (cf. SSA 3 : 169-170; UEW 473-474).
61. new             Est. uus  Vot. uus(i)  L-L. uusI
[[phrase omitted]]  Soi. uuz  Fin. uusi

Proto-Finnic *uuci 'new' goes back to Proto-Uralic *wu[??]i 'new' (cf. SSA 3 : 381; EES 581; UEW 587).
62. night           Est. oo  Vot. uu  L-L. uo
[[phrase omitted]]  Soi. oo  Fin. yo

Proto-Finnic *oo 'night' goes back to Proto-Uralic *uji or *eji 'night' (cf. SSA 3 : 493; EES 633; UEW 72).
63. nose [НОС]  Est. nina  Vot. nena  L-L. nena
                Soi. nena  Fin. nena

Proto-Finnic *nena ~ *nena ~ *nana 'nose' is related to Proto-Saami *nuone 'nose' (cf. SSA 2 : 213; EES 313-314).
64. not [Не]  Est. ei  Vot. eb  L-L. ei
              Soi. ei  Fin. ei

Proto-Finnic negative verb *e- goes back to the Proto-Uralic negative verb *e- (cf. SSA 1 : 99; EES 59; UEW 68-70).
65. one             Est. uks  Vot. uhs(i)  L-L. uks
[[phrase omitted]]  Soi. uks  Fin. yksi

Proto-Finnic *ukci 'one' goes back to the Proto-Uralic word for 'one', attested from Finnic to Mansi (cf. SSA 3 : 489; EES 635; UEW 81). However, the exact phonetic reconstruction of the Proto-Uralic form is difficult.
66. rain            Est. vihm   Vot. vihm[??]  L-L. vihmA
[[phrase omitted]]  Soi. vihma  Fin. sade

Proto-Finnic *vihma 'rain' is related to Proto-Saami *vesme 'light snow' (cf. SSA 3 : 438; EES 601). In Finnish, vihma means 'drizzle', and a derivative of Proto-Finnic *sata- 'to rain, to snow' (< Proto-Uralic *sa[??]a- 'to rain') is used as the main word for 'rain' instead (cf. SSA 3 : 141, 160; EES 455-456).
67. red             Est. punane   Vot. kauniz    L-L. punnain
[[phrase omitted]]  Soi. punnain  Fin. punainen

Proto-Finnic *punain??n 'red' is derived from Proto-Finnic *puna 'red colour', a reflex of Proto-Uralic *puna 'hair, fur' (cf. SSA 2 : 426-427; EES 137; UEW 402). The semantic development may look strange but is in fact understandable: words for 'hair' in Eurasia frequently have the additional meaning 'colour'. An intermediate meaning 'hair colour (of animals)' is actually attested for reflexes of PU *puna in Hill Mari and South Khanty. The following path of semantic development can be supposed in this case: 'hair, fur' > '(hair) colour' > 'red colour'. In Votic, the main word for 'red' is kauniz, going back to Proto-Finnic *kaunis 'beautiful', borrowed from Germanic (LAGLOS II 62). The semantic shift 'beautiful' > 'red' occurred under the influence of Russian [phrase omitted] 'red/beautiful'. (8)
68. road            Est. tee  Vot. tee  L-L. tie
[[phrase omitted]]  Soi. tee  Fin. tie

Proto-Finnic *tee 'road' is apparently related to Komi tuj 'road', although the reconstruction of a common protoform is difficult (cf. SSA 3 : 288; EES 520; UEW 794).
69. root            Est. juur   Vot. juuri  L-L. juurI
[[phrase omitted]]  Soi. juuri  Fin. juuri

Proto-Finnic *juuri 'root' goes back to Proto West Uralic *juwri 'root', attested also in Mordvin (cf. SSA 1 : 253; EES 102; UEW 639). This word replaces Proto-Uralic *wanca 'root' (cf. UEW 548-549).
70. round           Est. ummargune  Vot. ummerkajn  L-L. ummerkain
[[phrase omitted]]  Soi. umberlain  Fin. pyorea

Proto-Finnic *poore[??]a 'round', reflected in Finnish, has cognates with the same meaning in Ob-Ugric languages and goes back to Proto-Uralic *pe[??]ira 'round' (cf. SSA 2 : 455; EES 406; UEW 372-373). In the other languages in our sample, the word for 'round' is derived from Proto-Finnic *umpara, a Germanic loanword with a Finnic suffix *-ra (cf. SSA 3 : 491; EES 636-637; LAGLOS III 426-427). There is no difference between '3D round' and '2D round'.
71. sand            Est. liiv   Vot. liiv[??]  L-L. liivA
[[phrase omitted]]  Soi. liiva  Fin. hiekka

Proto-Finnic *liiva 'sand' may be a Baltic or Germanic loan (SSA 2 : 205; EES 240; LAGLOS II 207). Although now Ingrian is the only North Finnic language that has this word for 'sand', the Proto-Finnic status of the word is confirmed by the fact that it was borrowed from a lost North Finnic idiom into Permic languages: Komi lia, Udmurt luo 'sand' (Saarikivi 2006 : 36). In Finnish, a specific word hiekka is used instead (SSA 1 : 160).
72. to say          Est. utelda ~ oelda [utlema]  Vot. jut[??]  L-L. sanno
[[phrase omitted]]  Soi. sannoa                   Fin. sanoa

Proto-Finnic *sano- ~ *seno- 'to say' is derived from *sana ~ *sena 'word' (cf. SSA 3 : 155; EES 494). In Estonian and Votic, this word is replaced by the reflexes of Proto-Finnic *jutta- 'to talk; to tell, narrate', going back to Proto-Uralic *jupta- 'to tell, narrate' (cf. SSA 1 : 252; EES 102, 627; UEW 104; Aikio 2002 : 48). The Estonian verb demonstrates an irregular change *ju- > u-. The original anlaut is preserved in the Estonian noun jutt 'story; talk'. The Estonian reflex of *seno- has a clearly secondary meaning 'to scold'.
73. to see          Est. naha [nagema]  Vot. nahh[??]  L-L. naha
[[phrase omitted]]  Soi. nah(h)a        Fin. nahda

Proto-Finnic *nake- 'to see' goes back to Proto-Uralic *naki- 'to see' (cf. SSA 2 : 249; EES 326-327; UEW 302).
74. seed            Est. seeme   Vot. seemene  L-L. siemen
[[phrase omitted]]  Soi. seemen  Fin. siemen

Proto-Finnic *seemen 'seed' is a Baltic borrowing (SSA 3 : 173; EES 464).
75. to sit          Est. istuda [istuma]  Vot. issua  L-L. isto
[[phrase omitted]]  Soi. istua            Fin. istua

Proto-Finnic *istu- 'to sit' goes back to Proto West Uralic *isa- 'to sit', which may be an Indo-European borrowing (cf. SSA 1 : 229; EES 94; UEW 629).
76. skin            Est. nahk   Vot. nahk[??]  L-L. nahkA
[[phrase omitted]]  Soi. nahka  Fin. iho

Proto-Finnic *iho 'skin' goes back to Proto-Uralic *isa 'skin, surface' and is preserved in Finnish (cf. SSA 1 : 222; EES 89; UEW 636-637). Other idioms use Proto-Finnic *nahka 'skin, hide', borrowed from Germanic (SSA 2 : 202; EES 306; LAGLOS II 287-288). Finnish also has the word nahka, but with a more specific meaning (mainly 'animal skin, fur', although in colloquial speech it can easily be used in the test contexts).
77. to sleep        Est. magada [magama]  Vot. magat[??]  L-L. maatA
[[phrase omitted]]  Soi. maada            Fin. nukkua

The Germanic etymology of Proto-Finnic *maka- 'to sleep', pace LAGLOS, does not seem convincing to us (SSA 2 : 136; EES 270; LAGLOS II 237-238). In Finnish, this word means 'to lie' (see above) and the meaning 'to sleep' is expressed by the reflex of Proto-Finnic *nukku- 'to doze, to drowse' (SSA 2 : 237), cognate with Proto-Saami *nokku- 'to doze, to drowse' (SSA 2 : 237). The Proto-Uralic word for 'to sleep' was *a[??]i- (cf. UEW 334; Aikio 2015 : 51).
78. small, little   Est. vaike      Vot. peen(i)  L-L. pienI
[[phrase omitted]]  Soi. pikkarain  Fin. pieni    (~ pikkarain)

Proto-Finnic *peeni 'small' is preserved in Votic and Finnish (cf. SSA 2 : 348; EES 358). The Germanic etymology of this word is not convincing (LAGLOS III 55). This word exists in Estonian but rather means 'thin, fine'. In Soikkola Ingrian, the word pikkarain (that also exists in Finnish) predominates, but in Lower Luga it is not the most prevalent variant. In Votic, pikkerajn is less common than peen(i). According to SSA 2 : 361, this word is a hypocoristic byform of *peeni. In Estonian, the main word for 'small' is derived from Proto-Finnic *vaha 'small', which may go back to Proto West Uralic (cf. SSA 3 : 478; EES 618-619; UEW 818-819). Germanic etymologies, proposed for this word, are dubious (LAGLOS III 420). The semantic difference between *peeni and *vaha on the Proto-Finnic level remains elusive.
79. smoke           Est. suits  Vot. savvu  L-L. savvU
[[phrase omitted]]  Soi. savvu  Fin. savu

Proto-Finnic *savu 'smoke' goes back to Proto West Uralic *siwi 'smoke' (cf. SSA 3 : 163; UEW 754). The Estonian word is a reflex of Proto-Finnic *suiccu 'smoke', with potential cognates in Saami meaning 'to rise' (cf. SSA 3 : 208; EES 486). This word is also attested in Finnish dialects. Since reflexes of *savu are the main words for 'smoke' in Livonian and South Estonian, there can be no doubt that the main Proto-Finnic word for 'smoke' was *savu.
80. to stand        Est. seista [seisma]  Vot. sejss[??]  L-L. seissa
[[phrase omitted]]  Soi. seissa           Fin. seisoa

Proto-Finnic *saisa- 'to stand' goes back to Proto-Uralic *sa[??]sa- 'to stand' (cf. SSA 3 : 164-165; EES 466; UEW 431-432).
81. star            Est. taht   Vot. tahti  L-L. tahtI
[[phrase omitted]]  Soi. tahti  Fin. tahti

Proto-Finnic *tahti 'star' is related to Saami and Mordvin words for 'star' and the Mari word for 'sign' (cf. SSA 3 : 353; EES 565; UEW 793-794). However, irregular sound correspondences between these forms suggest that the word was borrowed from an unknown substrate separately in already differentiated branches of West Uralic (Aikio 2015 : 43-47). This word replaced Proto-Uralic *kunsi 'star' (cf. UEW 210-211).
82. stone           Est. kivi  Vot. tsivi  L-L. kivi
[[phrase omitted]]  Soi. kivi  Fin. kivi

Proto-Finnic *kivi 'stone' goes back to Proto-Uralic *kiwi 'stone' (cf. SSA 1 : 378; EES 163-164; UEW 163-164).
83. sun             Est. paike   Vot. pajvud   L-L. paivukkain
[[phrase omitted]]  Soi. paivud  Fin. aurinko

Proto-Finnic *paiva 'sun, day' goes back to Proto-Uralic *pajwa, whose reflexes mean 'sun, day' in Saami and 'heat, warm' in Samoyed (cf. SSA 2 : 456; EES 403; UEW 360). Finnish paiva means 'day' only. In the meaning 'sun', the word is replaced by aurinko, which has no acceptable etymology (SSA 1 : 90).
84. to swim         Est. ujuda [ujuma]  Vot. ujjua  L-L. ujjo
[[phrase omitted],  Soi. ujjua          Fin. uida
[phrase omitted]]

Proto-Finnic *ui- 'to swim' goes back to Proto-Uralic *uji- 'to swim' (cf. SSA 3 : 368; EES 576-577; UEW 542).
85. tail            Est. saba   Vot. ant[??]  L-L. handA
[[phrase omitted]]  Soi. handa  Fin. hanta

Proto-Finnic *hanta 'tail' has no acceptable etymology: supposed cognates in other branches of Uralic show irregular correspondences (cf. SSA 1 : 208; EES 85; UEW 56). In Estonian, it also exists but the main word for 'tail' was borrowed from the Baltic languages (EES 455). The Proto-Uralic word for 'tail' was *ponci.
86. that [ТОТ]  Est. see  Vot. see (9)  L-L. see
                Soi. see  Fin. tuo

Both Proto-Finnic *se 'that' (SSA 3 : 163; EES 463-464; UEW 33-34) and Proto-Finnic *too 'that' (cf. SSA 3 : 327-328; EES 538; UEW 526-528) have Uralic pedigree. However, it is difficult to reconstruct the Proto-Finnic demonstrative system. Finnic dialects have different systems: monopartite, bipartite or tripartite. Standard Estonian has a formally bipartite system see ~ too but it functions rather as a monopartite system where see means 'this/that' and in the contrastive contexts the word teine 'other' is usually used. Finnish has a tripartite system tama ~ tuo ~ se, and in the test contexts tuo is preferable. Votic and Ingrian have bipartite systems but see is often used in the contexts for 'this'.
87. this [ЭТОТ]  Est. see   Vot. kase  L-L. tama
                 Soi. tama  Fin. tama

According to Laanest (1982 : 196), Votic kase results from the merging of some interjection with se. Tama is a Uralic word (SSA 3 : 355; UEW 513-515). Estonian tema and Votic tama are 3Sg personal pronouns but not demonstrative pronouns. Since the typical path of diachronic development leads from demonstrative pronouns to personal pronouns, but not vice versa, we can suppose that Proto-Finnic *tama 'this' was a demonstrative (see comments on the previous word).
88. tongue          Est. keel   Vot. tseeli  L-L. kielI
[[phrase omitted]]  Soi. keeli  Fin. kieli

Proto-Finnic *keeli 'tongue' goes back to Proto-Uralic *kali 'tongue' (cf. SSA 1 : 353; EES 140; UEW 144-145).
89. tooth           Est. hammas  Vot. ammez   L-L. hammaz
[[phrase omitted]]  Soi. hammaz  Fin. hammas

Proto-Finnic *hambas 'tooth' is a Baltic loanword (SSA 1 : 136; EES 68-69). This word replaced Proto-Uralic *pi[??]i 'tooth', whose Finnic reflex *pii means 'tooth in a saw, rake etc.' (cf. SSA 2 : 352; UEW 382).
90. tree            Est. puu  Vot. puu  L-L. puu
[[phrase omitted]]  Soi. puu  Fin. puu

Proto-Finnic *puu 'tree' goes back to Proto-Uralic *pawi 'tree' (cf. SSA 2 : 443-444; EES 396-397; UEW 410-411).
91. two             Est. kaks  Vot. kahs(i)  L-L. kaks
[[phrase omitted]]  Soi. kaks  Fin. kaksi

Proto-Finnic *kakci 'two' goes back to the Proto-Uralic numeral 'two', whose exact phonetic shape is hard to reconstruct (cf. SSA 1 : 282; EES 120; UEW 118-119).
92. warm            Est. soe     Vot. sooj[??]  L-L. soojA
[[phrase omitted]]  Soi. lammaa  Fin. lammin

Proto-Finnic *lambin 'warm' goes back to Proto-Uralic *lampi 'warm' (cf. SSA 2 : 124; EES 263; UEW 685; Aikio 2002 : 13). The word lammi exists in Estonian dialects (EES 263), and the same root is known in Votic (mostly through the word lammitta(a) 'to stoke', VKS 657). Other idioms use the root *sooja 'shelter; warm', borrowed from an Iranian word for 'shade' (cf. SSA 3 : 214; EES 478; UEW 748-749). In Finnish, there is a word suoja but it is not the main word for 'warm' (it is used when speaking about above-zero weather). In Ingrian, the same root is observed only in the Lower Luga dialect (Nirvi 1971 : 542).
93. water [water]  Est. vesi  Vot. vesi  L-L. vesi
                   Soi. vezi  Fin. vesi

Proto-Finnic *veci 'water' goes back to Proto-Uralic *weti 'water' (cf. SSA 3 : 429; EES 599; UEW 570).
94. we              Est. meie ~ me  Vot. muu  L-L. muo
[[phrase omitted]]  Soi. moo        Fin. me

Proto-Finnic *me(k) 'we' goes back to Proto-Uralic *me(-) 'we' (cf. SSA 2 : 156; EES 279; UEW 294-295). In Estonian, there is variation between a long and a short form.
95. what            Est. mis   Vot. mika  L-L. mika
[[phrase omitted]]  Soi. miga  Fin. mika

Proto-Finnic *mi(ka) 'what' goes back to Proto-Uralic *mi ~ *mi 'what' (cf. SSA 2 : 164; UEW 296). In Estonian, the formative -s originates from a demonstrative pronoun see (EES 282-283).
96. white           Est. valge    Vot. va[??]ka   L-L. valke
[[phrase omitted]]  Soi. valkkia  Fin. valkoinen

Proto-Finnic *valke[??]da 'white' goes back to Proto-Uralic *wilki 'light' (cf. SSA 3 : 399-400; EES 588; UEW 554-555; Aikio 2015 : 59). In Finnish, the derivate with the adjectival suffix valkoinen looks more natural in the test contexts than valkea 'white'.
97. who [КТО]  Est. kes  Vot. tsen  L-L. ken
               Soi. ken  Fin. kuka

Proto-Finnic *ken 'who' goes back to Proto-Uralic *ke(-) 'who' (cf. SSA 1 : 342-343; EES 145-146; UEW 140-141). The Estonian word has the formative -s, which is absent from three of the idioms; however, the variant ken is observed in Estonian dialects (EES 145-146). In Finnish, the main word for 'who' is kuka < Proto-Uralic interrogative stem *ku(-), used in words for 'where', 'which', etc. (SSA 1 : 423-424; UEW 191-192), but the word ken also exists as a poetic variant.
98. woman           Est. naine  Vot. najn    L-L. nain
[[phrase omitted]]  Soi. nain   Fin. nainen

Proto-Finnic *nainen 'woman' (cf. SSA 2 : 202; EES 306) is derived from a root *naa-, seen also in naaras 'female' (SSA 2 : 200-201). This root goes back to Proto-Uralic *naxi 'woman' (Janhunen 1981 : 245-246).
99. yellow          Est. kollane   Vot. ke[??]tejn  L-L. keltain
[[phrase omitted]]  Soi. kelttain  Fin. keltainen

Proto-Finnic *keltainen 'yellow' consists of the root borrowed from Baltic, and an adjectival suffix (SSA 1 : 342; EES 172-173).
100. you (thou)     Est. sina (~ sa)  Vot. sia          L-L. sia
[[phrase omitted]]  Soi. sia          Fin. sina (~ sa)

Proto-Finnic *cina 'thou' goes back to Proto-Uralic *tin 'thou' (cf. SSA 3 : 184; EES 473-474; UEW 539). In Estonian and Finnish, there is variation between a long and a short form.
101. far            Est. kaugel       Vot. kauka[??]  L-L. kaukall
[[phrase omitted]]  Soi. ettaa/ettal  Fin. kaukana

Proto-Finnic *kauka- 'far' is a Germanic loanword (Aikio 2000; EES 137). Supposed cognates in Mordvin and Khanty (SSA 1 : 330-331; UEW 132) are phonetically incompatible with the Finnic word. Soikkola Ingrian uses a reflex of Proto-Finnic *eta- 'far', going back to Proto West Uralic *eca- 'far' (cf. SSA 1 : 109-110; UEW 624). This word is the main word for 'far' also in Veps. It is hard to say which of these two words was the main Proto-Finnic word for 'far'.
102. heavy          Est. raske   Vot. rankk[??]  L-L. rankkA
[[phrase omitted]]  Soi. raskaz  Fin. raskas

There are two different words: Proto-Finnic *rankka 'heavy', apparently borrowed from Germanic (cf. SSA 3 : 47; EES 419, 445; LAGLOS III 124-125), and Proto-Finnic *raskas 'heavy' (cf. SSA 3 : 52; EES 419-420). The former became dominant in Votic and Lower Luga Ingrian, the latter in three other idioms. Estonian rank is more bookish than raske. It is difficult to tell which word was the main word for 'heavy' in Proto-Finnic.
103. near           Est. lahedal  Vot. litsi    L-L. liki
[[phrase omitted]]  Soi. ligi     Fin. lahella

Proto-Finnic *lahe- 'near', going back to Proto-Uralic *lasi 'near', is preserved in Estonian and Finnish (cf. SSA 2 : 122; EES 262; UEW 687; Aikio 2002 : 48). Votic and Ingrian use another root, Proto-Finnic *liki 'near', cognate with Proto-Saami *leke 'near' (cf. SSA 2 : 76; EES 238). Estonian ligidal and Finnish liki ~ likella are synonymic forms but are less general or neutral. The original semantic difference between *lahe- and *liki in Proto-Finnic is not clear.
104. salt           Est. sool   Vot. soo[??]  L-L. suolA
[[phrase omitted]]  Soi. soola  Fin. suola

Proto-Finnic *soola 'salt' is borrowed from an Indo-European language, most probably from Baltic (cf. SSA 3 : 214-215; EES 480). Similar loanwords exist in other Uralic languages (UEW 750-751), but the phonetic shape of the Finnic word (long vowel in an a-stem) shows that it was borrowed independently.
105. short          Est. luhike  Vot. luhud  L-L. luhud
[[phrase omitted]]  Soi. luhud   Fin. lyhyt

Proto-Finnic *luhut 'short' has no acceptable etymology (cf. SSA 2 : 117; EES 266).
106. snake          Est. uss   Vot. mato    L-L. mato
[[phrase omitted]]  Soi. mado  Fin. kaarme

Although Proto-Finnic *kuu 'viper, snake', going back to Proto-Uralic *kuji 'snake', retains the meaning 'snake' in Karelian, Veps, and Livonian dialects (cf. SSA 1 : 467; UEW 154-155), these are hardly the main words for 'snake' in the respective idioms. Proto-Finnic *mato 'snake, worm' was perhaps the main word for 'snake' already in the proto-language. It is possibly a Germanic borrowing. According to an alternative etymology, *mato is cognate with Proto-Saami *muoce 'moth' (cf. SSA 2 : 154; EES 270; LAGLOS II 255). In Finnish, this word means 'worm' (see below), and another word, ultimately borrowed from Baltic, is used for 'snake' (SSA 1 : 484). In Estonian, the word madu means 'snake', but a more neutral word is uss 'snake, worm'. The etymology of uss is not clear but it is possibly a Russian borrowing (EES 580).
107a. thin (2D)     Est. ohuke              Vot. hojkk[??]  L-L. hoikkA
[[phrase omitted]]  Soi. hoikka ~ hoikkain  Fin. ohut

Two words can be reconstructed: Proto-Finnic *ohut 'thin' (cf. SSA 2 : 260; EES 625) and Proto-Finnic *hoikka 'thin' (cf. SSA 1 : 169). The former goes back to Proto-Uralic *woksi 'thin' ([phrase omitted] 2011 : 110; Luobbal Sammol Sammol Ante (Aikio) 2014 : 10-11), the latter has no known etymology. It is hard to reconstruct the semantic difference between these words on the Proto-Finnic level. The word hoikka also exists in Finnish but is not predominant there, while the reflexes of *ohut are not predominant in Votic and Ingrian. In Soikkola Ingrian, there is a variant with an adjectival suffix.
107b. thin (1D)     Est. peenike  Vot. hojkk[??]  L-L. hoikkA
[[phrase omitted]]  Soi. hoikka   Fin. ohut

In Votic, Finnish and Lower Luga Ingrian there is no difference between '2D thin' and '1D thin'. In Soikkola Ingrian, the variant with the suffix is not typical in the test contexts. In Estonian, the derivate from peen 'small' (see above) is more typical in the test contexts (the form peen without a suffix is also possible in the test contexts but peenike looks more neutral).
108. wind           Est. tuul   Vot. tuuli  L-L. tuulI
[[phrase omitted]]  Soi. tuuli  Fin. tuuli

Proto-Finnic *tuuli 'wind' goes back to Proto-Uralic *tiwli 'wind' (cf. SSA 3 : 340; EES 558-559; UEW 800).
109. worm           Est. uss               Vot. matokkejn ~ mato  L-L. matokkain ~ mato
[[phrase omitted]]  Soi. madokkain ~ mado  Fin. mato

The distinction snake vs worm is not typical for Finnic languages. Among the five analysed idioms only Finnish distinguishes these two notions, while in the other languages this distinction is not relevant. Thus, we can reconstruct Proto-Finnic *mato 'snake, worm'. In Votic and Ingrian, a derivate with the diminutive suffix can be used to stress that it is a worm but not a (big) snake. (See comments to the word for 'snake', #106.)
110. year           Est. aasta  Vot. voosi  L-L. vuosI ~ aastaikA
[[phrase omitted]]  Soi. vooz   Fin. vuosi

Proto-Finnic *vooci 'year' goes back to Proto-Uralic *i[??]i 'year' (cf. SSA 3 : 476; EES 612-613; UEW 335-336). In Estonian, this word means 'harvest' and another word is used for 'year' (etymologically a compound *ai[gamma]astaaika built from the forms of *aika 'time', see EES 42). In Lower Luga Ingrian, both words are used; the choice depends on the particular idiolect.

3. Discussion

In the current section we formulate some observations on the compiled wordlists. These are preliminary observations that do not purport to be a comprehensive analysis of the data.

3.1. The analysed set of five languages is rather homogeneous. Among 111 items, 77 (69%) have the same word in all five varieties. There are no items where all five idioms use different roots, nor are there items with four different roots. Only three items in the list show three roots: #47 'to lie' Est. lamada vs Fin. maata vs Vot. lezia, L-L. lezze, Soi. lessia; #106 'snake' Est. uss vs Fin. kaarme vs Vot. mato, L-L. mato, Soi. mado; and #107b 'thin' Est. peenike vs Fin. ohut vs Vot. hojkk[??], L-L. hoikkA, Soi. hoikka. In all three items the opposition is organized in the same way: Estonian and Finnish each have a root of their own, and both are opposed to the three minor varieties, which share one root.

In all other cases, either one language has a root that is different from the other languages (24 items) or two languages differ from the other three (7 items).
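The classification used in 3.1 can be made explicit by coding each item with one root label per variety and reading off the sizes of the resulting groups. The following Python fragment is only a minimal sketch under stated assumptions, not the procedure used for this article: the root labels are simplified stand-ins for the etymologies given in Section 2 ("RUS" marks the Russian loan), the five-item sample is purely illustrative, and the names SAMPLE and partition_pattern are hypothetical.

```python
from collections import Counter

VARIETIES = ("Est", "Vot", "L-L", "Soi", "Fin")

# Illustrative sample only: one simplified root label per variety,
# taken from the comments on the respective items in Section 2.
SAMPLE = {
    "47 to lie":   {"Est": "lama", "Vot": "RUS", "L-L": "RUS", "Soi": "RUS", "Fin": "maka"},
    "54 meat":     {"Est": "liha", "Vot": "liha", "L-L": "liha", "Soi": "liha", "Fin": "liha"},
    "56 mountain": {"Est": "maki", "Vot": "maki", "L-L": "maki", "Soi": "maki", "Fin": "voori"},
    "92 warm":     {"Est": "sooja", "Vot": "sooja", "L-L": "sooja", "Soi": "lambin", "Fin": "lambin"},
    "106 snake":   {"Est": "uss", "Vot": "mato", "L-L": "mato", "Soi": "mato", "Fin": "kaarme"},
}


def partition_pattern(roots):
    """Return the sizes of the root groups, largest first.

    (5,) means all five varieties agree, (4, 1) one deviating variety,
    (3, 2) two against three, (3, 1, 1) three different roots.
    """
    group_sizes = Counter(roots[v] for v in VARIETIES).values()
    return tuple(sorted(group_sizes, reverse=True))


patterns = {item: partition_pattern(roots) for item, roots in SAMPLE.items()}
tally = Counter(patterns.values())

for item, pattern in patterns.items():
    print(item, pattern)
```

Applied to the full list, the figures reported above correspond to 77 items of pattern (5,), 24 of pattern (4, 1), 7 of pattern (3, 2) and 3 of pattern (3, 1, 1). Items with competing variants in one variety (such as #110 in Lower Luga Ingrian) require an explicit coding decision that this sketch does not model.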

3.2. The three minor varieties are rather uniform; the two major languages are often different from the minor ones.

The three minor Finnic varieties do not demonstrate significant diversity. In only 8 cases (i.e. 7%) do their roots differ. Soikkola Ingrian is opposed to all other varieties in #48 'liver' and #101 'far'; Votic differs from all other varieties in #67 'red' (and this difference would not hold if other Votic varieties were taken into account); in two cases Votic agrees only with Estonian (#72 'to say' and #87 'this'); in one case Votic and Lower Luga Ingrian differ from the other varieties, including Soikkola Ingrian (#102 'heavy'); in another, Soikkola Ingrian and Finnish differ from the other varieties (#92 'warm'); and there is also a specific Estonian word that exists as one of two variants in Lower Luga Ingrian (#110 'year'). To sum up, the number of cases where a minor variety does not share a root with the two other minor varieties is as follows: Votic, 3 items; Soikkola Ingrian, 4 items; Lower Luga Ingrian, 1 item.
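The per-variety totals in this paragraph can be checked by finding, for each of the eight items, the minor variety that stands apart from the two that agree. The Python sketch below is illustrative only. It assumes simplified root labels taken from Section 2, stores roots as sets so that an item with two variants can be represented, and treats a variety with an additional variant (#110 in Lower Luga Ingrian) as deviating; the names MINOR and odd_one_out are hypothetical.

```python
from collections import Counter

# Roots of the eight non-uniform items in the three minor varieties.
# Sets are used so that competing variants within one variety can be kept.
MINOR = {
    "48 liver":  {"Vot": {"maksa"}, "L-L": {"maksa"}, "Soi": {"leiba-liha", "petsonka"}},
    "67 red":    {"Vot": {"kaunis"}, "L-L": {"puna"}, "Soi": {"puna"}},
    "72 to say": {"Vot": {"jutta"}, "L-L": {"sano"}, "Soi": {"sano"}},
    "87 this":   {"Vot": {"kase"}, "L-L": {"tama"}, "Soi": {"tama"}},
    "92 warm":   {"Vot": {"sooja"}, "L-L": {"sooja"}, "Soi": {"lambin"}},
    "101 far":   {"Vot": {"kauka"}, "L-L": {"kauka"}, "Soi": {"eta"}},
    "102 heavy": {"Vot": {"rankka"}, "L-L": {"rankka"}, "Soi": {"raskas"}},
    "110 year":  {"Vot": {"vooci"}, "L-L": {"vooci", "aastaaika"}, "Soi": {"vooci"}},
}


def odd_one_out(roots):
    """Return the variety whose root set differs from the two that agree.

    Returns None if no single variety stands apart.
    """
    for variety, own in roots.items():
        others = [r for v, r in roots.items() if v != variety]
        if others[0] == others[1] and own != others[0]:
            return variety
    return None


deviations = Counter(odd_one_out(roots) for roots in MINOR.values())
share = round(100 * len(MINOR) / 111)  # percentage of the 111-item list

print(dict(deviations), share)
```

Under these coding assumptions the tally is 4 items for Soikkola Ingrian, 3 for Votic and 1 for Lower Luga Ingrian, and 8 of 111 items rounds to 7%.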

However, the situation with the major languages is quite different. There are 11 cases where Estonian has a root that differs from all other languages (#4 'belly', #7 'to bite', #16 'to die', #21 'earth', #47 'to lie', #78 'small', #79 'smoke', #85 'tail', #106 'snake', #107b 'thin (1D)' and #109 'worm') and 5 cases where the Estonian root is found in one other variety and the two are opposed to the other three varieties (#72 'to say', #87 'this', #103 'near', #107a 'thin (2D)', and #110 'year'). In Finnish, a root is opposed to all other varieties in 16 cases (#3 'bark', #5 'big, large', #35 'green', #47 'to lie', #53 'many, a lot of', #56 'mountain', #66 'rain', #70 'round (3D)', #71 'sand', #76 'skin', #77 'to sleep', #83 'sun', #86 'that', #97 'who', #106 'snake' and #107b 'thin (1D)'), and there are 3 cases where the Finnish root is the same as in one of the other varieties but different from all others (#92 'warm', #103 'near', #107a 'thin (2D)').
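The item lists in 3.2 can be reconciled with the totals in 3.1 by simple set arithmetic. The fragment below is a consistency check sketched in Python, not part of the original analysis. The sets are transcribed from the item numbers cited in 3.1 and 3.2 (as strings, because of #107a and #107b), and all variable names are hypothetical.

```python
TOTAL_ITEMS = 111

# One root opposed to all other varieties (from 3.2).
EST_ALONE = {"4", "7", "16", "21", "47", "78", "79", "85", "106", "107b", "109"}
FIN_ALONE = {"3", "5", "35", "47", "53", "56", "66", "70", "71", "76", "77",
             "83", "86", "97", "106", "107b"}
VOT_ALONE = {"67"}
SOI_ALONE = {"48", "101"}

# Root shared with exactly one other variety (from 3.2).
EST_SHARED = {"72", "87", "103", "107a", "110"}
FIN_SHARED = {"92", "103", "107a"}
VOT_LL_SHARED = {"102"}

# Items with three different roots (from 3.1); they appear in both the
# Estonian and the Finnish "alone" lists and must not be counted twice.
THREE_ROOTS = {"47", "106", "107b"}

one_vs_four = (EST_ALONE | FIN_ALONE | VOT_ALONE | SOI_ALONE) - THREE_ROOTS
two_vs_three = EST_SHARED | FIN_SHARED | VOT_LL_SHARED
non_uniform = one_vs_four | two_vs_three | THREE_ROOTS

uniform = TOTAL_ITEMS - len(non_uniform)
uniform_share = round(100 * uniform / TOTAL_ITEMS)

print(len(one_vs_four), len(two_vs_three), len(THREE_ROOTS), uniform, uniform_share)
```

With these sets the check yields 24 items where one variety deviates, 7 items split two against three, and 3 items with three roots, i.e. 34 non-uniform items and 77 uniform ones (69% of 111), in agreement with the figures given in 3.1.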

3.3. Since none of the five varieties is isolated from all others, the compiled lists should be considered also from the point of view of language contact. The most typical directions of borrowing for these varieties are the following:

(a) Votic borrowed many words from Ingrian. It is usually difficult to determine whether a given word is a borrowing from Soikkola Ingrian adapted to Votic phonetics or a borrowing from Lower Luga Ingrian. There are also two types of borrowings: regular borrowings (e.g. Vot. huu 'they', karkku 'cone') and recent "double-layer" borrowings where the Ingrian pronunciation of a word replaced the original Votic variant (e.g. auki 'pike', hiili 'coal', haapezikko 'aspen forest' cf. proper Votic autsi, iili, aapezikko).

In the compiled Swadesh lists, we did not notice obvious borrowings from Votic into Ingrian. (10) If a word, which has specific phonetic differences between Votic and Ingrian, is borrowed from Ingrian into Votic, it usually keeps the Ingrian phonetic shape (e.g. the initial [h], or [k] before a front vowel). However, for all such pairs which appear in our Swadesh lists, Votic has its original phonetic shape so we cannot assume that these words were borrowed: cf. #14 'cold' Soi. kulma, Vot. tsulm[??], #34 'good' Soi. huva, Vot. uva, #36 'hair' Soi. hiuz, Vot. ivuz, #37 'hand' Soi. kazi, Vot. tsasi, #49 'long' Soi. pitka, Vot. pitts[??]~pittsi, #56 'mountain' Soi. magi, Vot. matsi, #58 'nail' Soi. kunz, Vot. tsunsi, #82 'stone' Soi. kivi, Vot. tsivi, #85 'tail' Soi. handa, Vot. ant[??], #88 'tongue' Soi. keeli, Vot. tseeli, #89 'tooth' Soi. hammaz, Vot. ammez, #97 'who' Soi. ken, Vot. tsen, #99 'yellow' Soi. kelttain, Vot. ke[??]tejn, #103 'near' Soi. ligi, Vot. litsi.

Based on this, we can state that the Swadesh list is stable from the point of view of new borrowings.

(b) As Lower Luga Ingrian is a convergent language on the basis of Votic and Ingrian, it could have taken many words from Votic. However, among 111 words of the core lexicon, there is only one possible candidate for such a borrowing: the word #102 rankkA 'heavy' (Vot. rankk[??]). In the three other varieties, another root is observed. We do not have solid evidence that this word came from Votic and was not some dialectal variant in Ingrian.

(c) One can also expect some borrowings from Finnish via the Ingrian Finnish dialect into Votic or into Lower Luga Ingrian. However, we did not notice such candidates in the compiled lists. The same concerns the borrowings from Estonian into Lower Luga Ingrian: usually, they are not from the core lexicon (e.g. kleit < Est. kleit 'dress').

3.4. Diversity in the core lexicon is explained by different reasons. Among the 34 items where the five varieties were not uniform, several groups of words are distinguished.

a. The biggest group arises from quasi-synonymous words that existed in Proto-Finnic. (11) It happened (usually without obvious reason) that one word became predominant in one language and its synonym became predominant in another language. This situation is observed with the following items. Estonian: #4 'belly' Est. koht vs Fin. vatsa (12), #78 'small, little' Est. vaike vs Fin. pieni; Estonian and Finnish: #103 'near' Est. lahedal, Fin. lahella vs Vot. litsi, #107a 'thin (2D)' Est. ohuke, Fin. ohut vs Vot. hojkk[??]; Finnish: #5 'big, large' Fin. iso vs Est. suur, #53 'many, a lot of' Fin. monta vs Est. palju (as well as the alternative Finnish variant paljon), #70 'round (3D)' Fin. pyorea vs Est. ummargune, #76 'skin [[phrase omitted]]' Fin. iho vs Est. nahk, #86 'that' Fin. tuo vs Est. see, #107b 'thin (1D)' Fin. ohut vs Soi. hoikka; Finnish and Soikkola Ingrian: #92 'warm' Fin. lammin, Soi. lammaa vs Est. soe; Soikkola Ingrian: #101 'far' Soi. ettaa/ettal vs Fin. kaukana; Votic and Lower Luga Ingrian: #102 'heavy' Vot. rankk[??], L-L. rankkA vs Fin. raskas.

b. Some words appeared in the list because of a semantic shift. (13) They already existed in Proto-Finnic but in some language(s) they changed their meaning and became predominant for the corresponding item in the Swadesh list. In some cases, the semantic shift happened in a majority of the varieties, so that only one language preserves the original Proto-Finnic root while the others use another root for the item in the list. This is the case, for example, with #56 'mountain' where only Finnish retains the original Finnic root for 'mountain'.

The words that have a different root due to a semantic shift specific to Estonian are #16 'to die' surra, #21 'earth' muld, #79 'smoke' suits, and #107b 'thin(1D)' peenike. Specific to Estonian and Votic is the word #72 'to say': Est. utelda/oelda, Vot. jute[??]. In Votic, the word #67 'red' kauniz shifted its meaning from 'beautiful' to 'red'. The aforementioned word #56 'mountain' underwent a semantic shift in all varieties except Finnish: Est. magi, Vot. matsi, L-L. maki, Soi. magi. Also specific to Finnish are the words #3 'bark' kaarna, #47 'to lie' maata, #77 'to sleep [[phrase omitted]]' nukkua, #97 'who' kuka, and #106 'snake' kaarme. (14)

c. In rare cases a new derivative from the old root traced to Proto-Finnic or earlier becomes a predominant word in a language. In Estonian, such words are #47 'to lie' lamada and #110 'year' aasta. The latter word also appears in Lower Luga Ingrian: aastaikA is one of the variants for 'year' (see Section 2). In all varieties except Finnish, the word #35 'green' is an adjective derived from the noun with the meaning 'grass': Est. roheline, Vot. rohojn, L-L. rohoin, Soi. rohhoin. In Finnish, the noun #66 'rain' sade is derived from the original verb. Possibly, a Soikkola Ingrian compound #48 'liver' leiba-liha built from two Finnic roots should be placed in this group too.

d. In spite of the fact that the core lexicon is relatively stable, new (post-Proto-Finnic) loan words can replace the original words. In Estonian, the word saba (#85 'tail') was borrowed from the Baltic languages, and the word uss (both #106 'snake' and #109 'worm') was possibly borrowed from Russian. In all three minor varieties, the word for #47 'to lie' was borrowed from Russian: Vot. lezia, L-L. lezze, Soi. lessia. In Soikkola Ingrian, one of the variants for #48 'liver' is also a Russian loanword: petsonka.

e. In addition to the described groups, there are two Finnish words with unclear etymology: #71 'sand' hiekka and #83 'sun' aurinko. Also, the word imi (#59 'name'), which is predominant in Soikkola Ingrian and is present in Votic as one of two variants, does not belong unambiguously to one of the proposed groups: it could be either a borrowing from Russian or a contamination (see Section 2).

The distribution of the divergent part of the core lexicon among the discussed groups and varieties is summarized in Table 1. (15)
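The row and column totals of Table 1 can be checked with a few lines of Python. This is only an illustrative sketch: each cell is stored as a pair (plain count, bracketed count), and the reading of the bracketed figures as alternative or uncertain cases is an assumption, as are all variable names. The cell values themselves are copied from Table 1.

```python
VARIETIES = ["Est.", "Vot.", "L-L.", "Soi.", "Fin."]

# Each cell is (plain count, bracketed count); empty cells are (0, 0).
table1 = {
    "a. Synonyms":        [(4, 0), (1, 0), (1, 0), (2, 0), (8, 1)],
    "b. Semantic shift":  [(6, 0), (3, 0), (1, 0), (1, 0), (5, 0)],
    "c. New derivatives": [(3, 0), (1, 0), (1, 1), (1, 1), (1, 0)],
    "d. New borrowings":  [(3, 0), (1, 0), (1, 0), (1, 1), (0, 0)],
    "e. Other":           [(0, 0), (0, 1), (0, 0), (1, 0), (2, 0)],
}

# Totals as printed in the "Overall" column and the "Overall" row.
published_rows = {
    "a. Synonyms": (16, 1),
    "b. Semantic shift": (16, 0),
    "c. New derivatives": (7, 2),
    "d. New borrowings": (6, 1),
    "e. Other": (3, 1),
}
published_columns = {
    "Est.": (16, 0), "Vot.": (6, 1), "L-L.": (4, 1),
    "Soi.": (6, 2), "Fin.": (16, 1),
}


def add_cells(cells):
    """Sum plain and bracketed counts separately."""
    return (sum(c[0] for c in cells), sum(c[1] for c in cells))


row_totals = {name: add_cells(cells) for name, cells in table1.items()}
column_totals = {
    variety: add_cells([cells[i] for cells in table1.values()])
    for i, variety in enumerate(VARIETIES)
}

print(row_totals == published_rows)        # True
print(column_totals == published_columns)  # True
```

Keeping the two kinds of counts apart avoids silently merging the bracketed cases into the plain totals.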

3.5. The distribution of words in the core lexicon does not correlate with borders between Finnic sub-groups.

One might expect that many of the analysed words would oppose southern Finnic languages (Estonian and Votic) and northern Finnic languages (Finnish and the two dialects of Ingrian). In fact, only two items demonstrate such an opposition: #72 'to say' and #87 'this' (the latter case is not pure since Votic uses a more complicated morphological form than Estonian: kase vs se). Even if we take into account the fact that Lower Luga Ingrian was heavily influenced by Votic and possibly should not be unambiguously considered a northern Finnic language, the situation would not change: only one word opposes Finnish and Soikkola Ingrian to the other varieties: #92 'warm'. This fact has two theoretically possible interpretations: (a) the difference between the two Finnic branches is not considerable enough to be reflected in the core lexicon represented in the Swadesh list; (b) in a contact zone between closely related languages, convergent processes can play a part (e.g. one of the existing basic words becomes predominant under the influence of the neighbouring idiom). Both interpretations can only be confirmed through a thorough analysis of individual words, and this task is beyond the scope of the current paper.
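The search for items that oppose one group of varieties to the rest can be sketched as follows. The groupings for the three items are taken from the discussion above; the class labels "A" and "B", the function name and the data layout are assumptions made for illustration, and the toy data covers only three of the 111 items.

```python
VARIETIES = ("Est.", "Vot.", "L-L.", "Soi.", "Fin.")

# Root-class label per variety; the labels are arbitrary placeholders,
# only the groupings (which varieties share a root) follow the text.
items = {
    "#72 'to say'": {"Est.": "A", "Vot.": "A", "L-L.": "B", "Soi.": "B", "Fin.": "B"},
    "#92 'warm'":   {"Est.": "A", "Vot.": "A", "L-L.": "A", "Soi.": "B", "Fin.": "B"},
    "#102 'heavy'": {"Est.": "A", "Vot.": "B", "L-L.": "B", "Soi.": "A", "Fin.": "A"},
}


def opposes(classes, group):
    """True if `group` shares one root, the rest share another, and the two differ."""
    inside = {classes[v] for v in group}
    outside = {classes[v] for v in classes if v not in group}
    return len(inside) == 1 and len(outside) == 1 and inside != outside


south = {"Est.", "Vot."}
south_with_lower_luga = {"Est.", "Vot.", "L-L."}

south_items = [name for name, cls in items.items() if opposes(cls, south)]
extended_items = [name for name, cls in items.items()
                  if opposes(cls, south_with_lower_luga)]

print(south_items)     # ["#72 'to say'"]
print(extended_items)  # ["#92 'warm'"]
```

An item such as #102 'heavy' matches neither split, because its dividing line runs between Votic and Lower Luga Ingrian on one side and the remaining three varieties on the other.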

Table 2 presents pairwise comparisons of the Swadesh lists. In the upper-right part of the table, the percentage of the common roots is given. In the lower-left part of the table, the number of words that have different roots is indicated. Rare cases where a language has two roots for the same item (e.g. Finnish paljon ~ monta 'many, a lot of' or Lower Luga Ingrian voosI ~ aastaikA) but the second language in the pair only has one of these roots were counted as 0.5 instead of 1.
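The counting rule just described can be sketched in Python. This is a minimal illustration rather than the procedure actually used for Table 2: the function names and the root labels are assumptions, and treating any partial overlap between two root sets as 0.5 is one possible reading of the rule. The toy data uses three Finnish-Estonian items mentioned in Section 3.4.

```python
from typing import Dict, Set

Wordlist = Dict[int, Set[str]]  # item number -> set of root labels


def item_difference(roots_a: Set[str], roots_b: Set[str]) -> float:
    """Score one item: 0 for identical roots, 0.5 for partial overlap, 1 for none."""
    if roots_a == roots_b:
        return 0.0
    if roots_a & roots_b:
        return 0.5
    return 1.0


def count_different(list_a: Wordlist, list_b: Wordlist) -> float:
    """Sum the item scores over the items present in both lists."""
    shared_items = list_a.keys() & list_b.keys()
    return sum(item_difference(list_a[i], list_b[i]) for i in shared_items)


def percent_common(different: float, n_items: int) -> float:
    """Convert a count of differing items into a percentage of common roots."""
    return 100.0 * (n_items - different) / n_items


# Root labels are placeholders for cognate classes, not attested forms.
finnish = {5: {"iso"}, 53: {"paljo", "moni"}, 86: {"tuo"}}
estonian = {5: {"suur"}, 53: {"paljo"}, 86: {"see"}}

print(count_different(finnish, estonian))  # 2.5
print(round(percent_common(27.5, 111)))    # 75
```

The last line applies the conversion to the Estonian-Finnish count from Table 2, under the assumption that all 111 items enter the comparison.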

The closest varieties are Votic and Lower Luga Ingrian, which formally belong to different Finnic branches. In general, the distance between all three minor languages is small. The major languages demonstrate a greater diversity, and the largest distance is between Estonian and Finnish. It can be clearly seen that the distances between the analysed varieties do not obviously correlate with their genetic affiliation. Thus, we may conclude that a lexicostatistical analysis of the minimal depth (i.e. made for closely related languages) should not be seen as demonstrating a linear correlation with the genetic distance. Changes in the core lexicon happen due to different reasons including convergent processes that are not always transparent. In spite of the fact that the analysed Finnic varieties do not have obvious borrowings from each other, it is evident that the three minor varieties located in the compact area in Western Ingria are less diverse than geographically peripheral major languages.
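The two halves of Table 2 can be cross-checked against each other. The sketch below assumes that every pair is compared over all 111 items (an assumption, although the published percentages are consistent with it); the counts and percentages are copied from Table 2, and the variable names are illustrative.

```python
N_ITEMS = 111  # assumption: each pair is compared over the full list

# Lower-left half of Table 2: number of items with different roots.
different_roots = {
    ("Est.", "Vot."): 16, ("Est.", "L-L."): 16,
    ("Est.", "Soi."): 19, ("Est.", "Fin."): 27.5,
    ("Vot.", "L-L."): 3.5, ("Vot.", "Soi."): 7, ("Vot.", "Fin."): 22.5,
    ("L-L.", "Soi."): 4.5, ("L-L.", "Fin."): 20,
    ("Soi.", "Fin."): 19.5,
}

# Upper-right half of Table 2: percentage of common roots.
published_percent = {
    ("Est.", "Vot."): 86, ("Est.", "L-L."): 86,
    ("Est.", "Soi."): 83, ("Est.", "Fin."): 75,
    ("Vot.", "L-L."): 97, ("Vot.", "Soi."): 94, ("Vot.", "Fin."): 80,
    ("L-L.", "Soi."): 96, ("L-L.", "Fin."): 82,
    ("Soi.", "Fin."): 82,
}

# Pairs whose recomputed percentage disagrees with the published one.
mismatches = [
    pair for pair, diff in different_roots.items()
    if round(100 * (N_ITEMS - diff) / N_ITEMS) != published_percent[pair]
]

closest = min(different_roots, key=different_roots.get)
farthest = max(different_roots, key=different_roots.get)

print(mismatches)  # []
print(closest)     # ('Vot.', 'L-L.')
print(farthest)    # ('Est.', 'Fin.')
```

Under this assumption all ten percentages follow from the counts, and the closest and most distant pairs are those named in the text.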

4. Conclusions

The Swadesh lists for five Finnic varieties were compiled following an explicit methodology that makes them transparent and open to discussion.

The difference between minor languages (Votic and two Ingrian dialects) is small: 94% or more of their core lexicon coincides. The major languages (Estonian and Finnish) demonstrate a greater difference both from minor languages (80-86%) and from each other (75%).

There are various reasons why the lexical diversity between languages increases: semantic shifts, the existence of synonymic pairs in the proto-language, new borrowings, and new derivatives, among other reasons.

The lexicostatistic difference between closely related languages does not have a strong correlation with their genetic distance.


We are very grateful to our colleagues and to the native speakers of Finnic languages whom we consulted in the course of our work on the wordlists, in particular, Alevtina Fedotova and Galina Samsonova on Soikkola Ingrian, Nikolai Poder on Lower Luga Ingrian, Zinaida Saveljeva on Votic, Terhi Honkola on Finnish, Partel Lippus and Ellen Niit on Estonian.

We would like to thank the anonymous reviewer and Kirill Reshetnikov for the many valuable comments on the article.

The research of F. Rozhanskiy has been supported by the University of Tartu, grant PHVEE18904.


Appendix

It is obvious that Swadesh lists compiled by different researchers on the basis of different methods cannot be identical. However, a priori the degree of the diversity is not evident. For this reason, we give a short comment on the differences between the Swadesh lists for Estonian and Finnish compiled in the current article and those presented in Tillinger (2014). Tillinger's lists were chosen because they do not give synonyms and return exactly one word for each item (unlike the lists in Hofirkova, Blazek 2012, and Syrjanen, Honkola, Korhonen, Lehtinen, Vesakoski, Wahlberg 2013).

For both Estonian and Finnish, we found four cases when we propose a word different from Tillinger's (2014), see Table 3.

The reasons behind these differences are straightforward: either our variant corresponds better to the context ('bark' and 'earth'), or it was chosen by a consultant as more general and/or more neutral ('big', 'skin', 'bone' and 'snake'). In the case of 'many, a lot of', we were not able to choose a single variant (but monta and moni have the same root). Estonian too 'that' looks more formal and is characteristic of the written language, so see 'this, that' was chosen as the more neutral variant.

In two cases, Tillinger (2014) does not have an exact correspondence to the words from our list. These are the items #12 'burn' (we use a transitive verb, whereas Tillinger lists an intransitive verb) and #107 'thin', which is not mentioned by Tillinger.

Item #92 'warm' does not have an exact correspondence in Tillinger's Swadesh list but can be found in another wordlist (Tillinger 2014 : 183).

We conclude that in spite of the different methods of compiling the Swadesh lists, the differences between the versions do not look dramatic.


Fedor Rozhanskiy

University of Tartu

Institute for Linguistic Studies of the Russian Academy of Sciences


Mikhail Zhivlov

Russian State University for the Humanities

National Research University Higher School of Economics



FEDOR ROZHANSKIY (Tartu--St. Petersburg), MIKHAIL ZHIVLOV (Moscow)

(1) After our article had been submitted to the journal, it became known that the dataset used by Syrjanen, Honkola, Korhonen, Lehtinen, Vesakoski, Wahlberg (2013) is now open for online access. The lexical lists in this dataset do not attempt to solve the synonymy problem: several words are often given for the same definition.

(2) For example, muno 'egg' instead of muna, polottaa 'burn (tr)' instead of polottaa, anna 'give' instead of antaa (anna is the 2Sg imperative form; but the infinitive is given for other verbs in the list), polvi 'knee' instead of polvi, tund[??] 'know' instead of tunta, ju[??]a 'say' instead of jute[??]a, seemee 'seed' instead of seemene, kelten 'yellow' instead of keltein. NB! Here we give Votic forms in the spelling for Kattila and the neighbouring varieties of Votic. Most publications on Votic including Hofirkova, Blazek 2012 follow this system of spelling.

(3) For example, 'green' is rather rohoin than viher as viher means 'unripe'; 'to lie' is lezia but not magata as magata means 'to sleep'; sato is a rare and dialectally restricted word for 'precipitation' and a common word for 'rain' is vihma; the main word for 'this' is kase and se means 'this' or 'that' depending on a context (see comments to this item in the wordlist below); 'to hear' is kuulla but kuullua is 'to be heard; to listen to smb'.

(4) See, for example, [phrase omitted] 1966 : 146, 161: "As we can see the largest number of differences is between the Lower Luga dialect and three other dialects", "At present, the problem of origin of the Lower Luga dialect cannot be finally solved".

(5) The following dictionaries were used: Tsvetkov 1995 and VKS 2013 for Votic, Nirvi 1971 for Ingrian, EVS for Estonian, [phrase omitted] and [phrase omitted] for Finnish.

(6) Our corpora were collected during fieldtrips organized by Fedor Rozhanskiy and Elena Markus in 2003-2018. The Soikkola Ingrian corpus contains about 650 hours of recordings; the Votic and Lower Luga Ingrian corpora contain about 250 hours of recordings each.

(7) We mean that Votic has never been taught in school or had a written standard that was regularly used by native speakers to communicate and to read printed materials. However, besides various texts transcribed by linguists as speech samples there were a number of texts in Votic published for native speakers or other people studying Votic (e.g. [phrase omitted] 2003; Heinsoo 2015; 2018).

(8) Kauniz did not preserve the original meaning 'beautiful' in Luuditsa Votic, but this meaning was observed in some Central Votic varieties (VKS 408).

(9) The spelling of this word in Votic and Ingrian is approximate as there is significant variation in the length of this vowel. We spell it with long ee.

(10) The examples given in the previous paragraph are not from our Swadesh lists.

(11) Of course, in such cases one of the quasi-synonyms must have been "basic" in Proto-Finnic. Additional research is needed to determine the precise semantic difference between such quasi-synonyms at the Proto-Finnic level.

(12) In cases where several languages have the same root, we give examples only from one of these languages. In Section 2 one can find words with this root in other varieties.

(13) By "semantic shift" we mean not only a proper change of meaning but also finer modifications, e.g. stylistic changes.

(14) It is unlikely that kaarme is a new borrowing, because the Northern Finnic languages have not had contact with the Baltic languages since the Proto-Finnic period.

(15) Note that a word of Finnic origin that did not change its meaning and was not a derivative was counted only in group "a" and only in cases where this word was not predominant for most of the varieties under discussion. In general, this table analyses only the words where these varieties demonstrate diversity while changes (e.g. semantic shifts) that happened in all five varieties are not studied here.

Table 1

Causes of lexical innovations

                    Est.  Vot.     L-L.     Soi.     Fin.      Overall

a. Synonyms         4     1        1        2        8 + (1)   16 + (1)
b. Semantic shift   6     3        1        1        5         16
c. New derivatives  3     1        1 + (1)  1 + (1)  1         7 + (2)
d. New borrowings   3     1        1        1 + (1)  -         6 + (1)
e. Other            -     (1)      -        1        2         3 + (1)
Overall             16    6 + (1)  4 + (1)  6 + (2)  16 + (1)

Table 2

Lexicostatistical distances

      Est.  Vot.  L-L.  Soi.  Fin.

Est.        86%   86%   83%   75%
Vot.  16          97%   94%   80%
L-L.  16    3.5         96%   82%
Soi.  19    7     4.5         82%
Fin.  27.5  22.5  20    19.5

Table 3

Differences in versions of Swadesh lists for Estonian and Finnish

  N  Meaning         Tillinger  Rozhanskiy, Zhivlov

Finnish
  3  bark            kuori      kaarna
  5  big, large      suuri      iso
 53  many, a lot of  moni       paljon ~ monta
 76  skin            nahka      iho

Estonian
 10  bone            kont       luu
 21  earth           maa        muld
 86  that            too        see
106  snake           madu       uss

[Please note: Some non-Latin characters were omitted from this article].
