On some hitherto unidentified Mari items in the "vocabularia comparativa" of P. S. Pallas.

One of the earliest major appearances of the Mari language in print is the "Linguarum Totius Orbis Vocabularia comparativa" edited by Peter Simon Pallas and published in two volumes in St. Petersburg in 1787 and 1789. This comparative wordlist consists of translations of 273 Russian headwords into a large number of languages. As with other languages, the Mari portion of Pallas's ambitious work was drawn from a number of manuscript wordlists compiled at the order of Catherine the Great.

The Mari entries in Pallas were extensively commented upon by Sebeok (1960), who described the origin and context of Pallas's wordlist, provided an item-by-item analysis, summarized the derivational morphology and stress patterns, noted common sound correspondences between Mari dialectal variants, and provided an English index. As a lexical reference, Sebeok relied mainly on Móric Szilasi's "Cseremisz szótár" of 1901. The absence of certain items in Szilasi led Sebeok to leave several items from Pallas uncommented, denoting them only with the designation "unattested".

Due to these lacunae and some errors in Sebeok's analysis, Alhoniemi (1979) took up the material again, producing his own extensive German-language commentary that analyzes each of the items, comments on the orthography used, and traces the inflectional endings attested in the wordlist data. One distinct concern of Alhoniemi's commentary is that he aimed to determine the specific phonological traits of the items in Pallas.

Alhoniemi's commentary corrects most of Sebeok's errors and manages to identify a larger number of Mari words. Besides simply drawing on more lexical references than his predecessor, Alhoniemi identifies items that Sebeok missed by taking into account the possibility of misunderstandings between the Russian-speaking wordlist compiler and the Mari informant. Thus, he recognizes e.g. Pallas's [TEXT NOT REPRODUCIBLE IN ASCII] 'муха' as MariE lozas 'flour', cf. Russian мука 'flour', муха 'fly' (an understandable mistake as /x/ is not a phoneme in Meadow Mari). Nevertheless, although Alhoniemi's commentary is an improvement on Sebeok's, he too was unable to identify certain items.

In the decades since, new information has come to light that calls for taking a fresh look at Pallas's wordlist in order to fill in some of the gaps of previous studies. New Mari lexical resources have appeared, especially the "Tscheremissisches Wörterbuch" edited by Moisio and Saarinen (2008, henceforth referred to as TschWb) and the Mari-English Dictionary. (1) Furthermore, Сергеев (2000; 2002) has examined many 18th-century manuscript wordlists held in Russian archives, including several commissioned for Pallas's project. He has set out the principles by which Mari was usually represented in Russian orthography by early wordlist compilers, noted cases of erroneous transmission, and even identified the geographical provenance of certain Mari word forms that would also appear in Pallas's printed book.

Thus, the present study aims to evaluate hitherto unrecognized items in Pallas in the light of newer references, bringing us a more complete understanding of what Pallas tells us about the Mari language in the late 18th century. Although the two volumes of the "Linguarum Totius Orbis Vocabularia comparativa" have been republished in recent decades (Pallas 1977, a photographic reproduction of the copy at the Staats- und Universitätsbibliothek Hamburg), for this study I have used a higher-quality scan derived from the copy at the Taylor Institution Library, Oxford, and made freely available on the internet. (2) Unless otherwise mentioned, all Mari dialectal material is derived from TschWb and material from the MariE literary language is derived from the Mari-English Dictionary.

In the headings below, I give the Mari item from Pallas's "Vocabularia comparativa", the Russian headword under which it is found, and the number of this headword.


Sebeok marks this item as unclear but notes Szilasi's jorla 'poor'. The Cyrillic representation speaks against a round vowel, however. Alhoniemi (1979 : 212) is unable to identify it with any specific Mari word and only compares it to MariE [TEXT NOT REPRODUCIBLE IN ASCII] 'Masern'.

If one accepts the possibility of a misunderstanding between informant and compiler such that a verb was elicited instead of a noun, Pallas's item can be identified as the 3 sg. pres. form of MariE [TEXT NOT REPRODUCIBLE IN ASCII] '(Hund) knurren, (Mensch) wütend werden, murren'. Although Cyrillic и does not typically denote the vowel [??] (Сергеев 2002 : 102ff.), the Cyrillic representation does match a form of this word attested from the dialect of Bol'soj Kil'mez in which first-syllable i appears instead: irla.


Sebeok (1960 : 332) correctly identifies the second element as Mari ssuj 'head' but only speculates that the first element is su 'neck', which is not acceptable. Alhoniemi (1979 : 227) only marks the item with a question mark.

The first sequence yu- should be interpreted as an attempt to represent the front rounded high vowel ü in Mari. The placement of the stress tells us that the vowel denoted by Cyrillic a is likely the reduced vowel ə. Consequently, Pallas's item can be identified with MariE lit. [TEXT NOT REPRODUCIBLE IN ASCII] 'sotnik (lieutenant in Cossack army)'. Formed from [TEXT NOT REPRODUCIBLE IN ASCII] '100' and ssuj 'head', this Mari compound is a partial calque of Russian сотник < сто '100' + derivational suffix -ник. That this word known today from the MariE literary language can indeed be traced back to 18th-century Mari is confirmed by the entry [TEXT NOT REPRODUCIBLE IN ASCII] ' [TEXT NOT REPRODUCIBLE IN ASCII]' in an 18th-century manuscript forming part of Damaskin's dictionary that has been examined by [TEXT NOT REPRODUCIBLE IN ASCII] (2000 : 98; 2002 : 42). The difference in meaning between Russian [TEXT NOT REPRODUCIBLE IN ASCII] and the Mari word can be explained as a misunderstanding where the wordlist compiler's request for a noun 'power, authority' was answered with a term for a particular person holding power or authority.


Alhoniemi (1979 : 210), who translates the Russian headword as 'Gestalt', was unable to identify the Mari word. It is unclear why he did not accept the interpretation of Sebeok (1960 : 299), who identified this as a derivation cum-ast-a 'growth (lit. he stretches)'. While a derivation containing -[TEXT NOT REPRODUCIBLE IN ASCII]- is not known from other sources, evidence for MariE camem, W tsamem in TschWb supports Sebeok's interpretation. The attested meanings 'spannen (Kleidungsstück, Leder), dehnen, vergrößern, weiter machen (Stiefel, Hut)' match the Russian headword. Furthermore, the Cyrillic letter io is a common representation of the vowel [??], which is found in MariE (Upsa) [TEXT NOT REPRODUCIBLE IN ASCII], and the vowel a as found in MariW tsamem (see Сергеев 2002 : 102-110).


Sebeok (1960 : 320) accepts this word as it is, but Alhoniemi (1979 : 226) only lists it with a question mark. This can be identified as a deverbal noun in -[TEXT NOT REPRODUCIBLE IN ASCII] from MariE [TEXT NOT REPRODUCIBLE IN ASCII] W Nw sata '(aus)keimen, hervorkeimen (Getreide)', probably from the Western or Northwestern Mari areas due to the use of Cyrillic letters denoting front vowels and the final soft sign. Such a derivational form is attested in MariE lit. [TEXT NOT REPRODUCIBLE IN ASCII] 'shoot, sprout'.


Sebeok (1960 : 299) transliterates Pallas's Cyrillic item as cumratarmas and interprets it as consisting of two combined words, of which the first reflects MariE [TEXT NOT REPRODUCIBLE IN ASCII] 'rund' and the second "perhaps a root tertes- 'round'".

Alhoniemi (1979 : 210) maintains the same interpretation of this item as a combination of two words, and he also identifies the first component with MariE cumaras, but he provides no explanation for the second part. However, Alhoniemi misreads the item from Pallas, mistakenly citing it as [TEXT NOT REPRODUCIBLE IN ASCII]. Alhoniemi evidently had access to only a low-resolution reproduction of Pallas's work, but from the high-resolution scan consulted for the present study, it is clear that the word is actually [TEXT NOT REPRODUCIBLE IN ASCII] as in Sebeok's commentary, and Alhoniemi was led astray by the similar letter forms for [??] and T in Pallas's typeface.

After establishing that Pallas's printed book presents a form [TEXT NOT REPRODUCIBLE IN ASCII], we can further suppose a mistake during the typesetting phase by which a manuscript's T was mistakenly replaced with M. Thus, we can identify this item with MariE tartas 'Ball, kugelförmig', or possibly its MariW form tartas if the final soft sign represents frontness of the second-syllable vowel.


Sebeok (1960 : 315) interpreted this as a compound, seeing the first part as MariE p[??]l ~ W pal 'sky', but only conjecturing that the second part is related to MariE [TEXT NOT REPRODUCIBLE IN ASCII] 'dirt'. Alhoniemi (1979 : 222) only marks it with a question mark.

This item can be identified with MariE pulam[??]r 'Alarm, Störung, Lärm, Trubel, Schrecken', attested from the Morki region in TschWb. This is not a compound involving 'sky' at all, but rather a borrowing from Tatar (cf. Tat. lit. [TEXT NOT REPRODUCIBLE IN ASCII]). The word subsequently passed into the Meadow Mari literary language as [TEXT NOT REPRODUCIBLE IN ASCII] 'agitation, rebellion, unrest, disorder, revolt, turmoil, commotion, stir, panic, etc.'.


Alhoniemi (1979 : 224) only marks this with a question mark, but Sebeok (1960 : 317) was already on the right track when he wrote for this word, translated as 'steam' in Russian, "probably a misunderstanding, for the word should mean 'like that'". Sebeok did not mention the specific Mari form he had in mind, but we are clearly dealing here with MariE [TEXT NOT REPRODUCIBLE IN ASCII]. Of the five forms of this word in TschWb, all have stress marked on the initial syllable. For two dialects, stress is denoted on both the first and last syllables, but for the other three dialects (Birsk, Bol'soj Kil'mez, Upsa) the stress falls solely on the first syllable, and the final vowel in the Birsk and Upsa dialects is partly reduced. The Cyrillic letter ? is a common means of denoting a weak front vowel in 18th-century wordlists (see [TEXT NOT REPRODUCIBLE IN ASCII] 2002 : 106-108).


Sebeok (1960 : 311, 312) proposes that this is a variant [TEXT NOT REPRODUCIBLE IN ASCII] of MariE meyge, W manka, but no such variant is attested elsewhere. Alhoniemi (1979 : 220) marks the word with a question mark but also compares it to MariE meyge.

Because of the overwhelming phonological similarity, this item should be identified with MariE munij 'Kröte', namely the form munej attested from the dialect of Bol'soj Kil'mez. As the two words meyge and munej would have followed closely together in an alphabetically arranged manuscript wordlist, we can assume that at some point in the preparation of the material for press, the Russian translation of Mari 'stake' was mistakenly associated with the Mari word for 'toad'.


These items should be examined together. With regard to ????? '????', Sebeok (1960 : 309) marks this as unattested, but Alhoniemi (1979 : 218) identifies the word as MariE (Birsk) kunzo 'Last'. The word is not found in TschWb, but it is present as kund'zo in Paasonen's dictionary of Eastern Mari, based on the Birsk dialect.

The other item, [TEXT NOT REPRODUCIBLE IN ASCII], is similarly marked as unattested by Sebeok (1960 : 308), while Alhoniemi (1979 : 218) only marks it with a question mark. Sinor (1961 : 172-173) compared this item in Pallas to Mongolian gunje 'radeau, canot' and similar items in the Tungusic languages, but as Middle Mongolian loans in Mari were typically mediated through Turkic (see e.g. Róna-Tas 1982), such a connection remains fanciful without Tatar or Chuvash evidence.

Сергеев (2002 : 32-33, 177) has mentioned that a form [TEXT NOT REPRODUCIBLE IN ASCII] is attested in a manuscript wordlist compiled for Pallas that shows clear traits of the Malmyz dialect. The item spelled [TEXT NOT REPRODUCIBLE IN ASCII] in Pallas's printed book may be seen as another representation of the same word: the final A can be interpreted as the reduced vowel [??] but also marking palatalization of the preceding consonant, i.e. [TEXT NOT REPRODUCIBLE IN ASCII]. The difference in meaning between 'cart' and 'boat' is understandable, as besides the simple fact that a boat was also a common means of conveying a load in this region, the use of one and the same word for 'cart' and 'boat' is a feature of the neighboring and strongly influential Tatar language, namely Tat. [TEXT NOT REPRODUCIBLE IN ASCII].


Sebeok (1960 : 298) breaks this word down into the components [TEXT NOT REPRODUCIBLE IN ASCII], though he does not comment on the root cipt- that he sees at the heart of the word. Alhoniemi (1979 : 210), on the other hand, sees here a verb root plus the infinitive suffix -as, but he does not identify the root.

The nomen actionis suffix -mas has been identified by both Sebeok and Alhoniemi in other items from Pallas, and the marked productivity of this suffix in 18th-century Mari records in general has been noted by [TEXT NOT REPRODUCIBLE IN ASCII] (1975 : 228) and Сергеев (2000 : 84; 2002 : 83, 141). This item in Pallas can be identified as such a derivational form stemming from a verb root [TEXT NOT REPRODUCIBLE IN ASCII]-. Such a verb with this meaning is known from the MariE literary language: [TEXT NOT REPRODUCIBLE IN ASCII] 'to cover with a mat, to cover with a bast mat; (figuratively) to attack, to fall upon, to charge'. The Mari-English Dictionary views both senses as ultimately stemming from MariE lit. [TEXT NOT REPRODUCIBLE IN ASCII] 'mat, bast mat, matting'.

TschWb attests Cdpta 'Bastmatte, locker gewebte Pferdedecke aus Bast' from several MariE dialects (Birsk, Sernur, Morki) and the first-syllable vowel is denoted, presumably under the influence of the initial c-, as fronted. Consequently, the use of Cyrillic ? to represent the first-syllable vowel is to be expected.


Sebeok (1960 : 315) mistook the Cyrillic letter N for b and thus incorrectly reads this as [TEXT NOT REPRODUCIBLE IN ASCII]. He then marks this item as unattested. Alhoniemi (1979 : 223) reads Pallas's Cyrillic correctly as [TEXT NOT REPRODUCIBLE IN ASCII], but he only lists the item with a question mark.

The Mari-English Dictionary offers the verb [TEXT NOT REPRODUCIBLE IN ASCII] 'to become frail, to become sickly, to grow weak; to be depressed, to be dispirited; to lose interest, to grow cold towards', which in form and meaning closely resembles another MariE lit. word [TEXT NOT REPRODUCIBLE IN ASCII] 'to come to ruin, to grow poor'. It was presumably the latter word in the MariE literary language that led the compilers of TschWb to list a headword pulnem, but the sole attested form (from the Volga dialect of MariE) that TschWb offers under this headword is palnem. Pallas's item can be explained as a deverbal abstract noun in -as from palnem.

The semantic link between Pallas's 'victory' and the meaning 'become frail' is highlighted by the existence of a derived transitive verb MariE lit. [TEXT NOT REPRODUCIBLE IN ASCII] 'to stifle, to overwhelm'. That is, the victory of one side of a conflict is the weakening of the other side. As confirmation of this link, the same root is seen in [TEXT NOT REPRODUCIBLE IN ASCII] in a late 18th-century manuscript wordlist kept in the state archives of the Kirov district (see Сергеев 2000 : 39-42, 141).


Sebeok (1960 : 321) marks this as unattested and Alhoniemi (1979 : 227) provides no comment for it in his list. Located far from any ocean, the Mari would presumably not have had their own word for the marine animal. Indeed, Сергеев (2000 : 77; 2002 : 76) notes that an 18th-century Mari wordlist collected by one Mendier Bekdorin follows Russian кит with " [TEXT NOT REPRODUCIBLE IN ASCII]". In another 18th-century wordlist collected for Pallas (Сергеев 2000 : 144) the Mari informant appears to have answered with the Russian word: " [TEXT NOT REPRODUCIBLE IN ASCII]".

As the notions 'whale' and 'sea monster' were very closely connected in earlier eras, we may suppose that the word which the Mari informant provided was MariE sert NW sert 'böser Geist'. This supernatural creature was identified with bodies of water and could catch those who had gone to the water to swim or fish (Sebeok, Ingemann 1956).

Though the use of the Cyrillic letter ? in Pallas's representation [TEXT NOT REPRODUCIBLE IN ASCII] seems to suggest a back vowel (see Сергеев 2002 : 106), the MariNw form with its front vowel remains a possible source. In his analysis of another 18th-century attestation of Mari, namely the poem in honor of Catherine the Great, Veenker (1981) notes that the Cyrillic letter y can denote MariNw [??] after [??]; the final soft sign in Pallas may also suggest front articulation.


Sebeok (1960 : 298) marks this as unattested and Alhoniemi (1979 : 208) only marks it with a question mark. Because Mari does not have initial b-, we would have to suppose that the Cyrillic letter [??] represents [beta], as it in fact does in several other items in Pallas's book, e.g. [TEXT NOT REPRODUCIBLE IN ASCII] id. (185), [TEXT NOT REPRODUCIBLE IN ASCII]. (101). After a labial consonant, Cyrillic A typically represents the front low vowel a. No Mari word ssaj is known, however.

If we examine the purportedly Mari item ??? in the context of the full entry in Pallas, we see that it is nearly identical to the Udmurt item [TEXT NOT REPRODUCIBLE IN ASCII] two lines below it. [TEXT NOT REPRODUCIBLE IN ASCII] (1966) considers the purportedly Udmurt [TEXT NOT REPRODUCIBLE IN ASCII] to represent Udm. vaj 'give! (imperative)', mistakenly reflected among words for 'in'. One can view the Mari entry as simply an erroneous duplication of the Udmurt entry. This would not be the only such mix-up in Pallas; Alhoniemi (1979 : 217) has pointed to how Mordvin koda 'how' (represented as [TEXT NOT REPRODUCIBLE IN ASCII]) was mistakenly placed under Mari in Pallas's item 270 '????'.


Sebeok (1960 : 328) marks this as unattested, while Alhoniemi (1979 : 209) only marks it with a question mark. The item as it appears in Pallas (listed after [TEXT NOT REPRODUCIBLE IN ASCII] corresponding to MariE W ssara 'danach, dann, später') is best regarded as a misprint for MariE ssarase 'letztere(r/s)'.


Sebeok (1960 : 325) marked this word as unattested, while Alhoniemi (1979 : 230) only lists it with a question mark.

While Pallas's book spells the item with a final soft sign, suggesting a palatalized sibilant, Сергеев (2002 : 35, 183) cites the word as [TEXT NOT REPRODUCIBLE IN ASCII], with a final hard sign pointing to a non-palatalized s, from one of Pallas's source manuscripts that represents a dialect transitional between the Krasnoufimsk and Kungursk varieties of Eastern Mari. The final soft sign in Pallas's book can therefore be regarded as a misprint.

Due to the item's Eastern Mari provenance and the final hard sign that Сергеев established, this item can be identified with umsez '[TEXT NOT REPRODUCIBLE IN ASCII]' found in the dictionary compiled by [TEXT NOT REPRODUCIBLE IN ASCII] (2011) on Mari dialects of Tatarstan and Udmurtia. This Mari word consists of Russian ум 'mind' followed by the caritive suffix -[TEXT NOT REPRODUCIBLE IN ASCII] of Tatar origin. It is thus a partial calque of Russian [TEXT NOT REPRODUCIBLE IN ASCII]. We can suppose a misunderstanding between the wordlist compiler and the Mari informant such that the compiler's request for an item 'without' was answered with a Mari word whose Russian translation is a compound containing [TEXT NOT REPRODUCIBLE IN ASCII]- 'without'.

Items that remain unclear

The following items in Pallas remain unexplained, but some comments can nevertheless be made about certain items:

[TEXT NOT REPRODUCIBLE IN ASCII] (10), Сергеев (2002 : 35, 181) notes that this word is present in a manuscript wordlist, reflecting the Krasnoufimsk dialect of Eastern Mari, that was compiled for Pallas, but he only assumes this is a Mari word that has fallen out of use;


[TEXT NOT REPRODUCIBLE IN ASCII] (143), both Sebeok (1960 : 297) and Alhoniemi (1979 : 208) identify the first component as MariE araka 'Branntwein, Wein', but the second part remains unclear and Sebeok only suggests a metathesis of MariE mor 'Erdbeere';

[TEXT NOT REPRODUCIBLE IN ASCII] (145), Сергеев (2000 : 27-28, 98; 2002 : 34-35, 131, 184) notes that this item is found in a manuscript wordlist reflecting the Malmyz dialect and containing a large number of errors, but he only assumes that this is a Mari word that has fallen out of use;

[TEXT NOT REPRODUCIBLE IN ASCII] (147); [TEXT NOT REPRODUCIBLE IN ASCII] (182); [TEXT NOT REPRODUCIBLE IN ASCII] (198); [TEXT NOT REPRODUCIBLE IN ASCII] (219); [TEXT NOT REPRODUCIBLE IN ASCII] (220); [TEXT NOT REPRODUCIBLE IN ASCII] (236), if this is not the typesetter's misreading of MariE toltas 'bring' (not in [TEXT NOT REPRODUCIBLE IN ASCII] but known from MariE lit. [TEXT NOT REPRODUCIBLE IN ASCII] 'to convey, to transport' and attested in the 1926 dictionary of [TEXT NOT REPRODUCIBLE IN ASCII] under [TEXT NOT REPRODUCIBLE IN ASCII]), as the two sequences of letters could very easily be confused with one another when written in cursive Cyrillic script.


Thus a deeper understanding of the Mari data in Pallas, going beyond Sebeok's and Alhoniemi's respective commentaries, can be reached by using the richer array of lexicographical resources which were not available to those earlier scholars, as well as by taking the peculiarities of Pallas's typeface and 18th-century manuscript handwriting into account. Where the Cyrillic representation of the item in question closely matches the phonology of an attested Mari form even when the meaning differs, as in the case of [TEXT NOT REPRODUCIBLE IN ASCII], [TEXT NOT REPRODUCIBLE IN ASCII], and [TEXT NOT REPRODUCIBLE IN ASCII], then we should assume a misunderstanding between wordlist compiler and informant.

If, as Sebeok notes in his commentary, Pallas's ambitious work has drawn criticism since virtually the moment it was published for its overly hasty execution and the fact that much in it remains unusable, it nevertheless remains impressive that the overwhelming majority of Mari items can be recognized from more recent references, and the work provides vital evidence for reconstructing 18th-century Mari. Furthermore, even if errors in compilation or transmission might render it impossible to fully explain all items, further breakthroughs may be possible through e.g. the full publication of the manuscripts languishing in Russian archives. The present article has offered a contribution to the study of the Mari words in Pallas, but the material continues to merit scrutiny.


Christopher Culver



MariE -- East Mari; MariNw -- Northwest Mari; MariW -- West Mari; Tat -- Tatar; Udm -- Udmurt.


CHRISTOPHER CULVER (Helsinki-Cluj-Napoca/Kolozsvar)

(1) The Mari-English Dictionary, located at, is an electronic dictionary of literary Meadow Mari, developed by the Department of Finno-Ugric Studies at the University of Vienna, which incorporates a number of previously published lexical sources as well as thousands of additional headwords.

(2) The two volumes can be downloaded at File:Linguarum_totius_orbis_1.pdf and Linguarum_totius_orbis_2.pdf respectively.
