Evidentiality in dialects of Khanty.

1. Introduction

In the last two decades, evidentiality as a semantic and grammatical category has received increasing attention from researchers (Bybee 1985; Evidentiality 1986; Willett 1988; Aikhenvald 2004). Although evidentiality, a linguistic strategy used to mark the source of information about an event, had long been an object of study in linguistics, it was not until the 1980s that researchers began to expand the number of languages studied and to bring existing studies on different languages into alignment with one another. The question of evidentiality has been debated at international conferences, resulting in a number of anthologies and journal issues on the subject (e.g., Evidentials 2000; Studies in Evidentiality 2003; Journal of Pragmatics 33/3; LU XXXVIII 3).

Opinions diverge on the exact definition of evidentiality and the semantic category to which it belongs. According to some researchers, it belongs to epistemic modality (Bybee 1985) or the related concept of propositional modality (Palmer 2001). Others doubt the existence of evidentiality as an independent semantic category (Bybee, Perkins, Pagliuca 1994; Givon 2001: 326-329; de Haan 2005). Evidentiality can be coded in a variety of ways across the world's languages, such as through the use of mood markers, special verb forms, particles, modifiers, adverbials, or independent phrases. The present analysis examines the occurrence of the indirect evidence verbal paradigm in Khanty and, to a lesser degree, Mansi. Although a diverse range of terms had been used to describe this mood across different languages, recent shifts in typology have resulted in a standardization of terminology. Evidentiality is now almost exclusively the term used for the grammatical marking of information from an indirect source.

In this article, I briefly describe the nature of evidentiality as a grammatical category in Finno-Ugric languages. I then introduce the use of evidentiality in Khanty and Mansi, based primarily on the work of Nikolaeva, Kaksin, and Skribnik. The primary focus of the analysis is on historical linguistics, as I investigate the origins of grammaticalized evidentiality in the Ob-Ugric languages through a comparison of the grammar of spoken language and of the language used in songs. Two possible paths to grammaticalization are discussed. The eastern Khanty dialect Surgut, which uses a different strategy to express indirect evidentiality, plays an important role in the analysis.

2. Coding of evidentiality in the Uralic languages

Uralic languages are spoken across a broad area of Northern Eurasia, stretching from Scandinavia to the Yenisei River in Siberia. Within this large expanse, three separate areas can be identified in which evidentiality is grammaticalized in verb forms: the Baltic region (Estonian, Livonian), the Volga-Kama region (Komi, Udmurt, Mari), and northwestern Siberia (Northern Khanty and Mansi, Nenets, Enets, Selkup). Of these ten languages, the World Atlas of Language Structures identifies only three in its chapters on evidentiality (de Haan 2005a; 2005b). This can be attributed to the fact that only these Uralic languages have been included in linguistic discussion of evidentiality in the past twenty years (Perrot 1996; Leinonen 2000; Kehayov 2002; Metslang, Pajusalu 2002), although the "non-eyewitness mood" had long been documented in Finno-Ugric languages (Haarmann 1970). In the introduction to her article, Marja Leinonen lists all the Finno-Ugric languages that use grammatical marking to indicate an indirect evidential source of information (2000: 419-420).

Interest in evidentiality has grown in the literature within functional typology, adding new perspectives to traditional descriptive grammar. Similar features of Estonian and Livonian dialects have been recently documented with a functional typological approach (Erelt 2002; Kehayov, Metslang, Pajusalu 2012). More recently, studies of Samoyedic languages--Nenets ([TEXT NOT REPRODUCIBLE IN ASCII] 2002; Jalava 2008), Enets (Kunnap 2002), and Selkup ([TEXT NOT REPRODUCIBLE IN ASCII] 2002)--have built on the studies discussed in the introduction.

In the aforementioned languages, indirect acquisition of information is marked by a verb form that developed from a participle or other nonfinite. Cases in which periphrasis is used also include a nonfinite verb (Bereczki 1983: 219).

The development from nonfinite to finite is a well-known grammatical process in languages around the world. Evidentiality with nonfinite origins can be found not only in the Finno-Ugric languages but in the Turkic, Tungusic, and Baltic languages as well. While all languages can be understood to have the theoretical possibility of expressing evidentiality through grammatical means, not all do so in reality. Grammaticalization can occur as an internal process, but external factors can also play a role, such as the influence of neighboring languages. In the Baltic region, the Baltic (Latvian, Lithuanian) and Baltic-Finnic languages influenced each other's development. In the Volga-Kama region, the Turkic languages have had the most influence on Finno-Ugric languages. In northwestern Siberia, influences may include Paleo-Siberian (Ket, Yukaghir) or Tungusic languages. In the case of the Ob-Ugric languages, the influence of Komi should also be considered. Furthermore, Siberian Tatar may have influenced the Ob-Ugric languages and Selkup.

3. Evidentiality in the Ob-Ugric languages

Although the grammatical marking of evidentiality can be found in the northern dialects of Khanty and Mansi, no such category appeared in the grammars written about these languages until the 1980s, nor was the phenomenon described by another name. Reasons for this gap are discussed in Section 4. The feature was first observed by linguists in the Novosibirsk School. The primary research focus of the Soviet (later Russian) Academy of Sciences was the Turkic languages of Siberia, and study of the Ob-Ugric languages occurred as an extension of this project. Even specialists of the Turkic languages observed that in northern Khanty "non-eyewitness action" ([TEXT NOT REPRODUCIBLE IN ASCII]) could be described. By the late 1980s, evidentiality had been included in a classification of verbal moods: realis and irrealis, with the indicative and non-eyewitness moods falling under the former and the imperative and conditional under the latter ([TEXT NOT REPRODUCIBLE IN ASCII] 1988: 102-118; [TEXT NOT REPRODUCIBLE IN ASCII] 1989: 7-18). It is worth noting that within the realis category the indicative and evidential are not opposites: the Khanty indicative does not code any value of evidentiality. Skribnik and Janda (2012) proposed a similar system for Mansi; it differs in that the realis category is characterized by various tenses and the irrealis by the lack of tense.

In various anthologies summarizing his work of the last twenty years, Andrej Kaksin devotes a significant amount of attention to evidentiality ([TEXT NOT REPRODUCIBLE IN ASCII] 2008: 112-290). The subject was brought into linguistic discourse on the international level by Irina Nikolaeva. In her first summary of the Obdorsk dialect, she named the mood latentive ([TEXT NOT REPRODUCIBLE IN ASCII] 1995: 126-132); she switched to the term evidential several years later. In a thorough discussion of the subject (Nikolaeva 1999a: 132), she proposed the following schema to illustrate Khanty verb conjugation:

                    Indicative          Evidential

Active    Present   -l- + Px            -t- + Px
          Past      -s- + Px            -m- + Px
Passive   Present   -l- + -a(j) + Px    -ti
          Past      -s- + -a(j) + Px    -(e)m

Markers of the indicative mood -l- and -s- were originally tense markers, while the morphemes -t- and -m- were originally markers of present and past participles. Person marking differs accordingly. In her study, Nikolaeva uses the abbreviation Px for all personal agreement markers, but the markers are not the same in the two paradigms. In the indicative mood, verbal person marking (Vx) is used, whereas in the evidential mood, agent marking (PPx) is used, which more closely resembles personal possessive marking.
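
The schema and the remarks on person marking can be restated as a small lookup table. The following Python sketch is only an illustration and is not part of Nikolaeva's analysis: the names KHANTY_PARADIGM, template, and takes_person_marking are hypothetical, the rows containing -a(j)- are read as passive, and Px is replaced by Vx or PPx according to the distinction drawn in the preceding paragraph.

```python
# Illustrative only: the Obdorsk Khanty schema restated as a lookup table.
# KHANTY_PARADIGM, template and takes_person_marking are hypothetical names.
# Assumption: rows containing -a(j)- are passive; the other rows are active.
# Vx = verbal person marking, PPx = agent marking on a nonfinite.
KHANTY_PARADIGM = {
    ("indicative", "active", "present"): "-l- + Vx",
    ("indicative", "active", "past"): "-s- + Vx",
    ("indicative", "passive", "present"): "-l- + -a(j) + Vx",
    ("indicative", "passive", "past"): "-s- + -a(j) + Vx",
    ("evidential", "active", "present"): "-t- + PPx",
    ("evidential", "active", "past"): "-m- + PPx",
    ("evidential", "passive", "present"): "-ti",
    ("evidential", "passive", "past"): "-(e)m",
}


def template(mood, voice, tense):
    """Return the marker template for one cell of the schema."""
    return KHANTY_PARADIGM[(mood, voice, tense)]


def takes_person_marking(mood, voice, tense):
    """True if the cell's template ends in a person-marking slot (Vx or PPx)."""
    return template(mood, voice, tense).endswith(("Vx", "PPx"))


print(template("evidential", "active", "past"))                # -m- + PPx
print(takes_person_marking("evidential", "passive", "past"))   # False
```

The second call reflects the point that the passive evidential forms in the schema carry no person marking.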

With respect to semantics, Nikolaeva divides the evidential into four different categories (1999a; 1999b: 88-94). Naturally, there is often overlap, and the categories cannot always be clearly distinguished. The following examples are from the Obdorsk dialect.

3.1. Use for hearsay

Obdorsk Khanty requires use of the evidential if the main sentence contains a verbum dicendi:

(1) luw law-es jilep xap wer-m-al

he say-PST.3SG new boat make-EV.PST-3SG

'He said that he made a new boat' (Nikolaeva 1999a: 133)

Use of a verb here in the indicative (wer-es, make-PST.3SG) would produce an ungrammatical sentence.

3.2. Resultative use

Resultative action verbs often appear in evidential mood. The speaker does not witness the process, only its result.

(2) ma kese-m xarnajet-m-al

I knife-1SG be.ruined-EV.PST-3SG

'My knife is ruined' (I can see that there is rust on it) (Nikolaeva 1999a: 142)

3.3. Inferential use

The evidential mood may be used when the speaker infers the occurrence of an event based on personal observations or other signs.

(3) juwan joXet-m-al

John come-EV.PST-3SG

'John has come' (I can see his coat on the rack or his canoe on the shore) (Nikolaeva 1999a: 137)

3.4. Mirative use

The evidential mood can be used to mark events that have taken the speaker by surprise, even though such cases do involve a direct source of information. The surprise can be attributed to the fact that the speaker was not part of the process resulting in the event, as in the following example:

(4) nawrem-l-al lawili-t-el

child-PL-3SG speak-EV.PRS-3PL

'(It turns out) that her children (can) speak' (Nikolaeva 1999a: 147)

Sentence (2) can also express surprise, if, for example, the speaker notices the rust on the knife just as he or she plans to use it. It is possible that intonation is used to distinguish sentences of ambiguous meaning, but unfortunately there is no audio material available to support this hypothesis. The relationship between evidentiality and mirativity is also the subject of a lively debate (Lazard 1999; 2001; Dendale, Tasmowski 2001; DeLancey 2001; Plungian 2001; Kugler 2003). In the case of the Ob-Ugric languages, there is no need to settle the debate, since hearsay, directly acquired information, and unexpected events are all marked with the same mood.

In Mansi, the use of evidentiality is limited to describing unexpected events, according to Elena Skribnik ([TEXT NOT REPRODUCIBLE IN ASCII] 1998; Skribnik 1999). In a joint lecture with Gwen Janda, she claimed that the similar verb forms in Mansi, also originating from nonfinites, do not actually have evidential meaning, noting that their primary meaning is mirative, with evidential use secondary (Skribnik, Janda 2012).

(5) nasati, sana-y-e-asa-y-e tiy joXti-m-iy

behold mother-DU-3SG father-DU-3SG here come-EV.PST-3DU

'Behold, mother and father have arrived' (Skribnik 1999: 407)

Although such usage is secondary, the evidential can be seen in the following examples. In the northern Mansi story, the verb used to describe the moment when the hero learns of his father's identity (ons-um have-EV.PST.3SG) is structurally the same as the evidential, but can also be interpreted as a mirative, since he was surprised by the information.

(6) ja-ta, piy-risakwe Xuntaml-as-te,

well boy-DIM hear-PST-3SG(OBJ)

taw pakw posi wojkan oter as ons-um

3SG pine seed white hero father have-EV.PST.3SG

'Well, the boy heard that (allegedly) his father was the hero White Pine Seed' ([TEXT NOT REPRODUCIBLE IN ASCII] 2001: 28)

A few lines later, the boy tells his mother what he found out. At this point, he has gotten over his surprise and uses the first-person verb os-m-um, which marks information learned from hearsay. The main 'say' verb derives from the same etymological root as the verb in sentence (1) from Obdorsk Khanty, in which use of the evidential is required:

(7) oma, maxum nawram-et tox ti lawe-y-at:

mother people child-PL PCL say-PRS-3PL

am pakw posi wojkan otar as os-m-um

1SG pine seed white hero father have-EV.PST-1SG

'Mother, the children of our people say that my father was the hero White Pine Seed' ([TEXT NOT REPRODUCIBLE IN ASCII] 2001: 28)

Originating from nonfinites, the verbal forms used in Mansi to express the mirative can be more broadly understood to be evidentials. Derivational morphemes of participles grammaticalized into tense markers and voice (passive) markers. The present participle marker -n became a present tense marker and the past participle marker -m a past tense marker. The passive voice is marked by -ima, the marker of the past passive participle.

                    Indicative    Evidential

Active    Present   -[??]- + Vx   -n- + Vx
          Past      -s- + Vx      -m- + Vx
Passive   Present   -we + Vx      --
          Past      -wes + Vx     -ima

The two Ob-Ugric languages followed different paths of development as they grammaticalized. In Mansi, the subjective (indefinite) and objective (definite) conjugations can be distinguished in the active voice in both the indicative mood and evidential mood. This is not possible, however, in Khanty. Evidential forms in Khanty are identical to nonfinite + PPx forms, whereas Mansi uses personal suffixes (Vx). In this way, Mansi is a step ahead of Khanty with respect to nonfinite > finite grammaticalization.

4. Limitations of evidentiality research in Khanty

As seen above, evidential forms in Khanty are derived from nonfinites. In many cases, they are structurally identical to agent-marked participles. This grammatical homonymy confused earlier authors of linguistic grammars, who considered predicative nonfinites to be forms temporarily undergoing verbalization, rather than nonfinites. The phenomenon was considered to fall outside the domain of grammar, and it is only mentioned briefly (Redei 1965: 74) or not discussed at all (Honti 1984). The reason for its omission is the fact that it was considered a unique feature of songs (Sarkadi Nagy 1913: 252; Steinitz 1937 [1980]: 221; A. Jaszo 1969). W. Steinitz also observed a difference in how informants treated songs: when the song was sung, a nonfinite form would be used, and when it was written on paper, a finite verb would be used (Steinitz 1939 [1975]: 226).

The modality of nonfinite-derived verb forms was not addressed by researchers until the last third of the twentieth century. Nonfinites formed with the derivational morpheme -m- were observed with a special tense, which W. Steinitz called the historic perfect (1937 [1980]: 221; 1939 [1975]: 50). Use of similar forms found in the southern Khanty dialect Konda was characterized as "long ago" and "in narration" by Karjalainen (Karjalainen, Vertes 1964: 83). Moreover, evidentiality was omitted because grammars had, until recently, dealt only with examples from folklore texts, where these forms genuinely could not be found.

Non-eyewitness modality found in the language of Khanty hero songs was first investigated by Anna A. Jaszo. She was unable to uncover a consistent rule behind its usage, which she attributed to the ancient, mystified nature of the phenomenon (A. Jaszo 1976).

Research on nonfinite-derived finite verbs in the language of songs is made difficult by two factors. First, the first hundred years of documentation of the Khanty language, stretching from the middle of the nineteenth century to roughly the middle of the twentieth century, includes only folklore texts. Although spoken-language sources were collected by Steinitz in the 1930s, he too came across nonfinite-derived finite verbs only in the language of songs. Second, the songs that can be found do not display the same use of the evidential as in modern northern Khanty. The problems of the verbalization of nonfinites and the origin of evidential modality must, therefore, be solved without documentation of the older forms of the language. In order to investigate these questions, we must familiarize ourselves with the characteristics of the language of songs in Khanty and of the eastern dialects of Khanty.

4.1. The language of songs in Khanty differs greatly from the spoken language. Although there is no room to detail these differences here, it suffices to note that besides the most obvious archaisms, there are certain phonological, morphological, syntactic, and lexical phenomena used as stylistic features that cannot be explained as fossilized archaisms (Steinitz 1939 [1975]: 225-230; 1941 [1976]: 1-61; Csepregi 2009).

4.2. In eastern dialects of Khanty, the spoken language does not have a grammaticalized evidential mood. Participles are only rarely found at the end of a sentence in predicate position (see sentences (12) and (13) below), and they do not become finite verbs; this can only occur in the language of songs. Song examples from eastern Khanty can only be provided from Surgut Khanty.

5. Nonfinite-derived finite verbs in Surgut Khanty

In the language of songs in Surgut Khanty, both original finite verbs and nonfinite-derived finite verbs are used. The choice between the two types is motivated by style alone, not by semantics. For example, the word 'I'm thinking' is sometimes used with the present-tense verb nomeqse-l-em (think-PRS-1SG) and sometimes with the person-marked present participle nomeqse-t-am (think-PTC.PRS-1SG). The phenomenon appears not only in mythical songs but can also be observed in songs performed in non-spiritual circumstances. Its use in stories, however, follows a different rule than its use in songs.

Songs differ from prosaic language not only in their language but also in their cognitive background. Surgut Khanty singers believe that they have received mythical songs as a gift from the heavens, rather than learning them from other singers. Singers adopt the perspective of the hero of the song, rather than their own. These characteristics are true of the folklore of Siberian peoples as well (Hobhk 2012). The information status and its linguistic marking are also unique. In the words of one of my informants, the events in a song occur "in a different reality," and this can also be detected in the language, such as in the use of nonfinites in predicate position.

The myth of the six-legged moose has been documented both in its song and prosaic version from informants in the same family. The story was recorded by Laszlo Honti in 1976 from a male informant (Honti 1978), and I recorded the song twenty years later from the man's daughter (Csepregi 2003). Comparison of the two texts, which contain the same content and word choice but differ in form, produced interesting results. In the prosaic version, the story is narrated in third person; as a song, it uses first person. In the prosaic text, only the originally finite verbs appear in predicate position. The song, however, is full of grammaticalized verb forms formed from -t- and -m- morphemes, thus lacking clear marking of present or past tense. Due to these differences, the verbalization of nonfinites can be said to be a characteristic only of songs, rather than folklore in general.

(8) Surgut Khanty prose (Honti 1978: 128)

temi (...) t'et wanki-l-el

behold here trudge-PRS-3SG

'Behold, here trudges along ...'

(9) Surgut Khanty song (Csepregi 2003: line 79)

t'el tom wanki-le-t-al luw

from here that one trudge-FREQ-PTC.PRS-3SG PCL

'Behold, that one trudges along, indeed'

The following examples show a past-tense verb in objective conjugation, from prosaic use, and then a nonfinite-derived passive structure, from a song. The latter structure is unique to Surgut, because the person marking on the passive past participle does not indicate the agent, as would be expected for passive participles in attributive function, but instead indicates the patient. Verbalization is at a more advanced stage in Surgut Khanty than northern Khanty; Surgut allows person marking on nonfinites derived from passive participles, whereas northern Khanty does not (refer to Section 3).

(10) Surgut Khanty prose (Honti 1978: 129)

ma wal-em

1SG kill-PST.1SG.OBJ

'I killed it'

(11) Surgut Khanty song (Csepregi 2003: line 192)

ma-ne-pa wal-iley el-m-al


'it was indeed killed by me'

It cannot be said that there is no overlap between the grammar of the song and prosaic genres. Verb forms that are similar to those in sentence (11)--nonfinite-derived and functioning as verbs in predicate position--can be found in prosaic sources as well, but their occurrence is very rare. I have come across only two examples in Surgut Khanty:

(12) Surgut Khanty prose (Csepregi 2011: 14/10)

qat- lumi-ne toj-em awi panki-ne

house uninhabited-LOC have-PTC.PST girl fly agaric mushroom-LOC


get drunk-PTC.PST-3SG

'The girl of the spirit of the house got intoxicated on mushrooms'

(13) Surgut Khanty folk tale ([TEXT NOT REPRODUCIBLE IN ASCII] 2004: 147/607)

taqa jey-iw-ne tasen-ke warente-m-iw

PCL father-1PL-LOC rich-TRA do-PTC.PST-1PL

'Well, our father has made us rich'

Semantically, both sentences display inferential modality, which is referred to as a mental construct or reasoning (Willett 1988: 57). The statement in sentence (12) is based on the fact that drunk singing could be heard; the passerby thereby concluded that some fly agaric had been left in the house and was eaten by the girl of the spirit of the house, who then became intoxicated. According to a Khanty belief, uninhabited houses become occupied by spirits after seven years. The statement in sentence (13) was made by the protagonists of the story when they realized that the seemingly useless items they inherited from their father were in fact very valuable.

Since both of the sentences above display passive voice, the agent is marked with the LOC suffix, and the person of the patient is marked on the nonfinite, as in sentence (11), unlike in northern Khanty. This sentence structure is so rare that it is impossible to speak of a consistent rule. It can be considered an early stage in the development of what may become a marker of the evidential mood, which would be a unique innovation of the Surgut dialect.

6. Other ways to express inferential modality in Surgut Khanty

No grammaticalized strategy, therefore, exists in spoken Surgut Khanty to express evidentiality. There is, however, a strategy for marking information acquired from observation, and it defies typological expectations: it is formally a participial attributive (or possibly postpositional) structure, but it appears at the end of the sentence, behaves like a verbal predicate, and marks tense and person. Its structure is as follows: present/past participle + Px + 'place'. In Khanty dialects, the word tayi 'place' has grammaticalized into a nominalizer (e.g., 'living place' > 'life'), with verb-like functions only present in Surgut (Csepregi 2008). In one of the stories, the heroine stays at an old woman's home and wakes up the next morning to find the old woman busying herself around her. The heroine concludes:

(14) Surgut Khanty (Csepregi 1998: 74)

tu imi quntinte kil-m-al tayi

that old woman a long time ago get up-PTC.PST-3SG place

'(It seems) the old woman got up a long time ago'

If she had actually seen the old woman getting up, the speaker would have used a verb in the indicative mood:

(14a) tu imi noq kit

that old woman get up[PST.3SG]

'The old woman got up'

The guest sets out on her way, and the old woman tells her what to do, but she does not follow the advice. When she returns, the old woman suspects something is wrong and offers the following metaphor:

(15) Surgut Khanty (Csepregi 1998: 76)

qow arey qow mant ente tuw-m-a tayi

long song long story NEG bring-PTC.PST-2SG place

'(It seems) you didn't bring a long song, a long story (that is, you don't have much to say, you weren't successful)'

Sentence (14) suggests a conclusion based on experience and the senses; in sentence (15), the conclusion is more likely based on reasoning.
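
The pattern participle + person marker + tayi can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: the names PARTICIPLE and tayi_construction are hypothetical, the stems and person markers are copied from the transliterated examples (14) and (15), and no vowel alternation or other morphophonology is modelled.

```python
# Illustrative only: assembles the segmented form "stem-participle-person tayi".
# PARTICIPLE and tayi_construction are hypothetical names; stems and person
# markers are copied from the transliterated examples, and no vowel
# alternation or other morphophonology is modelled.
PARTICIPLE = {"present": "t", "past": "m"}


def tayi_construction(stem, tense, person_marker):
    """Join stem, participle marker and person marker, then add tayi 'place'."""
    return f"{stem}-{PARTICIPLE[tense]}-{person_marker} tayi"


print(tayi_construction("kil", "past", "al"))  # predicate of sentence (14)
print(tayi_construction("tuw", "past", "a"))   # predicate of sentence (15)
```

The function only concatenates attested segments; it makes no claim about which stems or persons the construction accepts.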

Sentences (14) and (15) contain active verbs. In the case of nonfinites, person marking indicates the agent. Although less common, the tayi structure can also be used with the passive voice, in which case the agent is marked with the LOC suffix and the patient is indicated by person marking on the nonfinite. The following sentence is from the same story as sentence (13) and is a variation on the same sentence:

(16) Surgut Khanty folk tale ([TEXT NOT REPRODUCIBLE IN ASCII] 2004: 147/612)

taqa jey-iw-ne tas-at way-at

PCL father-1PL-LOC richness-INSF money-INSF

mej-m-iw tayi

give-PTC.PST-1PL place

'Well, our father gave us richness and money (it seems)'

7. Origins of the evidential in Khanty

Due to gaps in the historical documentation of Khanty, as discussed in Section 4, two possible paths of development of the evidential in Khanty can be considered:

7.1. Nonfinite-derived finite verbs first appeared in the language of songs, without evidential meaning, merely as a stylistic tool. Their evidential usage then developed in northern Khanty and Mansi. The process was likely strengthened by the fact that Komi, which is in contact with both of the northern Ob-Ugric languages, also uses the evidential mood. The shift occurred independently in Khanty and Mansi, as the morphological and semantic features of the evidential differ in the two languages. The language of songs in eastern Khanty has preserved the use of nonfinites in predicate position, a feature that sets it apart from the spoken language. The use of nonfinites in predicate position is very rare in the modern spoken language, but if we consider the example of northern Khanty and Mansi, it is possible that it will spread to mark inferential modality. Signs of this shift can be observed in sentences (12) and (13).

7.2. Evidentiality was originally present in the dialects of Khanty and Mansi and entered the language of songs from the spoken language. As time passed, eastern Khanty lost the evidential category. A similar phenomenon was described by Olga Kazakevic for the Selkup language: use of the evidential has gradually disappeared from the upper Taz dialect in the last decade, but it has been preserved in folklore ([TEXT NOT REPRODUCIBLE IN ASCII] 2010: 324). This process may have occurred in Surgut Khanty as well, where the use of tayi structures to mark inferential modality appeared as a unique development.

Due to insufficient information about the history of the Ob-Ugric languages, we cannot decide between the two possible paths of development with complete certainty. An important characteristic of Siberian, and specifically Ob-Ugric, heroic epics is that the singer imagines a different world and performs the song not as him or herself, but as a number of heroes and gods. The source of information has a different role in the complicated relationship between the audience and the hero of the song, which presents a genuine need for a grammatical marker. An obvious solution is the use of nonfinites rather than verbs linked to specific tenses; this allows the speaker to express nothing more than relative time relations.


1--1st person; 2--2nd person; 3--3rd person; DIM--diminutivizer; DU--dual; EV--evidential; FREQ--frequentative; INF--infinitive; LOC--locative; NEG--negative word; OBJ--objective conjugation; PASS--passive; PCL--particle; PL--plural; PRS--present tense; PST--past tense; PTC.PRS--present participle; PTC.PST--past participle; Px--possessive person marking; PPx--marking of the agent on a nonfinite; SG--singular; Vx--verbal person marking.


Marta Csepregi

Eotvos Lorand University, Department of Finno-Ugrian Studies


* This study was conducted as part of Hungarian Scientific Research Fund (OTKA) research projects no. 104249 and 107793. Furthermore, I would like to thank Nora Kugler for her help with the final draft of this article. I have learned a great deal from her work (Kugler 2003; 2004; 2012).
