
PRONOMINAL DOUBLING IN ESTONIAN COMPLEX WH-QUESTIONS.

Introduction

Wh-fronting in content questions is common to most European languages, including the Finno-Ugric languages such as Estonian (see Metslang 1981), Finnish (see Vainikka 1989; Huhmarniemi 2012), Hungarian (see Toft 2001), Udmurt (see [phrase omitted] 1970), Komi and the Saami languages (see The Uralic Languages 1998). Example (1a) is an Estonian neutral declarative sentence with the SVO word order, whereas in (1b) the interrogative pronoun questioning the object of the verb occupies the left periphery and leaves behind an empty position.
(1) a. Liisa  loeb  ajalehte
Liisa-NOM read-3SG newspaper-PART
'Liisa is reading a newspaper'


In contrast, there are languages where wh-phrases mostly remain in situ. One such Uralic language is Tundra Nenets (see Salminen 2012), which has SOV as its neutral word order. The wh-pronoun functioning as the object thus follows the subject, as in (2), although complex and topicalized interrogative phrases may behave differently (Mus 2015 : 130-134, 153-160).
(2) Pid namge-m? xeta?  ([phrase omitted] 2005 : 48)
3sg  what-ACC  say-PST-3SG
'What did he say?'


The movement of the wh-phrase to the front of the clause is a case of movement to a non-argument position, which has been referred to as "A'-movement" in the transformational theories of syntax, of which the minimalist program is the most recent (see Chomsky 1995; 2000). The wh-phrase may also move out of the embedded clause to the front of the matrix clause, passing a possible final landing site at the lower clause-initial position. Such movement, often called "long-distance wh-movement", is illustrated in (3) and (4) for Hungarian and (5) for Finnish. In Hungarian, the distribution of long-distance wh-movement is subject to dialectal variation (Maracz 1991 : 153-154). However, if it is considered acceptable by speakers, then subjects, objects as well as adverbials may each move out of the CP within which they originate (Toft 2001 : 194-195). Finnish, on the other hand, only allows long-distance object and adjunct extraction, while subject extraction may be possible in colloquial speech to a limited extent (Huhmarniemi 2012 : 96-98).

In a number of languages, alternative strategies are used to form such long wh-questions, namely identical and non-identical pronominal doubling (see Fanselow 2006; Rett 2006; Schippers 2010a). In the former case, commonly referred to as "wh-copying", the higher and the lower clause are introduced by the same interrogative pronoun, as exemplified for German in (6). In the latter case, referred to as "partial wh-movement" or "wh-scope marking", the two wh-pronouns differ in phonological shape, as can be seen in the Hungarian example (7). The matrix clause is then introduced by a pronoun equivalent to what (or how, as in Polish; see Lubanska 2004), which provides no evident semantic content; its sole function appears to be to extend the scope of the lower wh-phrase, i.e. to signal that it is a direct question (Dayal 1994 : 138). In the doubling constructions, the complementizer may either become silent, as in German (where dass would normally introduce the embedded CP), or remain in its position at the front of the CP, like hogy in Hungarian.
(6) Wen  glaubst  du  wen  sie  liebt?
who-ACC believe-2SG you-NOM who-ACC she-NOM love-3SG
'Who do you think she loves?' (Pankau 2013 : 1)

(7) Mit  gondolsz, hogy kit  latott  Janos?
what-ACC think-2SG that who-ACC see-PST-3SG Janos-NOM
'Who do you think Janos saw?' (Horvath 1997 : 510)


Long-distance dependencies and syntactic doubling have hitherto not been studied in Estonian, even though both identical and non-identical doubling are used in long wh-questions (likewise, e.g. Romani, Frisian and some varieties of German and Dutch have been listed among such languages). There is, however, an interesting asymmetry.

Identical doubling tends to occur if an inanimate referent is questioned, as in (8), where the matrix clause and the subordinate clause are both introduced by the interrogative pronoun mis 'what'. If a person is questioned, non-identical doubling is preferred, as shown in (9), where the pronoun kes 'who' referring to a human remains in the subordinate clause, while the wh-scope marker in the matrix clause is mis, the least specified wh-phrase in Estonian. Whether these superficially different cases of pronoun doubling actually entail different syntactic structures is a question in itself, which is also tackled in this paper.
(8) Mis  sa  arvad,  mis  juhtus?
what-NOM you-NOM think-2SG what-NOM happen-PST-3SG
'What do you think happened?'

(9) Mis  sa  arvad,  kes  seda  tegi?
what-NOM you-NOM think-2SG who-NOM it-PART do-PST-3SG
'Who do you think did it?'


In regular long-distance wh-movement, the wh-phrase moving successive-cyclically to the higher left periphery is spelled out only once. The subordinate clause is then introduced by the declarative complementizer et, which is obligatory. Note that in (10) kes surfaces in the partitive form keda, partitive being the default object-marking case in Estonian. With subject extraction, long-distance wh-movement seems to be limited in a way similar to Finnish.
(10) Keda  ta  utles,  et  ta  tunneb?
who-PART (s)he-NOM say-PST-3SG that (s)he-NOM know-3SG
'Who did (s)he say (s)he knew?'


Verbs that permit long wh-extraction from a finite CP have been referred to as "bridge verbs" (Felser 2004 : 549) and include epistemic verbs, such as arvama 'think', utlema 'say', vaitma 'claim', uskuma 'believe', lootma 'hope' and otsustama 'decide' in Estonian. Nevertheless, Featherston (2004 : 205) concludes, based on German, that the bridge feature should be seen as a continuum, not a categorical distinction, meaning that there is no absolute group of bridge verbs; language users perceive some verbs as more natural than others in long-distance wh-questions.

The purpose of the study presented in this article was to determine to what extent different pronominal patterns are attested in Estonian complex wh-questions, and how their acceptability is related to (a) the animacy/inanimacy of the object being questioned, (b) whether a subject or an object is questioned, and (c) lexical factors. The object of study is restricted to biclausal wh-questions that contain the verb arvama 'think' or utlema 'say' in the matrix clause, which presumably entail somewhat different preferences for the long wh-questions. This study concerns subject and partial object questions, thereby excluding extraction of the total object and other arguments.

In order to test the acceptability of different pronominal patterns, I conducted a survey among 89 non-philologist native speakers of Estonian. In addition, I used corpus analysis to see what evidence large online corpora provide for long wh-questions in both standard written Estonian and rather informal written language use. The methodology is described in Section 2, the results are reported and discussed in Section 3. The following section gives an overview of different pronominal combinations in Estonian complex wh-questions and proposes a syntactic analysis for each pattern, which is a further important aim of this paper.

1. Possible pronominal patterns and their syntactic structure

1.1 Interrogative pronouns and nominative-partitive case alternation

The Estonian interrogative/relative pronoun kes 'who' primarily refers to a human object, whereas the pronoun mis 'what' refers to an inanimate object, action or situation. Kes can also refer to a group of people, an institution or an animal if they are considered an actor in the situation. If they are treated as an undergoer, mis is preferred. At the same time, the singular-plural opposition is neutralized. Kes and mis usually preserve their singular form even when the NP they refer to is in plural (Metslang 1981 : 63-72; Erelt, Erelt, Ross 2007 : 560-561).

Based on Dutch, Boef (2013) has shown that the patterns of pronominal doubling in long-distance dependencies are best accounted for by the feature specifications of the relevant pronouns, namely that the distribution of the pronouns depends on which features they can spell out. In Estonian, these features are semantic animacy (roughly human/non-human) and the type of the argument questioned (subject/object), the latter being derived from the case form of the pronoun.

When functioning as a subject, kes and mis consistently appear in the default nominative case form. In the partial object function, kes is always assigned the partitive case and surfaces as keda, e.g. Keda sa ootad? 'Who are you waiting for?'. Mis, on the other hand, can preserve its nominative form, so that both case forms are possible, e.g. Mida ~ Mis sa teed? 'What are you doing?'; Mida ~ Mis sa poest toid? 'What did you bring back from shopping?'. (1)

This kind of nominative-partitive case alternation also applies in complex wh-questions. This is what gives rise to the non-identical doubling patterns in long wh-questions with a non-human referent. Firstly, the matrix clause can either contain a nominative or a partitive what-phrase, independently of whether we have a subject or object question, whereas the subordinate clause can be introduced by the partitive form only if an object is questioned. Therefore, the identical pattern mida-mida (11a) and the non-identical pattern mis-mida (11b) cannot occur in a complex subject question.
(11) a. Mida  ta  utles,  mida  ta  teha  otsustas?
what-PART (s)he-NOM say-PST-3SG what-PART (s)he-NOM to-do decide-PST-3SG
'What did (s)he say (s)he decided to do?'  [OBJ question]

b. Mis  sa  arvad,  mida  Laura  Peetrile  kinkis?
what-NOM you-NOM think-2SG what-PART Laura-NOM Peter-ALL give-PST-3SG
'What do you think Laura gave to Peter?'  [OBJ question]


The reverse mida-mis pattern and the identical mis-mis pattern are compatible with both subject (12a, 13a) and object questions (12b, 13b).
(12) a. Mida  nad  utlesid,  mis  probleeme  tekitab?
what-PART they-NOM say-PST-3PL what-NOM problems-PART cause-3SG
'What did they say causes problems?'  [SUBJ question]

b. Mida  sa  arvad,
what-PART you-NOM think-2SG
mis vanaema seekord kupsetab?
what-NOM grandmother-NOM this time bake-3SG
'What do you think grandmother is baking this time?'  [OBJ question]

(13) a. Mis  te  arvate,  mis  see  on?
what-NOM you-NOM think-2PL what-NOM it-NOM be-3SG
'What do you think it is?'  [SUBJ question]

b. Mis  sa  arvad, mis  nad  vastasid?
what-NOM you-NOM think-2SG what-NOM they-NOM reply-PST-3PL
'What do you think they replied?'  [OBJ question]


In the non-doubling construction, mis can also be either nominative or partitive if extracted from an object position, as in (14).
(14) Mida ~ Mis  sa  arvad,  et  ta  utles?
what-PART / what-NOM you-NOM think-2SG that (s)he-NOM say-PST-3SG
'What do you think (s)he said?'  [OBJ question]


The case of the what-phrase can have a similar alternation in the matrix clause of long wh-questions with a human referent, independently of the case and function of the pronoun kes in the lower clause. An example of a subject question is given in (15).
(15) Mida~Mis  sa  arvad,  kes  Harriga  tantsis?
what-PART / what-NOM you think-2SG who Harry-COM dance-PST-3SG
'Who do you think danced with Harry?'


Following Schoorlemmer (2009 : 126-128) and Boef (2013 : 48-49), I take the syntactic feature specifications to be dependent on the morphological realization of these features, meaning that the lack of a value (under-specification) for an attribute corresponds to a morphologically unrealized feature. The nominative form of the wh-pronoun mis 'what' can then be considered completely underspecified, for it is underspecified for definiteness, number, animacy as well as case (since the nominative case is morphologically unrealized). It can function both as a subject and an object, whereas the nominative form of kes, which is sensitive to animacy, can solely function as a subject. Partitive forms of the pronouns are specified for case and are compatible with objects only. The referential possibilities of the interrogative pronouns mis and kes and their partitive forms mida and keda are presented in Table 1.
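
The referential possibilities described in this section can be restated as a small lookup. The following Python sketch is purely illustrative and is not part of the original analysis: the dictionary layout and the names FORMS, can_spell_out and lower_clause_forms are assumptions introduced here, and the entries encode only what the prose above states (Table 1 itself is not reproduced). "OBJ" stands for the partial object discussed in this paper.

```python
# Illustrative encoding of the prose in Section 1.1; all names are hypothetical.
# case=None stands for the morphologically unrealized nominative.
FORMS = {
    "mis":  {"stem": "mis", "case": None,   "functions": {"SUBJ", "OBJ"}},
    "mida": {"stem": "mis", "case": "PART", "functions": {"OBJ"}},
    "kes":  {"stem": "kes", "case": None,   "functions": {"SUBJ"}},
    "keda": {"stem": "kes", "case": "PART", "functions": {"OBJ"}},
}


def can_spell_out(form, function):
    """Return True if the pronoun form is compatible with the questioned argument."""
    return function in FORMS[form]["functions"]


def lower_clause_forms(stem, function):
    """List the forms of a stem that can introduce the lower clause."""
    return sorted(
        form
        for form, spec in FORMS.items()
        if spec["stem"] == stem and function in spec["functions"]
    )


if __name__ == "__main__":
    # Non-human subject question: only the nominative form qualifies.
    print(lower_clause_forms("mis", "SUBJ"))
    # Non-human object question: both case forms qualify.
    print(lower_clause_forms("mis", "OBJ"))
```

Under this encoding, a partitive form is never returned for a subject question, which mirrors the observation that mida-mida and mis-mida cannot occur in complex subject questions.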

Potential pronominal combinations in Estonian long-distance wh-questions are given in Table 2. These possible patterns are theory-neutrally categorized as non-doubling, identical doubling and non-identical doubling patterns, judging by the surface form of the pronouns and leaving syntactic interpretations aside.
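
The theory-neutral categorization by surface form can be made explicit with a short sketch. It assumes that a pattern is written as two hyphen-separated elements, the first from the matrix clause and the second from the subordinate clause (e.g. mis-mida, keda-et), as in the running text; the function name classify_pattern is hypothetical and the code makes no claim about the underlying syntax.

```python
def classify_pattern(pattern):
    """Classify a surface pattern such as 'mis-mida' or 'keda-et'.

    The labels follow the three surface categories used in the text.
    """
    higher, lower = pattern.split("-")
    if lower == "et":
        # The lower clause is introduced by the complementizer, not a pronoun.
        return "non-doubling"
    if higher == lower:
        return "identical doubling"
    return "non-identical doubling"


if __name__ == "__main__":
    for p in ["mis-mis", "mis-mida", "mis-kes", "keda-et"]:
        print(p, "->", classify_pattern(p))
```

Note that on this surface-based view mis-mida counts as non-identical doubling even though both elements are forms of the same pronoun.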

1.2. The structure of complex wh-questions: long-distance movement or not?

As has been shown, three strategies are used across languages to form complex wh-questions, and the presumption is that all of them may occur at least in the colloquial use of Estonian.

The case of pronoun-doubling labelled wh-copying, where the wh-phrase occupying the specifier (Spec) of the matrix CP is identical to that in the Spec of the subordinate CP, has in general been analyzed as a structural variant of long-distance wh-movement (e.g. Fanselow, Mahajan 2000; Felser 2004; Rett 2006; Pankau 2013). The two types of long-distance wh-dependency are taken to be derived by a successive-cyclic movement of the interrogative pronoun via the lower SpecCP. In regular long-distance wh-movement, only the head of the movement chain is spelled out, as shown in (16). In case of wh-copying, an intermediate copy of the wh-phrase gets spelled out in addition to the head of the chain, as illustrated in (17), assuming that such doubling of the pronoun kes would be acceptable to some speakers of Estonian.

One can suppose that similar syntactic analysis should also account for other identical doubling patterns in Estonian, although the available research on wh-copying has, to my knowledge, regrettably not paid attention to extraction of non-human pronouns. On the other hand, the higher pronoun of the mis-mis and mida-mida constructions, illustrated in (8), (11a) and (13), could be considered similar to the scope marking what-phrase in the non-identical doubling constructions, such as in (9), (11b) and (12), which are obvious cases of partial wh-movement.

Two major competing approaches have been proposed for the analysis of partial wh-movement. The direct dependency approach (DDA, first proposed by van Riemsdijk 1982; adopted by, e.g., McDaniel 1989; Sabel 1998; Cheng 2000; Barbiers, Koeneman, Lekakou 2008) assumes that there is a direct link between the what-phrase and the wh-phrase in the subordinate clause (e.g. by coindexation or partial spell-out). In the indirect dependency approach (IDA, first proposed by Davidson 1984; adopted by, e.g., Dayal 1994; Fanselow, Mahajan 2000; Horvath 2000; Felser 2001; Stepanov, Stateva 2006; Schippers 2010a; 2010b), the scope marker is analyzed as an argument or expletive that originates in the object position within the matrix clause from where it may independently move to the matrix SpecCP, whereas the contentful wh-phrase in the embedded clause also moves independently to its local SpecCP. This means that there are two separate wh-movement chains. Hence, while the DDA assumes both wh-copying and partial wh-movement to be surface alternatives of long-distance wh-movement, the IDA assumes no structural similarity between partial and long-distance wh-movement, in fact treating the former as an option for avoiding the latter. Nonetheless, partial wh-movement is only possible with bridge verbs (Klepp 2001; Schippers 2010b).

As Horvath (2000) has concluded, attempts at a single uniform analysis of partial wh-movement have run into problems, and DDA and IDA simply account for different languages. I suggest that IDA should be adopted for Estonian as a language where the what-phrase is intuitively linked, i.e. coindexed, with the entire embedded clause, not just with the wh-phrase contained in it, as illustrated in (18).

In Hungarian, which is the only other Finno-Ugric language where the doubling of wh-pronouns has been analyzed, the what-phrase is also associated with the whole subordinate CP and taken to originate in an argument position, in compliance with IDA (Horvath 1997). One argument is that the Hungarian wh-scope marker mi 'what' exhibits case-marking and triggers agreement inflection on the verb independently of the case and agreement properties of the embedded wh-phrase (see Horvath 2000 : 282 for examples). The presumption of two independent wh-chains would similarly explain why the Estonian pronoun mis may surface in different case forms in the matrix and subordinate clauses, as in (11b), (12a) and (12b).

Moreover, if the what-phrase is considered to be the standard wh-word for clausal complements, then the sentence (19b) would be analyzed as a version of (19a), as exemplified by Fanselow (2006 : 451) for German; a similar analysis seems to apply for the Estonian sentences (20a) and (20b). According to Dayal (2000 : 187) and Fanselow (2006 : 451), the partial wh-movement construction (20b) can also be analyzed as a monosentential counterpart of sequential questions shown in (20c).
(19) a. Was  denkst  du?
what-ACC think-2SG you-NOM
'What do you think?'

b. Was  denkst  du  wer  gekommen  ist?
what-ACC think-2SG you-NOM who-NOM come-PP be-3SG
'Who do you think has come?'

(20) a. Mis ~  Mida  sa  arvad?
what-NOM ~ what-PART you-NOM think-2SG
'What do you think?'

b. Mis ~  Mida  sa  arvad,  kes  tuli?
what-NOM ~ what-PART you-NOM think-2SG who-NOM come-PST-3SG
'Who do you think has come?'

c. Mis ~  Mida  sa  arvad?  Kes  tuli?
what-NOM ~ what-PART you-NOM think-2SG who-NOM come-PST-3SG
'What do you think? Who has come?'


Another argument for IDA is that a clausal correlative equivalent to 'it' can be found in Estonian declarative sentences, such as (21). Dayal (1994 : 149-151), based on Hindi, and Stepanov and Stateva (2006 : 2144-2115), based on Slavic languages, assume the what-phrase to be a similar correlative of the subordinate clause, with which it forms a constituent.
(21) [Ma arvan  [seda_i],  [CP_i et  Anne suudles  Martinit]]
I-NOM think-1SG it-PART that Anne kiss-PST-3SG Martin-PART
'I think that Anne kissed Martin'


The question remains whether the ambiguous patterns mis-mis and mida-mida represent partial wh-movement, or whether they instantiate wh-copying, which would then be restricted by the animacy of the referent. In the former case, those patterns should occur and be accepted to a similar extent as the partial wh-movement pattern mis-kes and its variants. This would mean that the partial wh-movement construction could be used for both human and non-human referents, as well as both subject and object questions, leaving no obvious need for other constructions fulfilling the same purpose.

A tendency has been noted that long-distance wh-movement and partial wh-movement are generally in complementary distribution: The latter shows up in languages where the former is not freely permitted; either dominates the other. Wh-copying, on the other hand, co-occurs with long-distance wh-movement, compared to which it is always of a secondary nature (Stepanov, Stateva 2006; Schippers 2010a). If partial wh-movement is the preferred construction for forming complex wh-questions in Estonian, then one expects the non-doubling construction to have a limited use and the copy construction to be even rarer. The matter will be readdressed in Section 3 in the light of survey and corpus data.

2. Methodology

2.1. Corpus analysis

In order to observe the usage of complex wh-questions in written Estonian, I carried out corpus queries, using language data collected from fiction and newspapers on the one hand and from websites on the other hand. Firstly, I ran a query in three subcorpora of the "Eesti keele koondkorpus" (Estonian Reference Corpus; http://www.cl.ut.ee/korpused/segakorpus/), namely, Fiction from the year 1990 onwards (5.6 million words), Daily "Postimees" from the years 1995-2000 (32.9 million words) and Daily "Eesti Paevaleht" from the years 1995-2007 (87.9 million words), which all represent standard written Estonian. Then I repeated the query in the Estonian web corpus etTenTen (http://www2.keeleveeb.ee/dict/corpus/ettenten/about.html), which is based on 686,000 Estonian websites downloaded from the Internet (270 million words), assuming that Internet language use reflects colloquial language phenomena and therefore shows more variation in the pronominal patterns of long wh-questions.

I restricted the search phrase to the possible matrix clause sequences containing a second-person pronoun (singular sa or plural/polite te), e.g. mis sa arvad 'what do you-SG think' or keda te utlete 'who do you-PL say'. With the more frequent verb arvama, I formed search strings in the present tense only. With utlema, I expanded the query by including the imperfect verb forms, e.g. mida sa utlesid 'what did you say'. Out of the results, I took into account all bi-clausal subject/object questions, regardless of whether they formed a sentence by themselves or a part of a sentence that contains further subordinate clauses, such as (22).
(22) Mis  sa  arvad,  mida  sa  naed,
what-NOM you-NOM think-2SG what-PART you-NOM see-2SG
kui sa  umber poorad?  (Estonian web corpus etTenTen)
if you-NOM around turn-2SG
'What do you think you will see if you turn around?'
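
The construction of the search strings described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the query actually run: the set of wh-forms, the names WH_FORMS, VERB_FORMS and build_search_strings, and the forms utled and utlesite (which are not quoted in the text) are introduced here, and the spelling follows the examples in this text, which omit diacritics.

```python
from itertools import product

# Assumed set of matrix-clause wh-forms (hypothetical; the text only gives examples).
WH_FORMS = ["mis", "mida", "kes", "keda"]

# Second-person verb forms: arvama in the present tense only,
# utlema in the present and the imperfect, as described in the text.
VERB_FORMS = {
    "sa": ["arvad", "utled", "utlesid"],
    "te": ["arvate", "utlete", "utlesite"],
}


def build_search_strings():
    """Return every wh-form + pronoun + verb sequence as a search phrase."""
    strings = []
    for wh, (pronoun, verbs) in product(WH_FORMS, VERB_FORMS.items()):
        for verb in verbs:
            strings.append(f"{wh} {pronoun} {verb}")
    return strings


if __name__ == "__main__":
    for phrase in build_search_strings():
        print(phrase)
```

The hits returned for such phrases would still need manual filtering, since only bi-clausal subject and object questions were taken into account.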


2.2. Acceptability judgment test

Since complex wh-questions are not very frequent and some pronominal patterns that do not occur in written language may be used in colloquial language, I conducted an acceptability judgement test among 89 native Estonian speakers.

The study involved four factors: 1) type of the pronominal pattern (identical doubling, non-identical doubling, non-doubling), 2) type of the argument questioned (subject, object), 3) animacy of the object being questioned (human, non-human), and 4) nominative-partitive case alternation.

I took all potentially possible pronominal combinations given in Table 2 into consideration when constructing the test sentences, which were each bi-clausal, i.e. long root wh-questions with the matrix verbs arvama 'think' and utlema 'say'. The test also included alternative constructions containing either a non-finite adverbial clause with a gerund (sinu/teie arvates 'in your-SG/your-PL opinion', literally 'in your thinking'), a postpositional phrase (Martini sonul 'according to Martin') or an adverb (kuuldavasti 'as they say').

All in all, I constructed 17 sentences with each of the bridge verbs and 4 alternative sentences, resulting in 38 test sentences. In doing so, I only employed the vocabulary represented among the 10,000 most frequent lemmas in Kaalep, Muischnek 2002, whilst avoiding unusual names. The bridge verbs appeared in second- or third-person forms, the verb arvama in the simple present and utlema in the simple past tense, e.g. Mida sa arvad, keda ta peole kutsub? 'Who do you think (s)he will invite to the party?', Mis ta utles, mis seda mura tekitab? 'What did (s)he say causes this noise?'

To prevent the informants from ruling out variants that they might find acceptable in spoken language but not in written language, the test sentences were presented orally, not on the questionnaire. The recorded sentences were played in a mixed order with 10-second breaks, during which the participants had to evaluate them using a five-point Likert-type scale, with the scale points defined as follows: 5--perfectly acceptable, 4--rather acceptable, 3--neutral, 2--rather unacceptable, 1--absolutely unacceptable. The sentences were not repeated because it was important that the judgements be based on immediate reaction.

The test was performed in three groups. The informants included 29 upper secondary school students (15 male, 14 female, aged 16-17), 37 university students (20 male, 17 female, aged 21-40) and 23 volunteering employees of a private company (13 male, 10 female, aged 23-62). The mean age of the participants was 25.8 (SD = 11.1). The university students and employees were required to have no linguistic background. The secondary school students filled in a printed questionnaire; the other two groups gave their responses online. I preferred a conveniently accessible sample in order to carry out the test in controlled conditions, so that the respondents could not rewind the recording and rethink their judgements.

For every test sentence, I calculated the median (Mdn) to measure the central tendency of the judgements, which indicates the likeliest response, and the interquartile range (IQR), i.e. the difference between the first and third quartiles of the distribution, to measure dispersion. Due to the ordinal nature of the Likert type data where the scale values cannot be presumed equal, these measurements have been recommended in place of means and standard deviations, which could lead to a misinterpretation of the findings of a survey (see, e.g., Blalock 1979; Jamieson 2004; Allen, Seaman 2007; Kostoulas 2014).
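
The two measures can be computed as in the following sketch. The ratings are invented for illustration and are not survey data; the function name median_and_iqr is hypothetical, and the choice of the "inclusive" quartile method is an assumption, since the text does not state how the quartiles were obtained (other methods can give slightly different IQR values).

```python
import statistics


def median_and_iqr(ratings):
    """Return the median and interquartile range of ordinal ratings (1-5)."""
    mdn = statistics.median(ratings)
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(ratings, n=4, method="inclusive")
    return mdn, q3 - q1


if __name__ == "__main__":
    # Hypothetical judgements for one test sentence, not taken from the survey.
    hypothetical_ratings = [5, 5, 4, 4, 4, 3, 2, 5, 4, 1]
    mdn, iqr = median_and_iqr(hypothetical_ratings)
    print(f"Mdn = {mdn}, IQR = {iqr}")
```

statistics.quantiles is available from Python 3.8 onwards.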

Admittedly, the test was exploratory in nature. The absence of filler sentences and the low number of test sentences may have affected the results. In designing the test, I took into account that, despite the abundance of control conditions, it should not become so long and elaborate that the participants would lose focus. The idea was to offer a first insight for further research.

3. Results and discussion

3.1. Corpus analysis

Nearly all examples found in each of the corpora involved the verb arvama 'think'. The query in the reference corpus (126.4 million words) returned 37 relevant examples, while 272 relevant examples were found in the web corpus (270 million words). In either case, only one example contained the verb utlema 'say'. Both examples were in the simple past tense.
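
Because the two corpora differ in size, the raw counts are easier to compare once they are normalized. The sketch below is an illustrative addition rather than a calculation reported in the study; it uses only the counts and corpus sizes given in the text, and the names CORPORA and per_million are hypothetical.

```python
# Counts and sizes as stated in the text; sizes are in millions of words.
CORPORA = {
    "reference corpus": {"examples": 37, "million_words": 5.6 + 32.9 + 87.9},
    "etTenTen web corpus": {"examples": 272, "million_words": 270.0},
}


def per_million(examples, million_words):
    """Relevant examples per million words of corpus text."""
    return examples / million_words


if __name__ == "__main__":
    for name, data in CORPORA.items():
        rate = per_million(data["examples"], data["million_words"])
        print(f"{name}: {rate:.2f} relevant examples per million words")
```

On these figures the rates are roughly 0.29 and 1.01 relevant examples per million words, so the construction is about three and a half times as frequent in the web corpus, within the limits of the second-person search strings used.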

In the corpus of standard written Estonian compiled of fiction and newspaper texts, most examples of long wh-questions with the verb arvama questioned a non-human object, as outlined in (23). In this case, all 17 examples of subject questions involved the identical doubling pattern mis-mis. Out of the 12 object questions, most involved the nominative-partitive non-identical pattern mis-mida, while the identical pattern mis-mis was used once. In addition, seven examples were found of a long wh-question with a human referent. All of them questioned a subject and involved the non-identical doubling pattern mis-kes. The only example with the verb utlema, which was a subject question, also involved a human referent and the mis-kes pattern. There were no examples of the non-doubling construction with the complementizer et, nor of identical doubling in case of a human referent.

(23) Results of the corpus query: fiction and newspapers
36 examples with arvama 'think':
  HUMAN, SUBJ question: 7 examples of non-identical doubling (mis-kes)
  NON-HUMAN, SUBJ question: 17 examples of identical doubling (mis-mis)
  NON-HUMAN, OBJ question: 11 examples of non-identical doubling (mis-mida) and 1 example of identical doubling (mis-mis)
1 example with utlema 'say':
  HUMAN, SUBJ question: non-identical doubling (mis-kes)


In the etTenTen web corpus, where the data largely originates from blog and forum posts as well as from commentaries on news websites, significantly more examples were found, affirming the colloquiality of the phenomenon. Again, the majority of questions had a non-human referent, in which case all three types of pronominal patterns were attested this time. Table 3 presents the frequencies and percentages of the pronominal patterns represented in long wh-questions with the verb arvama.

In case of a human referent, again only non-identical doubling emerged with the nominative mis in the matrix clause. In wh-questions with a non-human referent, identical doubling (mis-mis) dominated if a subject was questioned, which was much more common than questions regarding an object, just like in the reference corpus. If an object was questioned, the non-identical doubling pattern mis-mida was preferred. All possible combinations of nominative and partitive forms of mis occurred; however, the patterns with mis in the higher clause were preferred to those with the partitive form mida in the higher clause, probably for reasons of economy.

Non-doubling was observed as well, although much less frequently than the doubling patterns. It occurred with both subject (mis-et) and object extraction (mida-et). The only example with the verb utlema, which involved a non-human referent and object extraction, also employed standard long-distance movement, with the nominative pattern mis-et.

In conclusion, the corpus data analyzed indicates that if a person is questioned, non-identical doubling is strongly preferred in Estonian complex wh-questions, while identical and non-identical doubling are similarly common in case of a non-human referent. One can further conclude that the identical doubling pattern mis-mis is best compatible with subject questions, while the nominative-partitive non-identical pattern mis-mida is best compatible with object questions. All in all, doubling patterns with a nominative mis in the matrix clause appear to be preferred; long wh-questions without pronominal doubling look rare.

Moreover, arvama appears to be a substantially more frequent bridge verb than utlema. This is in line with Featherston's (2004 : 182) claim that movement restrictions are not merely syntactic but must be related to lexical factors. Based on German, he has pointed out that the verb meinen 'think' allows long-distance wh-extraction, whereas hoffen 'hope' seems less natural and bezweifeln 'doubt' feels entirely inappropriate in the construction. 'Think' has also been found to be by far the most frequent bridge verb in, e.g. Dutch and English (Verhagen 2006 : 334-335).

The corpus data further reveals that certain verbs tend to occur in the lower clause more often than others. These are olema 'be' (30.9% of all cases, e.g. Mis sa arvad, kes need inimesed on? 'Who do you think these people are?'), tegema 'do' (16.4%, e.g. Mis te arvate, mida mees tegi? 'What do you think the man has done?'), juhtuma 'happen' (9.6%, e.g. Mis te arvate, mis siis juhtus? 'What do you-PL think happened then?') and saama 'become, happen to' (7.4%, e.g. Mis sa arvad, mis riigist saab? 'What do you think will happen to the country?').

3.2. Acceptability judgement test

Similarly to the corpus data, the informants of the acceptability judgement test strongly preferred non-identical doubling in long wh-questions with a human referent. With the verb arvama 'think', all non-identical doubling patterns were found to be rather or perfectly acceptable by most of the informants (73.1%-83% depending on the pattern). In object questions, mis-keda (Mdn = 5, IQR = 1) was preferred to mida-keda (Mdn = 4, IQR = 2), whereas in subject questions there was little difference in the perception of the patterns mis-kes and mida-kes (Mdn = 4, IQR = 1 in both cases).

If the object was extracted, regular long-distance wh-movement (keda-et: Mdn = 4, IQR = 2.5) was also rated acceptable by more respondents (52.8%) than rated it unacceptable (29.2%). With subject extraction, however, the responses to the non-doubling pattern kes-et varied to a greater extent (Mdn = 3, IQR = 2): the pattern was considered acceptable by 40.5% and unacceptable by 41.5% of informants.

Identical doubling patterns kes-kes (Mdn = 3, IQR = 3) and keda-keda (Mdn = 2, IQR = 2) were most often seen as rather or absolutely unacceptable (by 49.5% and 64.1% of respondents, respectively), even though the fact that some informants (32.6% and 16.9%, resp.) still found them acceptable implies that they may occur in colloquial language.

The distribution of judgements for pronominal patterns in long wh-questions that contain the verb arvama and question a person is presented in Table 4. The number of informants who considered a pattern rather or perfectly acceptable is in the light grey column, while the number of informants who considered a pattern rather or absolutely unacceptable is in the dark grey column, and the number of informants who remained neutral is in the white column (the same holds for subsequent tables). The modes, i.e. most frequent responses, are marked by giving the number of the respective responses in bold.
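
The grouping of the five scale points into three columns, together with the mode, can be sketched as follows. The ratings are invented and do not reproduce any row of the tables; the function name summarize and the shape of its return value are assumptions made for illustration.

```python
from collections import Counter


def summarize(ratings):
    """Collapse 1-5 ratings into acceptable / neutral / unacceptable counts."""
    counts = Counter(ratings)
    n = len(ratings)
    acceptable = counts[4] + counts[5]    # rather or perfectly acceptable
    neutral = counts[3]
    unacceptable = counts[1] + counts[2]  # rather or absolutely unacceptable
    top = max(counts.values())
    # All scale points that share the highest frequency count as modes.
    modes = sorted(r for r, c in counts.items() if c == top)
    return {
        "acceptable": acceptable,
        "neutral": neutral,
        "unacceptable": unacceptable,
        "acceptable_pct": round(100 * acceptable / n, 1),
        "modes": modes,
    }


if __name__ == "__main__":
    # Hypothetical judgements for one pattern, not taken from the survey.
    print(summarize([5, 4, 4, 3, 2, 1, 5, 5]))
```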

With the bridge verb utlema 'say', non-identical doubling was again notably preferred in case of a human referent. However, as shown in Table 5, only the patterns mis-kes (Mdn = 5, IQR = 1) and mis-keda (Mdn = 4, IQR = 2) seem to be generally accepted (by 82.1% and 65.1% of respondents, resp.). The patterns mida-kes (Mdn = 4, IQR = 2) and mida-keda (Mdn = 3, IQR = 2), which were found acceptable in 54% and 45% of the cases, resp., and received a neutral response in more than 20% of the cases, rather left the respondents indecisive.

The identical doubling patterns kes-kes (Mdn = 2, IQR = 1) and keda-keda (Mdn = 2, IQR = 2) were mainly rated unacceptable by the informants (79.8% and 71.9%, resp.). The non-doubling patterns kes-et and keda-et (Mdn = 3, IQR = 2 in both cases) were also more often rejected (by 48.4% and 39.4% of respondents, resp.) than approved (by 31.4% and 35.9%, resp.).

As soon as one turns to wh-questions with a non-human referent, identical doubling becomes perfectly acceptable, as illustrated by Table 6, which presents the judgements for sentences with the matrix verb arvama. The partitive pattern mida-mida used in an object question (Mdn = 4, IQR = 2) was, however, rated considerably lower than the nominative pattern mis-mis, which can be used both in subject (Mdn = 5, IQR = 0.5) and object questions (Mdn = 5, IQR = 1): the former was perceived as acceptable by 64.1% of informants, whereas the latter was accepted by nearly all of them, regardless of the argument being questioned.

Likewise, the non-identical pattern with the partitive form in the higher clause, mida-mis (SUBJ question: Mdn = 4, IQR = 2; OBJ question: Mdn = 4, IQR = 1), was found acceptable by fewer informants (65.2% and 78.6%, resp.) than mis-mida (Mdn = 5, IQR = 1) used in an object question, which was accepted by 93.2% of respondents.

In case of object extraction, the non-doubling patterns mis-et (Mdn = 4, IQR = 2) and mida-et (Mdn = 5, IQR = 1) were also mainly considered acceptable (by 73% and 79.7% of informants, resp.). In case of subject extraction, however, mis-et (Mdn = 3, IQR = 2) was found to be unacceptable slightly more often (in 38.2% of the cases) than acceptable (in 34% of the cases).

With the verb utlema, most of the patterns were again considered to be rather acceptable, with the exception of the non-doubling mis-et if a subject is extracted (Mdn = 3, IQR = 2.5), just like in the case of arvama, and the identical object extraction pattern mida-mida (Mdn = 3, IQR = 1). Nevertheless, even in these cases there were more informants who found the pattern acceptable (49.4% and 45%, resp.) than those who rated it unacceptable (29.2% and 32.6%, resp.), as can be seen in Table 7.

The non-identical doubling patterns were, in general, all perceived similarly (Mdn = 4, IQR = 2); mida-mis was considered acceptable by 66.3% of respondents, both in case of subject and object extraction, and the object extraction pattern mis-mida by 70.8%. The identical doubling pattern mis-mis (SUBJ question: Mdn = 4, IQR = 2; OBJ question: Mdn = 4, IQR = 1) was considered acceptable by more informants (66.3%) in case of subject extraction than in case of object extraction (52.8%). Similarly, the non-doubling object extraction patterns mis-et (Mdn = 4, IQR = 3) and mida-et (Mdn = 4, IQR = 2) were accepted by more than half of the informants (53.9% and 55%, resp.).

The test sentences containing the gerund arvates (Kes seda sinu arvates tegi? 'Who did it, in your opinion?', Mida ma teie arvates tegema peaks? 'What should I do, in your-PL opinion?'; Mdn = 5, IQR = 1 in both cases) were considered acceptable by 92.1% and 94.3% of informants, resp., and were predominantly rated perfectly acceptable. This suggests that such a non-finite construction may often be used as an alternative to long wh-questions with the verb arvama. Nevertheless, the biclausal subject question with the doubling pattern mis-mis and the object question with the pattern mis-mida were rated as highly as their uniclausal counterparts.

Out of the uniclausal alternatives to complex wh-questions with the verb utlema, the one featuring the postposition sonul 'according to' (Kes selle auto Martini sonul korda tegi? 'Who fixed this car, according to Martin?'; Mdn = 5, IQR = 1) was preferred to the one featuring the adverb kuuldavasti 'as they say' (Mida Anne kuuldavasti oppida kavatseb? 'What do they say Anne is planning to study?'; Mdn = 4, IQR = 2). While the former was found acceptable by 85.3% of respondents, the latter was accepted by just 50.6% and cannot be considered a common alternative.

Overall, the sentences with the matrix verb arvama received higher ratings, which is in accordance with the results of corpus analysis. However, while the corpus queries returned next to no examples of complex wh-questions with the verb utlema, the participants of the acceptability judgement test found such sentences to be largely acceptable if the non-identical doubling patterns mis-kes, mis-keda, mis-mida, mida-mis or the identical doubling pattern mis-mis (in SUBJ question) were used. This indicates that utlema colloquially figures as a bridge verb, although it has a more limited bridge feature than arvama, which allows more variation in pronominal patterns.

Complex object questions, especially those with a non-human referent, also give rise to greater variation. Firstly, subject questions do not permit the nominative-partitive case alternation of the pronoun mis in the embedded clause. More importantly, regular long-distance wh-questions rather seem to be permitted if an object is extracted from the subordinate clause. The difference turned out to be more significant if the matrix verb was arvama. While inanimate object extraction was found to be acceptable by most informants, subject extraction was rather considered unacceptable. The human object extraction was only accepted by slightly more than half of the respondents but was still rated considerably higher than the subject extraction. With the verb utlema, the difference was quite subtle, and if a person was questioned, object extraction was considered rather unacceptable, same as subject extraction.
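The subject-object asymmetry in regular long-distance extraction can be made concrete by comparing the shares of accepting and rejecting informants for the non-doubling patterns. The Python sketch below uses the response counts listed in Tables 4 and 6 for the matrix verb arvama. The helper name `shares` and the "net acceptance" difference are illustrative assumptions, not measures used in the study.

```python
def shares(counts):
    """Return (acceptable %, unacceptable %) for counts ordered from rating 5 to 1."""
    total = sum(counts)
    acceptable = 100 * (counts[0] + counts[1]) / total
    unacceptable = 100 * (counts[3] + counts[4]) / total
    return round(acceptable, 1), round(unacceptable, 1)

# Non-doubling patterns with arvama; counts ordered as ratings 5, 4, 3, 2, 1.
non_doubling = {
    ("human", "SUBJ", "kes-et"): [12, 24, 16, 19, 18],
    ("human", "OBJ", "keda-et"): [22, 25, 16, 17, 9],
    ("non-human", "SUBJ", "mis-et"): [15, 16, 24, 24, 10],
    ("non-human", "OBJ", "mis-et"): [44, 21, 13, 8, 3],
}

for (referent, role, pattern), counts in non_doubling.items():
    acc, rej = shares(counts)
    # Net acceptance: positive when more informants accept than reject.
    print(f"{referent:9} {role:4} {pattern:8} "
          f"acc={acc:5.1f}% rej={rej:5.1f}% net={acc - rej:+.1f}")
```

Under these counts, both object extraction patterns come out with a clearly positive net acceptance, whereas both subject extraction patterns come out slightly negative. Because the shares are computed from raw counts, they can differ by about 0.1 percentage points from figures obtained by adding rounded percentages.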

Thus, long-distance wh-movement appears to be limited in Estonian, occurring mainly with a non-human object and depending on the bridge verb. The only construction that can account for all accepted patterns of pronominal doubling observed in this study is partial wh-movement. That also includes the seemingly identical patterns mis-mis and mida-mida. In fact, analyzing them as wh-copying would contradict the crosslinguistic observation that in most languages partial wh-movement is not equally likely to co-occur with long-distance wh-movement or its structural variant wh-copying. Namely, it would entail that in case of a non-human referent wh-copying would somehow be as acceptable as partial wh-movement and more acceptable than ordinary long-distance wh-extraction. Taking into account that non-identical doubling representing partial wh-movement was considered highly acceptable by native Estonian speakers both with a human and a non-human referent, the inanimate identical doubling patterns should also be analyzed as partial wh-movement. Adopting the IDA for Estonian would mean that, despite their similar form, the pronouns introducing the matrix and the subordinate clause are not part of the same movement chain and hence not coreferential.

The conclusion that partial wh-movement is the overall preferred construction in Estonian for forming complex wh-questions makes it possible to place Estonian among languages as diverse as Hungarian, Russian, Serbian-Croatian, Czech, Romani, Polish, Albanian, Frisian, but also Hindi, Bangla, Kashmiri, Marathi, Iraqi Arabic, Warlpiri, and Passamaquoddy (Fanselow 2006 : 442-443). Stepanov and Stateva (2006) claim that long-distance wh-movement and partial wh-movement share the same derivational history, and that the difference between languages with and without partial wh-movement lies in the lexical matter, namely, in whether the scope marker, i.e. the what-phrase, is overt or silent in that particular language. A mixed example (24) found in the etTenTen web corpus, where both the complementizer and the interrogative pronoun emerge in the SpecCP of the subordinate clause, seems to support the assumption that the what-element is there even if no doubling is employed. Apparently, it may get spelled out even if not necessary, preventing the lower pronoun from moving out of the subordinate clause.
(24) Mis   te  arvate,  et    mis   on 14aastase lapse jaoks koige pidurdavam tegur?
     what               that  what
     'What do you think is the most hindering factor for a 14-year-old child?'


In order to obtain a more precise understanding of pronominal doubling and the limited nature of long-distance wh-extraction in Estonian, future research needs to analyze the usage and perception of complex wh-questions that (a) question a total object (predominantly marked by the genitive case) or an adverbial, (b) contain other bridge verbs, or (c) express negation. It has been observed (by, e.g., Rizzi 1992) that negated predicates are mostly unavailable in partial wh-movement, so a regular non-doubling construction may be preferred in this case. In Hungarian, however, partial wh-movement is compatible with negation, although this only applies to certain verbs, such as admit, reveal, deny, notice, permit (Fanselow 2006 : 470-471). Investigating relative clauses which have an extra embedding, like example (25) taken from the etTenTen corpus, can also provide valuable information on pronoun-doubling in Estonian.
(25) Tihti  kusitakse  sult,  mis   sa  arvad,  mis   sa viie aasta parast teed
                              what              what
     'You are often asked what you think you will be doing in five years' time'


More crosslinguistic data is needed to draw comparisons between and conclusions about complex wh-questions in the Finno-Ugric and other Uralic languages. This study indicates Estonian to be similar to some dialectal varieties of Hungarian where partial wh-movement is preferred over regular long-distance wh-movement, which is even rejected by some native speakers. As argued, IDA, which has been proposed for Hungarian, should also account for Estonian partial wh-movement. Unlike in Hungarian, the complementizer usually does not co-occur with the preposed wh-phrase in an Estonian subordinate CP.

Like Finnish, Estonian rather seems to allow long-distance extraction of objects. Some Finnish examples randomly found online (e.g. Mita luulet, kuka saa kaveripiiristasi ensimmaisena lapsen? 'Who do you think will be the first from your friends to have a baby?'; Mita luulet, kenet Saimi ottaisi matkalle Suomen lapi Helsingista Inariin? 'Who do you think Saimi will take along for a trip through Finland from Helsinki to Inari?') suggest that partial wh-movement also occurs there to some extent. Since Finnish is closely related to Estonian, it would be interesting to compare the different pronominal patterns used in long wh-questions.

Conclusion

The present study provides a first insight into the use of wh-pronouns in complex subject and object questions of Estonian. The research was motivated by an ostensible animacy-related asymmetry between the distribution of identical and non-identical pronoun-doubling in Estonian, which was also confirmed by corpus queries and the acceptability judgement test conducted among native speakers. The results suggest that a similar syntactic structure underlies non-identical doubling and identical doubling with an inanimate referent, since the patterns occur with a similar frequency and are perceived as equally acceptable. Although identical doubling has generally been analyzed as wh-copying, i.e. a structural variant of long-distance wh-movement where the intermediate link of the movement chain gets spelled out, it is suggested that the Estonian inanimate doubling patterns mis-mis and mida-mida should be seen as cases of partial wh-movement, which is usually associated with non-identical doubling only.

All in all, partial wh-movement appears to be the construction employed in most cases when a complex wh-question is formed in Estonian. I further argue that each clause contains an independent wh-movement chain and the higher pronoun is coindexed with the entire embedded clause, thereby adopting the indirect dependency approach. In case of a human referent, identical doubling was accepted only to a minor extent by the tested informants, leading to the conclusion that wh-copying is not intrinsic to Estonian. Regular long-distance wh-movement does not seem to be freely permitted either, receiving higher ratings with the matrix verb arvama 'think' and in case of object extraction.

In line with the crosslinguistic observation that the verb equivalent to 'think' tends to be the most common bridge verb, the verb arvama proved to be generally more accepted and to entail a greater extent of syntactic variation than utlema 'say'. The overall preference for partial wh-movement and the restricted use of long-distance wh-movement in Estonian match the tendency for these constructions to be in complementary distribution.

Acknowledgements

I am grateful to Professors Sjef Barbiers and Craig Thiersch, who were teaching the Micro- and Macrovariation course during my exchange period at Utrecht University in 2014, both for inspiring me to tackle this topic and for helpful advice regarding the pilot study. I would also like to thank Heete Sahkai from the Institute of the Estonian Language and Pille Eslon from Tallinn University for advising me throughout the research.

KAIS ALLKIVI (Tallinn)

Address

Kais Allkivi

kais.allkivi@gmail.com

Abbreviations

ACC--accusative; ALL--allative; comp--complementizer; CP--complementizer phrase; DAT--dative; DDA--direct dependency approach; IDA--indirect dependency approach; IQR--interquartile range; Mdn--median; NOM--nominative; NP--noun phrase; OBJ--object; PART--partitive; PL--plural; 2PL--2nd person plural; 3PL--3rd person plural; POL--polite register; PP--past participle; PST--past tense; SG--singular; 2SG--2nd person singular; 3SG--3rd person singular; SUBJ--subject.

REFERENCES

Allen, I.E., Seaman, C.A. 2007, Likert Scales and Data Analysis.--Quality Progress. The Official Publication of American Society for Quality. http://asq.org/quality-progress/2007/07/statistics/likert-scales-and-data-analyses.html.

Barbiers, S., Koeneman, O., Lekakou, M. 2008, Syntactic Doubling and the Structure of Chains.--Proceedings of the 26th West Coast Conference on Formal Linguistics, Somerville, MA, 77-86. http://www.lingref.com/cpp/wccfl/26/paper1658.pdf.

Blalock, H. M. Jr. 1979, Social Statistics, New York.

Boef, E. 2013, Doubling in Relative Clauses. Aspects of Morphosyntactic Microvariation in Dutch, Utrecht. https://dspace.library.uu.nl/handle/1874/261909.

Cheng, L. 2000, Moving Just the Feature.--Wh-Scope Marking, Amsterdam, 77-100.

Chomsky, N. 1995, The Minimalist Program, Cambridge, MA.

-- 2000, Minimalist Inquiries. The Framework.--Step by Step. Essays on Minimalist Syntax in Honor of Howard Lasnik, Cambridge, MA, 89-155.

Davison, A.L. 1984, Syntactic Constraints on Wh-in-situ. Wh-Questions in Hindi-Urdu. Paper Presented at the Annual Meeting of the Linguistic Society of America.

Dayal, V.S. 1994, Scope Marking as Indirect Wh-Dependency.--Natural Language Semantics 2, 137-170.

Erelt, M., Erelt, T., Ross, K. 2007, Eesti keele kasiraamat, Tallinn.

Fanselow, G. 2006, Partial Wh-movement.--The Blackwell Companion To Syntax, Malden, MA, 437-492.

Fanselow, G., Mahajan, A. 2000, Towards a Minimalist Theory of Wh-Expletives, Wh-Copying, and Successive Cyclicity.--Wh-Scope Marking, 195-230, Amsterdam.

Featherston, S. 2004, Bridge Verbs and V2 Verbs. The Same Thing in Spades? --Zeitschrift fur Sprachwissenschaft 23 (2), 181-209.

Felser, C. 2001, Wh-Expletives and Secondary Predication. German Partial Wh-Movement Reconsidered.--Journal of Germanic Linguistics 13, 5-38.

-- 2004, Wh-Copying, Phases, and Successive Cyclicity.--Lingua 114 (5), 543-574.

Horvath, J. 1997, The Status of "Wh-Expletives" and the Partial Wh-Movement Construction of Hungarian.--Natural Language and Linguistic Theory 15, 509-572.

-- 2000, On the Syntax of "Wh-Scope Marker" Constructions.--Wh-Scope Marking, Amsterdam, 271-316.

-- 2007, Separating "Focus Movement" from Focus.--Phrasal and Clausal Architecture. Syntactic Derivation and Interpretation, Amsterdam-Philadelphia, 108-145.

Huhmarniemi, S. 2012, Finnish A'-Movement. Edges and Islands, Helsinki (Studies in Cognitive Science 2: 2012).

Jamieson, S. 2004, Likert Scales. How to (Ab)use them.--Medical Education 38 (12), 1217-1218.

Kaalep, H-J., Muischnek, K. 2002, Eesti kirjakeele sagedussonastik, Tartu. http://www.cl.ut.ee/ressursid/sagedused/index.php?lang=en.

Klepp, M. 2001, Partial Wh-Movement in German, Dublin.

Kostoulas, A. 2014, How to Interpret Ordinal Data. https://achilleaskostoulas.com/2014/02/23/how-to-interpret-ordinal-data/.

Maracz, L. K. 1991, Asymmetries in Hungarian, San Sebastian.

Lubanska, M. 2004, Wh-Scope Marking in Polish.--Poznan Studies in Contemporary Linguistics 39, 73-88.

McDaniel, D. 1989, Partial and Multiple Wh-Movement.--Natural Language and Linguistic Theory 7, 565-604. http://ling.umd.edu/~staceyc/McDaniel.pdf.

Metslang, H. 1981, Kusilause eesti keeles, Tallinn.

Mus, N. 2015, Interrogative Words and Content Questions in Tundra Nenets, Szeged. http://doktori.bibl.u-szeged.hu/2764/1/Content_questions_in_TN_Nikolett_Mus_2015_06.pdf

Pankau, A. 2013, Replacing Copies. The Syntax of Wh-Copying in German, Utrecht. https://dspace.library.uu.nl/handle/1874/288519

Rett, J. 2006, Pronominal vs. Determiner Wh-Words. Evidence from the Copy Construction.--Empirical Issues in Syntax and Semantics 6, 355-374. http://www.cssp.cnrs.fr/eiss6/rett-eiss6.pdf

Riemsdijk, H. C. van 1982, Correspondence Effects and the Empty Category Principle, Tilburg (Tilburg Papers in Language and Literature 12).

Rizzi, L. 1992, Argument/Adjunct (A)symmetries.--Proceedings of North East Linguistic Society 22, 365-381.

Sabel, J. 1998, Principles and Parameters of Wh-Movement. Habilitationsschrift, Frankfurt/Main.

Salminen, T. 2012, Tundra Nenets. http://www.helsinki.fi/~tasalmin/sketch.html.

Schippers, A. 2010a, Partial Wh-Movement and Wh-Copying in Dutch. Evidence for an Indirect Dependency Approach.--Proceedings of the Thirty Sixth Annual Meeting of the Berkeley Linguistics Society. February 6-7, 2010, Berkeley, CA, 338-352.

-- 2010b, On the (Un)availability of Long-Distance Movement.--Movement and Clitics. Adult and Child Grammar, Newcastle upon Tyne, 39-62.

Schoorlemmer, E. 2009, Agreement, Dominance and Doubling. The morphosyntax of DP, Utrecht. https://openaccess.leidenuniv.nl/bitstream/handle/1887/13952/thesis_schoorlemmer_finaal.pdf?sequence=1.

Stepanov, A., Stateva, P. 2006, Successive Cyclicity as Residual Wh-Scope Marking.--Lingua 116, 2107-2153.

The Uralic Languages, London--New York 1998.

Toft, Z. 2001, Is There Ever Multiple Wh-Movement? Evidence from Superiority Effects and Focus in Hungarian.--Durham Working Papers in Linguistics 7, 126-144.

Vainikka, A. 1989, Defining Syntactic Representations in Finnish. PhD Dissertation, Amherst.

Verhagen, A. 2006, On Subjectivity and Long-Distance Wh-Movement.--Subjectification. Various Paths to Subjectivity, Berlin--New York, 323-346. http://www.academia.edu/8847362/On_subjectivity_and_long_distance_Wh-movement_.

[phrase omitted] 1970.

[phrase omitted] M. A. 2005, [phrase omitted].

(1) The partitive mida may demand a more specified answer, like in case of the question phrase mida teha 'what to do' that requires an answer in the da-infinitive form equivalent to to do in English. E.g. to the question Mida sul on plaanis teha? 'What are you planning to do?' one has to reply Lage varvida 'To paint the ceiling', whereas one should not reply using an NP, such as Lae varvimine 'The painting of the ceiling' or Too 'Work'. The question Mis sul on plaanis teha?, however, allows both types of answers (Metslang 1981 : 71).

https://dx.doi.org/10.3176/lu.2018.2.01

Table 1 Nominal referents of the interrogative pronouns kes, mis, keda, mida

nominal referent                                   kes 'who'  keda 'who-PART'  mis 'what'  mida 'what-PART'

[SG, HUMAN, SUBJ] e.g. poiss 'boy-NOM'                 +             -             -              -
[SG, HUMAN, OBJ] e.g. poissi 'boy-PART'                -             +             -              -
[SG, NON-HUMAN, SUBJ] e.g. raamat 'book-NOM'           -             -             +              -
[SG, NON-HUMAN, OBJ] e.g. raamatut 'book-PART'         -             -             +              +
[PL, HUMAN, SUBJ] e.g. poisid 'boys-NOM'               +             -             -              -
[PL, HUMAN, OBJ] e.g. poisse 'boys-PART'               -             +             -              -
[PL, NON-HUMAN, SUBJ] e.g. raamatud 'books-NOM'        -             -             +              -
[PL, NON-HUMAN, OBJ] e.g. raamatuid 'books-PART'       -             -             +              +

Table 2 Possible pronominal patterns in long wh-questions

[HUMAN] wh-questions

SUBJ question  non-doubling            kes + comp et
               non-identical doubling  mis + kes
                                       mida + kes
               identical doubling      kes + kes
OBJ question   non-doubling            keda + comp et
               non-identical doubling  mis + keda
                                       mida + keda
               identical doubling      keda + keda

[NON-HUMAN] wh-questions

SUBJ question  non-doubling            mis + comp et
               non-identical doubling  mida + mis
               identical doubling      mis + mis
OBJ question   non-doubling            mis + comp et
                                       mida + comp et
               non-identical doubling  mis + mida
                                       mida + mis
               identical doubling      mis + mis
                                       mida + mida

Table 3 Results of the corpus query: arvama 'think' (etTenTen web corpus)

Type of pronominal pattern    Frequency  Percentage

HUMAN
Non-identical doubling            47       17.3%
  SUBJ question  mis-kes          41       15.1%
  OBJ question   mis-keda          6        2.2%
NON-HUMAN
Non-identical doubling            64       23.6%
  SUBJ question  mida-mis         10        3.7%
  OBJ question   mis-mida         54       19.9%
Identical doubling               151       55.7%
  SUBJ question  mis-mis         132       48.7%
  OBJ question   mis-mis          12        4.4%
                 mida-mida         7        2.6%
Non-doubling                       9        3.3%
  SUBJ question  mis-et            4        1.5%
  OBJ question   mida-et           5        1.8%

Table 4 Acceptability judgements: arvama 'think', HUMAN

Ratings: 5--perfectly acceptable; 4--rather acceptable; 3--neutral; 2--rather unacceptable; 1--absolutely unacceptable

Type of pronominal pattern      5            4            3            2            1       Median  IQR

Non-identical doubling
SUBJ question  mis-kes       42 (47.2%)   30 (33.7%)    8  (9.0%)    7  (7.9%)    2  (2.2%)    4     1
               mida-kes      39 (43.8%)   35 (39.3%)    7  (7.9%)    3  (3.4%)    4  (4.5%)    4     1
OBJ question   mis-keda      48 (53.9%)   23 (25.8%)    9 (10.1%)    8  (9.0%)    1  (1.1%)    5     1
               mida-keda     33 (37.1%)   32 (36.0%)   13 (14.6%)    6  (6.7%)    5  (5.6%)    4     2
Identical doubling
SUBJ question  kes-kes       15 (16.9%)   14 (15.7%)   16 (18.0%)   20 (22.5%)   24 (27.0%)    3     3
OBJ question   keda-keda      7  (7.9%)    8  (9.0%)   17 (19.1%)   33 (37.1%)   24 (27.0%)    2     2
Non-doubling
SUBJ question  kes-et        12 (13.5%)   24 (27.0%)   16 (18.0%)   19 (21.3%)   18 (20.2%)    3     2
OBJ question   keda-et       22 (24.7%)   25 (28.1%)   16 (18.0%)   17 (19.1%)    9 (10.1%)    4     2.5

Table 5 Acceptability judgements: utlema 'say', HUMAN

Ratings: 5--perfectly acceptable; 4--rather acceptable; 3--neutral; 2--rather unacceptable; 1--absolutely unacceptable

Type of pronominal pattern      5            4            3            2            1       Median  IQR

Non-identical doubling
SUBJ question  mis-kes       45 (50.6%)   28 (31.5%)   10 (11.2%)    5  (5.6%)    1  (1.1%)    5     1
               mida-kes      33 (37.1%)   15 (16.9%)   22 (24.7%)   12 (13.5%)    7  (7.9%)    4     2
OBJ question   mis-keda      35 (39.3%)   23 (25.8%)   16 (18.0%)   10 (11.2%)    4  (4.5%)    4     2
               mida-keda     12 (13.5%)   28 (31.5%)   19 (21.3%)   22 (24.7%)    8  (9.0%)    3     2
Identical doubling
SUBJ question  kes-kes        2  (2.2%)    5  (5.6%)   11 (12.4%)   37 (41.6%)   34 (38.2%)    2     1
OBJ question   keda-keda      6  (6.7%)    7  (7.9%)   12 (13.5%)   26 (29.2%)   38 (42.7%)    2     2
Non-doubling
SUBJ question  kes-et        10 (11.2%)   18 (20.2%)   18 (20.2%)   28 (31.5%)   15 (16.9%)    3     2
OBJ question   keda-et       14 (15.7%)   18 (20.2%)   22 (24.7%)   20 (22.5%)   15 (16.9%)    3     2

Table 6 Acceptability judgements: arvama 'think', NON-HUMAN

Ratings: 5--perfectly acceptable; 4--rather acceptable; 3--neutral; 2--rather unacceptable; 1--absolutely unacceptable

Type of pronominal pattern      5            4            3            2            1       Median  IQR

Non-identical doubling
SUBJ question  mida-mis      28 (31.5%)   30 (33.7%)   16 (18.0%)   10 (11.2%)    5  (5.6%)    4     2
OBJ question   mis-mida      64 (71.9%)   19 (21.3%)    4  (4.5%)    1  (1.1%)    1  (1.1%)    5     1
               mida-mis      44 (49.4%)   26 (29.2%)   13 (14.6%)    4  (4.5%)    2  (2.2%)    4     1
Identical doubling
SUBJ question  mis-mis       67 (75.3%)   13 (14.6%)    7  (7.9%)    1  (1.1%)    1  (1.1%)    5     0.5
OBJ question   mis-mis       61 (68.5%)   14 (15.7%)    8  (9.0%)    5  (5.6%)    1  (1.1%)    5     1
               mida-mida     28 (31.5%)   29 (32.6%)   22 (24.7%)    6  (6.7%)    4  (4.5%)    4     2
Non-doubling
SUBJ question  mis-et        15 (16.9%)   16 (18.0%)   24 (27.0%)   24 (27.0%)   10 (11.2%)    3     2
OBJ question   mis-et        44 (49.4%)   21 (23.6%)   13 (14.6%)    8  (9.0%)    3  (3.4%)    4     2
               mida-et       22 (24.7%)   25 (28.1%)   16 (18.0%)   17 (19.1%)    9 (10.1%)    4     2.5

Table 7 Acceptability judgements: utlema 'say', NON-HUMAN

Ratings: 5--perfectly acceptable; 4--rather acceptable; 3--neutral; 2--rather unacceptable; 1--absolutely unacceptable

Type of pronominal pattern      5            4            3            2            1       Median  IQR

Non-identical doubling
SUBJ question  mida-mis      37 (41.6%)   22 (24.7%)   20 (22.5%)    8  (9.0%)    2  (2.2%)    4     2
OBJ question   mis-mida      34 (38.2%)   29 (32.6%)   14 (15.7%)    8  (9.0%)    4  (4.5%)    4     2
               mida-mis      26 (29.2%)   33 (37.1%)   14 (15.7%)   14 (15.7%)    2  (2.2%)    4     2
Identical doubling
SUBJ question  mis-mis       29 (32.6%)   30 (33.7%)   15 (16.9%)    9 (10.1%)    6  (6.7%)    4     2
OBJ question   mis-mis       19 (21.3%)   28 (31.5%)   21 (21.3%)   14 (15.7%)    7  (7.9%)    4     1
               mida-mida     20 (22.5%)   20 (22.5%)   19 (24.7%)   21 (23.6%)    8  (9.0%)    3     1
Non-doubling
SUBJ question  mis-et        22 (24.7%)   22 (24.7%)   19 (21.3%)   19 (21.3%)    7  (7.9%)    3     2.5
OBJ question   mis-et        27 (30.3%)   21 (23.6%)   14 (15.7%)   20 (22.5%)    7  (7.9%)    4     3
               mida-et       27 (30.3%)   22 (24.7%)   21 (23.6%)   13 (14.6%)    6  (6.7%)    4     2


[Please note: Some non-Latin characters were omitted from this article]
COPYRIGHT 2018 Estonian Academy Publishers

Article Details
Author:Allkivi, Kais
Publication:Linguistica Uralica
Date:Jun 1, 2018