Resolution of anaphoric expressions in children and adults: evidence from eye movements.


Anaphora resolution is at present one of the central topics in psycholinguistics. Within anaphora studies, acquisition studies are of particular importance for psycholinguistic theory since anaphora resolution requires a basis in linguistic theory (Lust 1986). In other words, since the reference of an anaphora is, in general, dependent on its relation to its antecedent, the resolution of its reference requires some computation described by linguistic theory. Therefore, both theoretically and experimentally, the need for computation in anaphoric relations means that its resolution cannot be determined by the properties of the anaphora (in experimental terms, the stimulus) itself. This point is particularly important for studying anaphora in Croatian, a pro-drop language in which the pronoun is generally expected to be omitted, leaving only the null anaphora, which is rather difficult to study as a stimulus in a psycholinguistic experiment. Namely, anaphora is either expressed as a pronoun or the same information (person, number, gender) is given on the verb, as an affix; this is sometimes called incorporated anaphora (Poesio 2016). This yields a practical problem of comparing different categories as a single category in an experiment (the overt pronoun with the null anaphor and the verb marked for person and number).

Referring is considered to be one of the basic functions of language. A passenger could not warn a car driver "He will stop you if you don't slow down!" without being able to refer to the police officer at the side of the road seen by both of them. Anaphoras refer to words used earlier in a sentence or a discourse, the antecedents. In this study we tried to find what influences the choice of the antecedent for a null and overt pronominal anaphora in Croatian. Additionally, possible differences between children and adults were explored. These differences may reflect the developmental changes in the process of language acquisition.

1.1. Background

The words that qualify for being the antecedents of an anaphora are said to be accessible (Arnold 2010). There is no unique definition of accessibility, nor agreement about the cognitive processes behind it (ibid. 189), but four approaches can be discerned. The first is syntactic prominence. Simply put, the grammatical subject tends to be more accessible to the anaphora than other arguments of a clause (e.g., Arnold et al. 2000, Brennan 1995). The second approach is related to the information structure, contrasting old and new information. Following the information structure terminology this is called givenness, but in Arnold (2010) this term can be related both to linguistic and non-linguistic information, for example, the visual context (Clark and Marshall 2002). However, the non-linguistic cues evoke pragmatic criteria for givenness as well. In Clark and Marshall's article these criteria are called "mutual or shared knowledge" and include propositional attitudes and presuppositions. The third approach to accessibility is through thematic prominence (Stewart et al. 2000). Namely, the semantic role of a discourse element may determine its accessibility. This can be observed in sentences with syntactic prominence controlled for, as in: "Peter admired John because he was a great player." and "Peter impressed John because he was a great player." Finally, the fourth approach to accessibility points to recency effects (Givon 1983). Recent information is more likely to be expressed pronominally (overtly or as a null form) than information mentioned earlier in the discourse.

Psycholinguistically, it is advantageous to conceive of anaphora and its antecedent as an accessibility relation. First, it is a concept that is easy to operationalise in an experiment due to the ease of quantification (e.g., as fixation times in a reading experiment; Duffy and Rayner 1990). Quantification means that the choice of an antecedent in ambiguous sentences can be interpreted as the speakers' preference for one or another antecedent, expressed as a percentage or ratio of any measured dependent variable. Second, it is not a technical term of any of the approaches mentioned above, i.e. it does not have any special meaning in any of them. Nevertheless, two approaches to anaphora resolution are relevant for this study, syntactic prominence and givenness. Syntactic prominence (grammatical structure) was held constant, while givenness was manipulated on the level of both information structure, using the pro-drop feature of Croatian, and visual context. This is described in more detail in the next sections. Recency is more suitable to study in larger discourses where it can be manipulated (e.g. Givon 1983), while thematic prominence is usually studied in a single sentence.

Pronominal anaphora is by default expressed as the null anaphora in Croatian. In this sense Croatian is similar to the "consistent null subject languages" such as Spanish, Italian or Quechua (Camacho 2013) in which "overt pronouns are typically used to indicate change of topic or contrast" (ibid: 31). However, the exact mechanisms are still disputed and cross-linguistic differences have been reported (e.g., Kaiser and Trueswell (2008) reported differences in salience (accessibility) for various word orders in Finnish). In a computational linguistics framework Miltsakaki (2002) proposed a split mechanism, i.e. one mechanism for calculating the salience of the antecedent of an intra-sentential anaphora and a different one for inter-sentential anaphora. The first is based on thematic relations and the implicit causality of the verb, while the second is based on the information structure (for arguments against the split-mechanism view see e.g. Champollion 2006). The pre-verbal subject position has been found more accessible as an anaphora antecedent in Italian and Spanish, but only for the null anaphora. For the overt pronoun, the object NP has been found more accessible, but some differences in effects were observed between the two languages (Carminati 2002, Filiaci et al. 2014, Runner and Ibarra 2016), suggesting that position, overtness and information structure do not have the same role in Italian and Spanish after all. Word order (i.e., the order of the syntactic components) and information structure have been discussed in Croatian (Peti-Stantic 2013), but with the position of clitics in focus. Based on structural similarities between Croatian, Italian and Spanish, differences between overt and null anaphora similar to those found in Italian and Spanish can be expected in Croatian.

1.2. Anaphora resolution in children and adults

According to Lust (1987) there are two ways of studying anaphora developmentally. First, anaphora can be studied in the context of research aimed at discovering the initial state of the Universal Grammar (UG) in order to find a set of constraints that are available to the child from the start. The second way is to study real facts about children's acquisition of anaphoric expressions and involves theory-internal constraints, as well as what Lust refers to as research methodology (ibid: 11), which includes experimental work on what children pay attention to or which strategies they employ in resolving anaphora (see also Hornstein and Lightfoot 1981). Previous studies observing differences between children and adults reported that there were no differences in anaphora resolution between the two groups regarding null anaphora in Italian, but differences were found in cataphoric expressions (Serratrice 2007). A recent study in Greek (Papadopoulou et al. 2015) did report differences between children and adults in resolving overt pronouns. Greek adults preferred the non-topic antecedents, while children's strategies remained unclear, as they showed no preference. Different strategies in anaphora resolution have also been reported for relying on mutual knowledge or context in a French study by Hickmann et al. (1995), which included children of different age groups. The reliance on mutual knowledge or context decreased with age. These different patterns of results in different languages motivated the inclusion of two groups of participants in this study. It is important to note that this is not a developmental study, but a mere comparison of two groups as in the studies mentioned above.

1.3. Experimental aspects

Experimentally, anaphora resolution can be studied in two settings: intra-sentential and inter-sentential anaphora. This study is constrained to inter-sentential anaphora for two main reasons. First, theoretical syntax has been more focused on intra-sentential anaphora, defining the constraints on pronoun reference resolution, binding principles or co-indexation within a syntactic domain (for a detailed description of the phenomenon see e.g. Willer-Gold 2015). The formulation of formal principles leaves little space for psycholinguistic experimentation, apart from explaining behavioural results by these principles. Moreover, since the predictions would follow exactly from these principles, somewhat biased results, if not circular reasoning, would be difficult to avoid. Valid methodology requires designing experiments that would be unfavourable to the expected results (Crain and Thornton 1999) and it seems that this is easier to achieve with inter-sentential anaphora. Second, one of the objectives of the study was to show possible differences between school-age children and adults in relying on different cues in searching for the anaphora antecedent. These cues can be better controlled in the inter-sentential context.

The aim of the current study was to examine the relative role of information structure and visual context in anaphora resolution in school-age children and adults. While measuring eye movements in a visual world paradigm, givenness was manipulated on two different dimensions: the information structure by the overtness of the pronoun, using the pro-drop feature of Croatian, and the visual context by altering visual cues in the pictures.


2.1. Participants

Since one of the aims of the paper was to address the differences between young and adult speakers in anaphora resolution strategies, two groups of participants were included in the study. The adult group consisted of 20 students (mean age = 20;09) and the child group of 15 typically developing children in the first grade of elementary school (mean age = 7;03) (see Table 1). The choice of this age group was guided by similar research in other languages, as mentioned above. All participants had normal or corrected-to-normal vision and no history of neurological or psychiatric disease and/or language impairment. Prior to the testing, they were introduced to the procedure, but were blind to the actual research question. Adult participants signed a written consent form stating that they agreed to participate in the study, while children's participation was agreed upon in advance with the elementary school principal and the speech and language pathologist (SLP), who provided us with the documentation on the cognitive and language skills of the children. Parents and teachers were first introduced to the study design and research questions, and the consent forms were sent out afterwards. The final decision on the children's inclusion in the study was made with the SLP, since the inclusion criterion was the absence of any cognitive and/or language delays. We did not perform any additional pre-tests because the parents had agreed only to their children's participation in the experiment.

2.2. Materials and procedure

For the purpose of this study a list of sentences and their corresponding visual stimuli were developed. Each trial consisted of a discourse unit with two sentences (Figure 1). The first sentence introduced the common ground (Krifka and Musan 2012) consisting of two nouns (persons) and an action they were about to take, as represented in the accompanying picture. The two nouns were either in an associate construction, or the second one was the object of the sentence. The first noun (N1) was always in the Nominative case, while the second noun (N2) was either in the Instrumental expressing comitative meaning or Accompaniment (e.g. "father with son", Stolz et al. 2013) or in the Accusative as a Recipient (e.g. "the father called his son to...") to avoid too much repetition. N1 was always the Agent of the sentence. The second sentence introduced a small complement to the common ground (tools or clothes for the action they were about to take). The second sentence contained either an overt pronoun or a null anaphora (Figure 1). The pronoun and the verb, with the person, number and gender values expressed as an affix on the verb, were always at the beginning of the second sentence. The gender of N1 and N2 was always identical: masculine in half of the trials and feminine in the other half. Since the sentences were presented to the participants as auditory stimuli, the prosodic devices that could have given rise to a change of topic were controlled for. The sentences were pre-recorded in a controlled setting. In total, there were 6 discourse units or sentence pairs. Nine filler stimuli (sentences and their corresponding pictures) were also constructed. They were not ambiguous in any sense and were not included in the subsequent analyses.

The visual context for each sentence was given as a picture presented at the onset of each trial. The picture presented the scene with one person on each side and the object of their action in the middle, to keep the referents as far apart as possible. The sides of the referents were controlled, i.e. in half of the trials the Agent noun (N1) was on the left side, and in the other half it was on the right. This structure of the pictures allowed for a consistent definition of the Areas of Interest (AOIs) necessary for the statistical analysis. The content of the visual cues corresponded to the new information given in the second sentence. There were three options for the visual cue: it was on N1, on N2, or on both. This yields 6 experimental conditions in the audio-visual task:

1) overt pronoun with a visual cue on N1 (pro+ cue+);

2) overt pronoun with a visual cue on N2 (pro+ cue-);

3) overt pronoun with a visual cue on N1 and N2 (pro+ cue+-);

4) null anaphora with a visual cue on N1 (pro- cue+);

5) null anaphora with a visual cue on N2 (pro- cue-);

6) null anaphora with a visual cue on N1 and N2 (pro- cue+-).
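
The six conditions above are the full crossing of two factors, pronoun overtness (2 levels) and visual cue placement (3 levels). A minimal sketch of that crossing is given below; the label strings follow the paper's shorthand, while the dictionary and function names are illustrative assumptions.

```python
from itertools import product

# Factor levels as described in the text; labels follow the paper's shorthand.
PRONOUN = {"pro+": "overt pronoun", "pro-": "null anaphora"}
CUE = {"cue+": "N1", "cue-": "N2", "cue+-": "N1 and N2"}


def conditions():
    """Return (label, description) pairs in the order of the list above."""
    result = []
    for (p_label, p_desc), (c_label, c_desc) in product(PRONOUN.items(), CUE.items()):
        label = f"{p_label} {c_label}"
        result.append((label, f"{p_desc} with a visual cue on {c_desc}"))
    return result


if __name__ == "__main__":
    for i, (label, desc) in enumerate(conditions(), start=1):
        print(f"{i}) {desc} ({label})")
```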

An example of a sentence pair stimulus (discourse unit) and its corresponding picture (Figure 2) is presented below. In the pro- condition, the first sentence is the same as in the pro+ condition and is omitted for brevity.

pro+:
Dječak-Ø    je        pozva-o           djed-a
Boy-NOM.Sg  AUX.3.Sg  invite-PART.Sg.M  grandfather-ACC.Sg
'The boy invited the grandfather

da  zajedno   ber-u           šljiv-e.
to  together  pick-3.Pl.PRES  plum-ACC.Pl
to go plum picking.'

On  je        na  glav-u       stavi-o        kap-u
He  AUX.3.Sg  on  head-ACC.Sg  put-PART.Sg.M  cap-ACC.Sg
'He put a cap on his head

da  se    zaštit-i           od    sunc-a.
to  REFL  protect-3.Sg.PRES  from  sun-GEN.Sg
to protect himself from the sun.'

pro-:
Na  glav-u       je        stavi-o        kap-u
On  head-ACC.Sg  AUX.3.Sg  put-PART.Sg.M  cap-ACC.Sg
'He put a cap on his head

da  se    zaštit-i           od    sunc-a.
to  REFL  protect-3.Sg.PRES  from  sun-GEN.Sg
to protect himself from the sun.'

A pre-test containing 10 sentence pairs was administered prior to the experiment in order to observe the adults' choice of the antecedent off-line. The test was a questionnaire sent to various student mail groups at the University of Zagreb with sentences that had the same structure as the experimental stimuli, but accompanied by a simple "Who did it?" question for the second sentence. The results of the first 100 participants were analysed. None of these students participated in the eye-tracking study. The results indicated that one third of the participants shifted reference from N1 to N2 when the pronoun was overt, i.e. that the accessibility of N2 increased by 1/3 with the overtness of the pronoun.

For the purpose of the study, an SMI RED-M 120 Hz device was used. Each participant was tested individually, and the testing session lasted about 15 minutes. Prior to the experiment, participants were instructed to listen carefully to the sentences and try to figure out whether they matched the pictures. After the calibration and a short familiarisation phase, randomised stimuli started appearing on the screen. Participants listened to the pre-recorded sentences via headphones while looking at the pictures on the screen, and the device recorded their gaze direction and the duration of their fixations. The decision to include or exclude questions after target stimuli is particularly important since it has been suggested that this might affect online processing and even alter the results (Swets et al. 2008). This point was first mentioned by Yarbus (1967), who found that simply altering the instructions given to the observer, and thus their task while viewing, had a profound effect on the inspection behaviour of the observer. At the end of the session, participants were given refreshments, and the children also received thank-you notes as certificates of participation. A short description of the entire procedure is provided in Figure 3.

2.3. Data collection and analyses

The visual world paradigm (VWP) is an online experimental method applied when the connection between sentence processing and visual attention is studied (Tanenhaus et al. 1995; Van Gompel and Pickering 2007) or, as Huettig et al. (2010) put it, when the interplay between linguistic and visual information processing is in focus. Moreover, since there is evidence that eye movements are mediated by language from early childhood, the VWP is the best choice among online methods in developmental psycholinguistics (Sekerina 2014). Nevertheless, one has to be aware of the existing debates in the field regarding the assumptions on what the observed scenes and the corresponding eye movement behaviour actually indicate (Henderson and Ferreira 2004).

In some situations longer dwell time might reflect higher processing costs and an increased cognitive load (such as difficulty in extracting information during reading). On the other hand, when participants are asked to search for a certain item in the visual context, their dwell time increases up to the point of final selection, thereby indicating the level of certainty of their choice (see Holmqvist et al. 2011: 386-389). Since in this testing situation both groups of participants were searching for the correct referent, as in similar studies using preferential looking, i.e. the visual world paradigm, longer dwell time was taken as an indicator of certainty in their conscious choice. This is based on the mind-eye hypothesis (Just and Carpenter 1980), according to which listeners tend to look at what they attend to in the visual world (Sekerina 2014).

Since it has been suggested and repeatedly verified that differences in gaze direction and duration reveal different strategies of language processing (Just and Carpenter 1980), the participants' gaze direction and duration at the key moment of each trial (the fifth to sixth second from the beginning of the trial, i.e. the start of the second sentence) were observed. By measuring the total dwell time in a predefined AOI on the left or right side of the screen, one can reveal which cues a person mostly relies on while resolving anaphoric expressions. The collected data (duration of all fixations) were processed in the SMI BeGaze software and further analysed using repeated measures ANOVA.
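
The dwell-time measure described above can be illustrated with a short sketch. It is not the SMI BeGaze computation: the `Fixation` record, the pixel coordinates, the 1280-pixel screen split and the choice to clip fixations to the 5000-6000 ms window are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    start_ms: float  # fixation onset, relative to trial onset
    end_ms: float    # fixation offset, relative to trial onset
    x: float         # horizontal gaze position in pixels (hypothetical units)


def dwell_time(fixations, aoi_x_min, aoi_x_max, win_start_ms=5000.0, win_end_ms=6000.0):
    """Sum fixation time inside one AOI, clipped to the analysis window."""
    total = 0.0
    for f in fixations:
        if not (aoi_x_min <= f.x < aoi_x_max):
            continue  # fixation lies outside this AOI
        # Keep only the part of the fixation that overlaps the window.
        overlap = min(f.end_ms, win_end_ms) - max(f.start_ms, win_start_ms)
        if overlap > 0:
            total += overlap
    return total


if __name__ == "__main__":
    # Made-up fixations on a hypothetical 1280-px-wide screen.
    fixations = [
        Fixation(4800, 5300, 200),   # left AOI, partly before the window
        Fixation(5350, 5700, 1000),  # right AOI, fully inside the window
        Fixation(5750, 6200, 250),   # left AOI, partly after the window
    ]
    print("left AOI:", dwell_time(fixations, 0, 640))
    print("right AOI:", dwell_time(fixations, 640, 1280))
```

Depending on which side N1 was shown in a given trial, the left and right totals would then be relabelled as dwell time on N1 and on N2.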


3.1. Descriptive statistics

The following table summarises the average performance on the variable dwell time for both groups of participants across each condition measured in a critical time frame, for both potential referent sites (N1 and N2).

A brief look at the descriptive data in Table 2 suggests that both groups look longer at N1 when the visual cue corresponds to it (cue+). However, when the visual cue was changed (cue-), i.e. the object (tool, clothes) was placed on N2, the accessibility of N2 increased irrespective of the pro manipulation. These data imply that it was the visual cue that guided the choice of the antecedent, as indicated by the grey lines added to the table. The most interesting is the ambiguous condition, in which the cue was visually placed on both potential referent sites, i.e. on both N1 and N2 (cue+-). Here there are observably smaller differences between the dwell times on the potential referents, which is especially evident in the group of children when the pronoun is overt. It seems that N1 is more accessible for adults where the visual cue is not available, regardless of whether the pronoun is overt or null, as indicated by the dotted lines.
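
The size of the N1 advantage in the ambiguous condition can be made explicit by subtracting the mean dwell time on N2 from that on N1. The sketch below uses the cue+- means from Table 2, reading the last column of the table as the pro- cue+- condition; the dictionary layout and the function name are illustrative and not part of the original analysis.

```python
# Mean dwell times (ms) in the ambiguous condition (cue+-), copied from Table 2.
AMBIGUOUS_MEANS = {
    ("children", "pro+"): {"N1": 389.79, "N2": 370.74},
    ("adults", "pro+"): {"N1": 410.57, "N2": 252.80},
    ("children", "pro-"): {"N1": 266.99, "N2": 330.27},
    ("adults", "pro-"): {"N1": 378.32, "N2": 294.46},
}


def n1_advantage(means):
    """Return mean(N1) - mean(N2) in ms for every (group, pronoun) cell."""
    return {key: round(cell["N1"] - cell["N2"], 2) for key, cell in means.items()}


if __name__ == "__main__":
    for (group, pronoun), diff in n1_advantage(AMBIGUOUS_MEANS).items():
        print(f"{group:8s} {pronoun}: {diff:+.2f} ms")
```

Under this reading, the difference is smallest for children with an overt pronoun (about 19 ms) and largest for adults with an overt pronoun (about 158 ms), while children in the null condition look slightly longer at N2.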

3.2. Inferential statistics

A repeated measures ANOVA was performed in a 2 x 2 x 2 x 3 design with the factors group (G), overtness (O), referent (R) and visual cue (C): (G: adults vs. children as a between-group factor) x (O: overt pronoun vs. null) x (R: N1 vs. N2) x (C: on N1 vs. on N2 vs. ambiguous). Interactions and main effects are reported below.

The normality of the distributions of results was assessed using the Shapiro-Wilk test and in terms of skewness and kurtosis, where values between -2 and +2 were considered indicative of a normal univariate distribution (George and Mallery 2010). According to these parameters, all of the variables were normally or approximately normally distributed. Therefore, parametric statistical methods were used in the subsequent analyses.
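
The skewness and kurtosis screening described above can be sketched as follows. This is a minimal illustration using population (biased) moment estimators and excess kurtosis, which is an assumption: statistical packages typically report bias-corrected values, and the Shapiro-Wilk test is not reproduced here. The function names and the sample data are hypothetical.

```python
from statistics import fmean


def _central_moment(xs, k):
    """k-th central moment of xs (population form, divides by n)."""
    mean = fmean(xs)
    return sum((x - mean) ** k for x in xs) / len(xs)


def skewness(xs):
    """Population skewness m3 / m2^1.5; requires non-constant data."""
    return _central_moment(xs, 3) / _central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Population excess kurtosis m4 / m2^2 - 3 (0 for a normal distribution)."""
    return _central_moment(xs, 4) / _central_moment(xs, 2) ** 2 - 3.0


def roughly_normal(xs, limit=2.0):
    """Apply the +/-2 rule of thumb to both skewness and excess kurtosis."""
    return abs(skewness(xs)) <= limit and abs(excess_kurtosis(xs)) <= limit


if __name__ == "__main__":
    # Hypothetical dwell times in ms, not data from the study.
    sample = [310.0, 420.0, 455.0, 390.0, 505.0, 280.0, 430.0, 365.0]
    print(skewness(sample), excess_kurtosis(sample), roughly_normal(sample))
```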

Only one main effect, that of referent, was obtained (F(1, 33) = 10,12; p = 0,003; $\eta_p^2$ = 0,235), which means that all participants spent more time looking at N1 at the relevant time point in the discourse unit. Interestingly, the main effects of visual cue and overtness were not obtained. The between-group factor of group also proved not to be significant, i.e. overall differences between children and adults were not statistically significant. However, two interactions were found to be significant or marginally significant: referent x visual cue (F(2, 33) = 25,7; p < 0,001; $\eta_p^2$ = 0,602) and referent x group (F(2, 33) = 3,58; p = 0,067; $\eta_p^2$ = 0,098). This indicates that the change of visual cue affected the choice of referent, i.e. the anaphora antecedent, and that the two groups showed different tendencies in the choice of the referent. These results are represented graphically in Figure 4.
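
The effect size reported with the main effect of referent is consistent with partial eta squared, which can be recovered from an F ratio and its degrees of freedom. As a check under that assumption (the effect-size measure is not spelled out in the text), the main effect of referent gives:

```latex
\eta_p^2 = \frac{F \cdot df_1}{F \cdot df_1 + df_2}
         = \frac{10{,}12 \cdot 1}{10{,}12 \cdot 1 + 33}
         = \frac{10{,}12}{43{,}12} \approx 0{,}235
```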

The overall results suggest that both children and adults rely to a certain extent on visual cues even if they are not consistent with the information structure, i.e. the switch in topic signalled by the overt pronoun in the second sentence. Nevertheless, between-group differences are observed when the visual cue is ambiguous. In this respect, the results suggest that the adults predominantly choose N1, while the behaviour of 7-year-old children is at chance. These results are in line with the ones obtained in similar French and Greek studies (Hickmann et al. 1995; Papadopoulou et al. 2015), where children relied more on the common ground information than on linguistic cues.

The results regarding the reliance on the information structure are similar to the results found in the Italian and Spanish studies (Carminati 2002, Filiaci et al. 2014, Runner and Ibarra 2016) concerning the overall clear preference for N1 with the null anaphora. However, the overt pronoun did not yield an effect as strong as in Italian and Spanish. This difference may have arisen from the differences in the study designs. In the Italian and Spanish studies mentioned, reading times were measured, while the present study used auditory stimuli without any intonation cues. This could have decreased the accessibility of N2 across experimental conditions in our study. The results of the pre-test, indicating an increase in the accessibility of N2 by 1/3 for the overt pronoun, speak in favour of this interpretation.

3.3. Limitations and future directions

This study aimed at dissociating the impact of the information structure and context on anaphora resolution. The information structure was represented only by the overtness of the pronoun with the overt pronoun meant to indicate the switch of topic. However, future studies should also manipulate the intonation parameters, not just control for them.

Similarly, the children's data only point to the direction in which developmental differences should be sought. A more comprehensive developmental study should include more age groups. With more psycholinguistic data on anaphora resolution, developmental studies should be undertaken separately and be based upon the data known for the target language. Some limitations of this study might have arisen from the relatively small sample and its division into two groups. Including more participants could have given more statistical power to corroborate our conclusions.


The main aim of this paper was to study the impact of information structure features on anaphora resolution, exploiting the pro-drop feature of Croatian and assuming that the overt pronoun signalled the switch in topic. Additionally, differences in the strategies of anaphora resolution between children and adults were sought. To this end, overtness and the visual cues in the pictures accompanying the auditory sentence stimuli were manipulated. The results indicate that both groups relied mostly on visual cues in anaphora resolution. The switch in topic signalled by the overt pronoun did not produce the expected change. Without the visual cue available, the data and the results suggest that the adults chose the Agent (N1) as a referent in pro+- condition (see the caption of Figure 4). These results are similar to the ones obtained in comparable Italian and Spanish studies, although some differences were observed. However, they may be attributed to the differences in experimental design. The obtained differences between adults and children are similar to those in the French and Greek studies as well. The adults had a preference for N1 where no other cues were available and they switched to N2 when the pronoun was overt, while children did not show any consistent behavioural pattern. This implies that there are differences in the accessibility of N1 and N2 for the two groups in using the structural information, i.e. in making assumptions based on the information structure only.

Acknowledgments: This work was supported by the national project Adult language processing (HRZZ; UIP-11-2013-2421). We are especially grateful to the Elementary School Vukomerec (mostly to the school principal, teachers and an SLP) as well as to all the participants and their parents.


Anaphora resolution in children and adults: an analysis of eye movements

The visual world paradigm is a method in which linguistic information is presented to the participant simultaneously through the visual and the auditory channel. It relates the direction of the participant's attention to the auditorily presented linguistic information. This study examined the strategies that adult speakers of Croatian and children use to determine the antecedent of an anaphora.

The experiment consisted of two sentences and a corresponding picture. The first sentence introduced two referents and a joint action. The second gave additional information that corresponded to the content of the visual cue ("The grandson invited the grandfather to pick plums together. He put a cap on his head to protect himself from the sun.", with the cap as the visual cue). Information structure and visual context were manipulated. The change in the information structure of the sentence was based on the pro-drop feature of Croatian, i.e. the change in topicalisation was signalled by the use of the pronoun ("On je na glavu stavio kapu...", with the overt pronoun "on" 'he'). The visual cue was on one of the referents or on both, i.e. it was not available.

The results of the eye movement analysis showed, on the one hand, that in the presence of a visual cue there is no difference between the groups of participants in determining the antecedent of the anaphora: in that case both the adult participants and the children are guided by the cue. On the other hand, in the absence of a visual cue adult speakers rely more on information structure, while the children's behaviour does not reveal any particular strategy. The pro-drop feature of Croatian did not prove significant in any experimental condition.

With respect to the difference in information structure signalled by the use of the pronoun, these results can be compared with similar studies of Italian and Spanish. With respect to the differences between adults and children, the results are in line with those from studies of Greek and French.

Keywords: anaphora, pro-drop feature, eye movements, contextual information, language comprehension, Croatian

Marijan Palmovic, Ana Matic, Melita Kovacevic

Faculty of Education and Rehabilitation Sciences, University of Zagreb

Table 1. Characteristics of both groups of participants included in the study: adult group (AG) and children (CG)

Group   N    Gender      Age
             M    F      M      SD    Min    Max

AG      20   -    20     20;09  1,37  19;10  23;02
CG      15   6    9      7;03   0,54  6;11   7;11

Table 2. Descriptive statistics: average performance in dwell time (DT; in ms), M (SD), by group, referent and condition

Group (N)      Referent  pro+ cue+        pro+ cue-        pro+ cue+-       pro- cue+        pro- cue-        pro- cue+-

Children (15)  N1        413,01 (199,94)  270,47 (181,39)  389,79 (202,48)  503,23 (185,22)  297,17 (149,38)  266,99 (166,43)
Children (15)  N2        294,66 (150,56)  424,96 (191,16)  370,74 (215,97)  204,70 (113,47)  380,06 (189,88)  330,27 (160,79)
Adults (20)    N1        481,70 (169,05)  294,24 (173,29)  410,57 (168,30)  484,13 (171,39)  262,95 (149,92)  378,32 (191,71)
Adults (20)    N2        258,65 (198,21)  339,58 (169,11)  252,80 (121,12)  219,21 (160,28)  414,47 (171,98)  294,46 (167,22)
