Auditory, visual, and auditory-visual identification of emotions by nursery school children.

Perception of nonverbal information is essential for communication and social interaction because the full meaning of any communicative message is conveyed not only by its verbal content but also by nonverbal cues. The nonverbal part of the message relates to the speaker's mood, emotional state, and attitude toward the message, with components including facial expression, tone of voice, and rate of speech. Ability to perceive a message's nonverbal components is essential for appropriate, successful interpretation as well as for continuation of the social interaction (Denham, Zoller, & Couchoud, 1994; Dunn, 1995; Feldman, Philippot, & Custrini, 1991).

Nonverbal information on the emotional state of the speaker comprises both visual and auditory cues. Although people may differ in their sensitivity to emotion perception, research has demonstrated that accurate judgments of emotions and feelings can be made from wordless vocal messages (Hammermeister & Timms, 1989; Siegman, 1987; Wallbott & Scherer, 1986). The clearest, most consistent auditory factors in signaling a speaker's emotional state were found to be the mean value of the fundamental frequency (F0), its range, and its rate of change. Duration of production and changes in voice intensity were also described as important parameters (Johnstone & Scherer, 2000; Scherer, 1982, 1986; Siegman, 1987). Sad utterances, for example, showed only small and slow changes in F0, leading to a relatively flat contour shape along a long utterance. In contrast, the contour shape for angry utterances showed high F0, with a considerably greater F0 range compared to that of neutral utterances. Angry utterances also demonstrated higher intensity and a faster rate of speech (Johnstone & Scherer, 2000; Scherer, 1986; Williams & Stevens, 1972).
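To make these acoustic parameters concrete, here is a minimal sketch of how mean F0, F0 range, and an approximation of the rate of change could be computed from a contour. The contour values are invented for illustration and are not taken from any cited study:

```python
# Hypothetical F0 contour (Hz) sampled along an utterance; invented values.
f0 = [210, 215, 222, 240, 255, 248, 230, 218]

mean_f0 = sum(f0) / len(f0)          # mean fundamental frequency
f0_range = max(f0) - min(f0)         # range of F0 across the utterance
# Mean absolute frame-to-frame change approximates the "rate of change" cue.
rate_of_change = sum(abs(b - a) for a, b in zip(f0, f0[1:])) / (len(f0) - 1)

print(f"mean F0 = {mean_f0:.1f} Hz, range = {f0_range} Hz, "
      f"rate of change = {rate_of_change:.1f} Hz/frame")
```

On such measures, a "sad" contour would show a small range and low rate of change, whereas an "angry" contour would show a high mean and a wide range.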

Visual transmission of emotions is considered to be highly efficient. Visual cues regarding emotions are clear, and observers can reliably identify different emotions based on them (Fridlund, Ekman, & Oster, 1987; Hess, Kappas, & Scherer, 1988; Most, Weisel, & Zaychik, 1993; Wallbott & Scherer, 1986). Researchers have suggested that anger, disgust, happiness, fear, sadness, and surprise are each expressed visually in a universally distinct way (Ekman, 1982). For example, fear is expressed by raised eyebrows, wide-open eyes, and open mouth, whereas sadness is expressed by knitted eyebrows, narrowed eyes, and downward-drawn mouth corners (Camras, Malatesta, & Izard, 1991).

Research on adults' identification of emotions when cues were presented either auditorily or visually indicated that the visual modality always yielded superior identification compared to the auditory modality. When the stimulus was presented via the combined auditory-visual modality (AV), its identification surpassed the auditory modality alone but did not always significantly surpass the visual modality alone (Fridlund et al., 1987; Gross & Ballif, 1991; Hess et al., 1988; Wallbott & Scherer, 1986). Most and colleagues investigated the identification of emotions via the different modalities in adolescents with and without hearing impairments (Most et al., 1993) and in adolescents with and without learning disabilities (Most & Greenbank, 2000). They reported that adolescents in all of these groups perceived emotions least well through the auditory channel alone. Adolescents with normal hearing and those without learning disabilities also showed better perception of cues through the combined AV modality than through the visual modality alone. Likewise, adolescents with learning disabilities identified emotions better through the combined AV modality than the visual modality alone, although they made poorer use of the auditory cues provided in this modality when compared to their peers without learning disabilities. The adolescents with hearing loss, however, did not demonstrate the advantage of the combined modality in comparison to the visual modality in their identification of emotions.

The perception of nonverbal auditory and visual information that is essential for social interactions among adults may be even more important for interactions between adults and young children. During the early stages of language development, young children often lack the vocabulary or grammatical knowledge necessary for correct interpretation of communicative messages. It has been suggested that in such instances young children lean on the identification of nonverbal information in order to interpret the speaker's intentions as well as the message's content. Morton and Trehub (2001), for example, reported that the ability to perceive emotions based on intonation develops and improves between the ages of 4 and 10 years, until reaching adult levels of performance on an auditory perception task.

The timeline for children's ability to detect different emotions is a highly intriguing research question that has attracted much scientific attention. Meltzoff (1996) and Meltzoff, Gopnik, and Repacholi (1999) postulated that human infants are born with an innate ability to distinguish between various facial expressions, and that this ability develops over the first two years of life (see also Flavell, 1999). Cumulative empirical evidence suggests that when babies are only a few months old, they already perceive facial expressions and respond to them (Ekman, 1982; Flavell, 1999; Russell & Fernandez-Dols, 1997). For example, it has been found that newborns are able to perceive and imitate adult facial twists and expressions (Meltzoff & Moore, 1983). Babies are particularly attracted to human eyes, and from the age of a few weeks they are able to follow the eye gaze of another person (Brooks & Meltzoff, 2002).

Repacholi (1998; Repacholi & Gopnik, 1997), who studied young infants' and toddlers' "social referencing" regarding adults' emotional expressions, argued that from the age of 15 months onward toddlers begin to respond to the nonverbal emotional reactions of adults. For example, if the experimenter shows signs of disgust when looking into a closed box, toddlers of this age will reject the object in that box, whereas if the experimenter expresses enthusiasm, toddlers will approach the same object with a positive attitude. Feldman et al. (1991) found that the capacity for recognizing facial expressions of emotions was strongly correlated with social functioning in children aged 3-5 years. The ability to perceive nonverbal emotional cues improves as children grow older, and between the ages of 3 to 5 years, children are already familiar with all the universal facial expressions conveying emotions (Etcoff & Magee, 1992; Fridlund et al., 1987; Gross & Ballif, 1991).

Analyses of verbal terms that children use in spontaneous social interactions also indicate an increase in awareness of emotions among children aged 2 to 5 years. Bartsch and Wellman (1995) reported that around age 2 years, children begin to utter their first words referring to basic emotions such as happiness, fear, sadness, and anger. The mental lexicon grows with age, and by 5 years children begin to describe emotions of others and to project their emotions onto their toys. They also use emotional terms during descriptions of past events. Gradually, terms that refer to more complex emotions such as surprise, excitement, boredom, and loneliness emerge.

In trying to assess which facial expressions are easier to perceive, Smiley and Huttenlocher (1989) presented young children with short films of an actor expressing the emotions of happiness, anger, fear, sadness, and surprise. The researchers found that at ages 2;5 to 3 years participants could best identify happiness. At ages 3 to 3;5 years, participants successfully identified fear and anger, and by the age of 3;5 years all five basic emotions were identified successfully. In another study (Camras & Allison, 1985), 4-year-old children succeeded in identifying all six emotions (happiness, sadness, anger, fear, disgust, and surprise) depicted by pictures along with a short appropriate story describing a situation that led to the particular feeling. In this study, sadness and happiness were the easiest emotions to identify. Denham (1986) claimed that positive emotions such as happiness are always easier to identify than negative emotions such as sadness.

With regard to the identification of emotions on the basis of auditory cues, Grossmann, Striano, and Friederici (2005) reported that the ability to detect emotional variations based on intonation already exists during infancy. They measured the electric brain responses of 7-month-old infants when exposed to semantically neutral words that were expressed in happy, angry, and neutral voices. The results indicated an enhanced sensory processing of the emotionally loaded stimuli - happy and angry - in comparison to the neutral voice. These findings demonstrated that, very early in development, the human brain detects emotionally loaded words and shows differential responses depending on their emotional valence. Mumme and Fernald (1996) also found that 1-year-old babies were able to identify fear based on auditory cues alone.

In all, researchers agree that young children show a growing ability to react to the emotions of adults and that the understanding of nonverbal information promotes identification of emotions and therefore fosters children's development of normal social functioning. Nevertheless, only a few experimental studies have investigated young children's identification of emotions based on different sources of nonverbal information. In particular, controlled research is lacking on very young children below age 36 months. Meltzoff and his colleagues (1999) discussed methodological difficulties in testing children who cannot yet use formal linguistic means to express their understanding of emotional cues. The present study aimed to examine the developmental trajectory of emotion identification by utilizing a well-controlled task that was previously performed with adolescents and adults. The choice of this particular age group stemmed from the gap in the literature with respect to that age range (30-52 months). Our aim was to explore how these children of nursery school age identify emotional information, by comparing their perception of emotions presented auditorily (via the speaker's vocal information), visually (by facial expressions), or simultaneously through the combined auditory and visual modality. To the best of our knowledge, very few studies have directly addressed the empirical question of how young children detect nonverbal emotional information through the two modalities in isolation and when combined.

We hypothesized that older children (4;5 [years;months]) would succeed in the task through all modalities better than younger children (2;5). On the basis of findings on adults, we further hypothesized that perception through the combined modality (AV) would surpass perception through either the visual (V) or the auditory (A) modality alone. We also anticipated that emotions presented via the auditory modality alone would be hardest to identify. Finally, in line with prior research outcomes, we hypothesized that the positive emotion of happiness would be easier to recognize than the negative emotions.



Participants were 40 nursery school children (24 boys, 16 girls) ages 30-52 months, divided into two groups according to age: 20 younger children (10 boys, 10 girls) ages 30-41 months (M = 36.2, SD = 2.97) and 20 older children (14 boys, 6 girls) ages 42-52 months (M = 48.05, SD = 2.72). All children were enrolled in the same private nursery school in central Israel. Only children who were native Hebrew speakers and who used Hebrew as their only language were selected for the study. At the recruitment phase, the records of each child were checked to ensure a typical course of early development, and children with a developmental delay were excluded from the sample. In addition, we asked the nursery school administrator (who holds a Master's degree in human development) to evaluate whether each child demonstrated typical cognitive and social behaviors, according to her observations. Only those children she reported as developing typically were included in the current sample.

All participants in the current study came from a middle-class socioeconomic background. They were all children of working mothers and fathers and therefore all of them attended the nursery school for six days a week from 7:30 a.m. to 4:00 p.m. The nursery school environment was highly stimulating for young children, and the educational program included music, art, and gymnastic classes. Most group activities were held in two age groups that coincided with our two experimental groups.


Facial Expression Pretest. This pretest was administered to verify that all the participants possessed a minimal ability to understand the four emotions targeted in the study: happiness, fear, anger, and sadness. To avoid reliance on these young preschoolers' verbal responses, an innovative nonverbal task was devised that required only pointing at images. At such a young age, not all children use verbal emotional terms with certainty; therefore, we assumed that a nonverbal task would enhance the preschoolers' motivation to participate in the experiment as well as raise the validity of their responses.

To familiarize the participants with the nonverbal task requirements, the following procedure was performed. One at a time, the experimenter showed each of four pictures depicting children in prototypical situations that evoked one of the four target emotions. Each picture was presented to the child along with a schematic pictorial representation of the target emotion. For example, while showing the picture of the "happy" situation (i.e., a child receiving a birthday present; see Figure 1), the experimenter: (a) verbally described the story line and the ensuing emotion experienced by the illustrated child (i.e., "This child has a birthday and she received a present. She is happy now"); (b) presented the schematic pictorial representation of a happy face (see Figure 2); and (c) placed that schematic happy face card on a faceless doll to introduce a playful activity into the task.



After completion of this familiarization procedure for the set of four situations, each participant was administered the facial expression pretest. Each child was asked to point to the appropriate schematic facial expression (out of the four optional "faces" spread out on the table in front of the child) when prompted by verbal requests such as "Show me a happy face." The four prompts were given in random order. Only children who passed this pretest were recruited to the study; 7 children were thus excluded from participation in the research (5 from the younger age group and 2 from the older age group).

Emotion Identification Test. Based on the Emotion Identification Test (EIT) originally created by Most et al. (1993) for adults, we created the current EIT for use with young children.

Child instrument development. The 72-item EIT developed for children presents four different emotions (happiness, anger, sadness, and fear) in three different modalities (auditory, visual, and AV), as follows. An actress acted out each of the four different emotions while stating the same simple sentence with neutral content: "I'm going and I'll be right back." Differences in emotional content were expressed only by nonverbal auditory and visual cues. The choice of these four specific emotions was based on previous studies suggesting that these emotions are all simple, and that their expression as well as understanding emerge early in childhood (Bartsch & Wellman, 1995; Dunn, 1999; Wellman, Harris, Banerjee, & Sinclair, 1995).

First, the actress was video recorded in a close-up of her face, across the full screen, as she repeated each emotion six times, thus creating a pool of 24 presentations. Next, to select the most easily recognizable production for each emotion, the entire pool was presented to 6 college students through the combined AV modality. These adult judges were asked to identify the emotion expressed in each presentation and to rate their certainty about its clarity on a scale ranging from 1 (not certain) to 5 (very certain). The "best presentation" of each emotion was selected to appear in the test (i.e., the production that elicited correct identification of the emotion by all 6 judges with the most certainty).

Next, each of these four "best presentations" was duplicated 18 times. Six repetitions of each emotion were used for the auditory presentation (hearing only without watching), 6 for the visual presentation (watching without hearing), and 6 for the combined AV presentation (hearing and watching). Thus, the final test included 72 items (4 emotions X 6 repetitions X 3 modalities).

Subsequently, to verify the ease of emotion recognition, the full test was administered to a group of 20 undergraduate students (ages 25-35 years) in a single session. They were asked to identify the emotion that the actress conveyed in each presentation. All items were identified successfully in all three modalities of presentation.

Finally, to ascertain the suitability of the task and procedure for children in the age range (2;5 to 4;5 [years;months]), the test was piloted on 5 children in a different nursery school who did not participate in the current study. On the basis of this pilot, we decided to administer the test in two successive sessions within the same week, to eliminate concentration and fatigue effects.


In each of the two sessions, 36 items were presented, 12 in each of the three modalities. Using a counterbalanced sequence, in the first session children were exposed first to either the auditory or the visual presentation, then to the other, followed by the AV modality. In the second session, children were exposed to the opposite order of the single-modality presentations, followed again by the AV modality. The AV presentation always came last to prevent learning effects from full exposure to the items. The sequence of items within each modality was random. The child was seated about 1 meter from the screen, facing it at an angle of 0 to 45 degrees. For the auditory presentation, the television screen was darkened and the child listened to the test items at a normal conversational level. For the visual presentation, the child only watched the screen, without sound. For the AV presentation, the child both listened and watched.

Following presentation of each test item, the child was guided to select one of the four schematic facial expression cards familiarized in the pretest (e.g., Figure 2) - the face that best depicted the emotion expressed by the actress - and then to attach that card to the faceless doll (e.g., to select a happy face when identifying the actress's happiness). To facilitate correct choices by the child, the original four pictures used in the pretest to describe prototypical emotion-evoking situations (e.g., birthday, lion; see Figure 1) were placed adjacent to their respective schematic facial expression cards on the table in front of the child, comprising four picture-card pairs.

At the start of each modality's presentation, participants were instructed in the task through the use of a few practice items. The children's observed lack of hesitation when performing the task in all three EIT presentation modes suggested that the schematic, clear quality of the faces enhanced participant decisions in all test situations. Likewise, the children's observed enjoyment and collaboration suggested that the task's playfulness facilitated participant cooperation through to the end. The total number of correct responses in each modality was calculated, as well as number of correct responses for each of the four emotions in each modality.


Parents gave consent for their child's participation in the study following an appeal made to them by the nursery school administrator. The procedure and the importance of the study were explained to them.

The experiment took place in the nursery school. Each child was tested individually in a quiet room by one of the experimenters (D.B.). The two administration sessions with each participant took place within a one-week interval. Each session lasted about 20 minutes.

The experimenter administered the Facial Expression Pretest at the beginning of the first session; if a child failed to match the correct facial expression to any of the four emotions depicted in the pictures, they were excused from further testing and did not complete the EIT. For the 40 participants who passed the pretest and completed the EIT in the first session, the same pretest was administered again at the start of the second session, both to familiarize them with the emotions to be tested and to maintain conditions consistent with the first session.


Table 1 presents the means and standard deviations for the correct emotion identifications in the younger and older groups of children for each of the three modalities. A two-way multivariate analysis of variance with repeated measures was conducted with age and presentation modality as independent variables. Findings revealed a significant main effect of modality, F(2, 76) = 44.00, p < .001, but no significant main effect of age or significant interaction between age and modality (p > .05).
Table 1. Means and Standard Deviations for Correct Emotion
Identifications in the Two Age Groups for Each of the Three
Modalities

                 Younger (n = 20)         Older (n = 20)
                 (CA = 30 to 41 months)   (CA = 42 to 52 months)
Modality         M       SD      %        M       SD      %

Auditory         10.05   3.35    41.88    10.30   3.83    42.91
Visual           16.00   4.00    66.66    16.75   3.34    69.79
Auditory-visual  15.80   3.88    65.83    14.95   4.56    62.29

Note. Maximum score = 24 per modality (4 emotions x 6 repetitions).

Multiple t-tests with Bonferroni adjustment for dependent multiple comparisons were used to compare specific means, and only those t values with p < .05 were considered significant. Analyses revealed that, in line with our hypothesis based on previous studies of adolescents and adults, identification of emotions through the auditory modality alone was significantly lower than through the visual and the AV modalities. No significant difference emerged between the visual and the AV modalities for either age group.
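As an illustration only (this is not the authors' analysis code, and the scores below are invented placeholders rather than the study's data), the Bonferroni-adjusted pairwise comparison procedure for repeated measures can be sketched as follows:

```python
import math
from itertools import combinations

# Simulated per-child correct-identification scores (max 24 per modality);
# these values are invented placeholders, NOT the study's raw data.
conditions = {
    "auditory": [8, 12, 10, 9, 11, 14, 7, 10, 13, 9],
    "visual":   [15, 17, 16, 14, 18, 19, 13, 16, 17, 15],
    "av":       [14, 16, 15, 15, 17, 18, 14, 15, 16, 16],
}

def paired_t(x, y):
    """Paired-samples t statistic, appropriate for repeated measures."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    return mean / math.sqrt(var / n)

pairs = list(combinations(conditions, 2))
alpha = 0.05 / len(pairs)  # Bonferroni: .05 / 3 comparisons, ~.0167 each

for a, b in pairs:
    t = paired_t(conditions[a], conditions[b])
    # Each pairwise p value would then be evaluated against the adjusted alpha.
    print(f"{a} vs {b}: t = {t:.2f} (adjusted alpha = {alpha:.4f})")
```

The Bonferroni adjustment simply divides the familywise alpha of .05 by the number of comparisons, so each of the three pairwise tests is evaluated at a stricter threshold.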

Comparison of the Four Emotions. Table 2 presents the means and standard deviations for the correct emotion identifications in the younger and older groups of children for each of the four emotions in each of the three modalities. A three-way ANOVA with repeated measures was conducted with age, presentation modality, and emotion type as independent variables. Analysis revealed a significant main effect for emotion type, F(3, 114) = 52.95, p < .001, and a significant interaction between emotion type and presentation modality, F(6, 228) = 6.54, p < .001. The main effect for age and the interaction between type of emotion, modality of presentation, and age were not significant (p > .05). Thus, there were no significant differences between the two age groups.
Table 2. Means and Standard Deviations for Correct Identification
of the Four Emotions in the Two Age Groups for Each of the
Three Modalities

                                 Younger (n = 20)   Older (n = 20)   Total (N = 40)
                                 (30-41 mo.)        (42-52 mo.)
Modality         Emotion         M      SD          M      SD        M      SD

Auditory         Happiness       2.85   1.46        2.70   1.75      2.78   1.59
                 Sadness         3.05   1.85        2.90   1.65      2.98   1.73
                 Anger           2.00   1.26        2.55   1.28      2.28   1.28
                 Fear            2.15   1.73        2.15   1.27      2.15   1.49

Visual           Happiness       5.35   1.23        5.55    .76      5.45   1.01
                 Sadness         4.75   1.21        4.70   1.22      4.73   1.20
                 Anger           2.65   1.39        2.95   1.36      2.80   1.36
                 Fear            3.25   1.83        3.55   1.67      3.40   1.74

Auditory-visual  Happiness       5.15   1.23        5.10   1.07      5.13   1.14
                 Sadness         4.80   1.10        4.20   1.64      4.50   1.41
                 Anger           2.60   1.73        2.50   1.79      2.55   1.74
                 Fear            3.25   1.48        3.15   1.92      3.20   1.70

Note. Maximum score = 6 per emotion in each modality.

Figure 3 presents the mean correct identifications for the total sample (because there was no age effect) for the different emotions in each of the presentation modalities. As can be seen in the figure, a similar pattern of identification of the different emotions emerged for the visual modality and the AV modality. However, the pattern for the auditory modality was different: In this modality, the identifications of all the emotions were very poor and similar to each other.


Bonferroni-adjusted multiple t-tests were used to compare the specific means. These tests revealed that in both the visual and AV modalities, happiness and sadness were identified significantly better than fear or anger (p < .05). In the auditory modality, no significant differences emerged among the different emotions; all of them were very difficult to identify. Accordingly, in the comparison among the modalities, happiness, sadness, and fear were identified significantly better through the visual and AV modalities than through the auditory modality. No significant difference emerged in the identification of anger among the different modalities; this emotion was difficult to identify through all three.

Confusions in Emotion Identification. Table 3 presents the number and percentage of inaccurate responses given in attempting to identify the different emotions in each of the modalities, across all the participants (percentages are computed within each target emotion's total errors). The most frequent confusion(s) for each target emotion are marked with an asterisk in the table.
Table 3. Emotion confusion matrices in the different modalities
across all participants.

                  Happiness       Sadness         Anger           Fear            Total
Target            n       %       n       %       n       %       n       %       n

Auditory Modality
Happiness         -       -       31      24.80   61*     48.80   33      26.40   125
Sadness           28      22.76   -       -       40      32.52   55*     44.71   123
Anger             72*     51.42   37      26.42   -       -       31      22.14   140
Fear              54*     35.52   52*     34.21   46      30.26   -       -       152

Visual Modality
Happiness         -       -       6       26.08   11*     47.82   6       26.08   23
Sadness           8       15.68   -       -       20      39.21   23*     45.09   51
Anger             46*     36.22   34      26.77   -       -       47*     37.01   127
Fear              16      15.38   25      24.04   63*     60.57   -       -       104

Auditory-Visual Modality
Happiness         -       -       12*     34.28   11      31.42   12*     34.28   35
Sadness           10      16.66   -       -       18      30.00   32*     53.33   60
Anger             38      27.53   40      28.98   -       -       60*     43.47   138
Fear              14      12.50   43      38.39   55*     49.11   -       -       112

Note. * marks the most frequent confusion(s) for each target emotion.
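As a small illustration of how the percentages in Table 3 are derived (each cell count divided by the target emotion's total number of errors), using the auditory-modality counts from the table:

```python
# Error counts from Table 3, auditory modality: rows are the target emotion,
# keys are the emotion chosen instead (the empty diagonal cells are omitted).
confusions = {
    "happiness": {"sadness": 31, "anger": 61, "fear": 33},
    "sadness":   {"happiness": 28, "anger": 40, "fear": 55},
    "anger":     {"happiness": 72, "sadness": 37, "fear": 31},
    "fear":      {"happiness": 54, "sadness": 52, "anger": 46},
}

# Each percentage is the cell count divided by the row total, so for the
# happiness row: 61 / (31 + 61 + 33) = 48.80%.
for target, row in confusions.items():
    total = sum(row.values())
    for chosen, n in row.items():
        print(f"{target} -> {chosen}: {n} ({100 * n / total:.2f}%)")
```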

As can be seen in the table, in the auditory modality, happiness was most often misidentified as anger (48.8%), sadness as fear (44.7%), anger as happiness (51.4%), and fear as either happiness or sadness (34%-36%). In the visual modality, happiness was most often misidentified as anger (47.8%), sadness as fear (45.1%), anger as either happiness or fear (36%-37%), and fear as anger (60.6%). In the AV modality, happiness was most often misidentified as sadness and fear equally (34.3% each), sadness as fear (53.3%), anger as fear (43.5%), and fear as anger (49.1%). In general, happiness was confused most with anger, sadness with fear, anger with happiness or fear, and fear with anger.


This study examined nursery school children's attempts to identify nonverbal emotional messages presented through three different conditions: auditory alone, visual alone, and the combined AV modality.


Contrary to our hypothesis, no significant differences emerged in emotion identification performance between the two age groups (30 to 41 months and 42 to 52 months); the groups demonstrated similar performance across all three conditions. These findings support studies showing that as early as the age of 2;5 (years;months), children are capable of perceiving emotions by watching spontaneous interactions of others (Smiley & Huttenlocher, 1989).

Although these young children demonstrated some ability to identify emotions using nonverbal cues, that ability was only partial. Our results indicated that their identification performance ranged from approximately 42% (via the auditory modality) to approximately 70% (through the visual modality). That is, the ability to fully perceive feelings based solely on nonverbal messages was still incomplete among the children who participated in the study. Children under the age of 4;5 years may not have completely developed the cognitive capabilities required for such identification. It appears that young children at the ages studied here need to continue developing their emotional perceptiveness through the preschool and early school years before achieving a level that resembles that of adults under the same presentation conditions.

Previous research findings indicate that children's perception of nonverbal emotional cues improves with age, and between 3 and 5 years children already become familiar with all the universal facial expressions conveying emotions (Etcoff & Magee, 1992; Fridlund et al., 1987; Gross & Ballif, 1991). Denham (1986) reported that 4- to 5-year-olds make initial judgments about emotions by utilizing visual and auditory cues, whereas 6-year-olds already understand hidden emotions, such as a fake smile that conceals disappointment (Harris, Donnelly, Guz, & Pitt-Watson, 1986). Morton and Trehub (2001) suggested that it takes a number of years for young children to develop the ability to perceive emotional states on the basis of intonation; children reach a level equal to that of adults only at around 10 years of age.

However, we wish to point out that only children who passed our pretest and understood the pictures representing prototypical emotion-eliciting situations were included in the present sample. This inclusion criterion might have created a situation whereby all of the children who participated in the study showed an initial ability to recognize emotions. It may well be that the exclusion from the study of those 7 children (5 from the younger group and 2 from the older group) who were unable to perform the facial expression selection task contributed to the relatively high response rate in our two age groups. In future studies, inclusion of children who initially had difficulties identifying emotions could provide further data on the gradual development of this ability.

Therefore, at this point, the question remains: At what age do children fully refine their perception of emotions and how does this transpire? The partial aptitude for identifying emotions via nonverbal cues that children in the present study demonstrated regardless of their age level supports the developmental hypothesis that initial abilities in this area exist early in infancy and develop throughout the first few years following cumulative social experiences. Meltzoff and his colleagues theorized that infants are born with a mechanism for understanding emotions, which develops gradually and is affected by social experiences (Meltzoff, 1995; Meltzoff et al., 1999; Meltzoff & Brooks, 2001). This "starting-state nativism" (as opposed to "final-state nativism") refers to the innate mechanism that enables babies to interpret their social environment, which results in additional developmental changes. In other words, these innate capabilities operate both as an incentive for the baby to attend to people and as an incentive for adults to communicate with and respond to the baby (Flavell, 1999).


It has been repeatedly argued in the literature that the identification of emotions fosters social development throughout the preschool years. For that reason, information on how children utilize nonverbal cues for emotion, prior to their full command of the linguistic terms that encode those contents, is of considerable value. All participants in the current investigation were least efficient at identifying emotions via the auditory modality alone, compared to the other two modalities, but no significant differences were found between the combined modality and the visual modality. Nursery school children's better performance via the visual and the combined AV modalities supports previous findings (e.g., Fridlund et al., 1987; Most & Greenbank, 2000). Thus, visual cues like eyebrow, eye, and mouth expressions are significantly easier to identify than auditory cues such as intonation, duration, and intensity of the voice, enabling observers to identify emotions more reliably when depending solely on those facial expressions (Ekman, 1982).

It should be noted that explanations for the source of difficulties exhibited by young children when decoding auditory cues can be found in the literature. Although the ability to understand intonation cues seems to exist from infancy (Grossmann et al., 2005; Mumme & Fernald, 1996), a disparity may exist between the capacity to understand intonation for language acquisition purposes and the capacity to use it in order to identify emotions. For example, research on the features of motherese (i.e., the input language addressed to the baby) shows that mothers tend to raise the volume of their speech, introduce long pauses, and apply exaggerated patterns of intonation. It is assumed that these natural changes originate from the fact that babies are highly attentive to prosodic cues such as intonation, rhythm, and emphasis (Fernald, 1989).

As our results demonstrated, children at the age ranges that we studied found it difficult to recognize simple emotions on the basis of auditory cues alone. Morton and Trehub (2001) argued that despite the initial tendency of babies to attend to prosodic information, in early childhood children seem to attend more to the semantic content of words, whereas adults pay more attention to the manner in which words are spoken. These researchers found that 4 year olds were able to interpret the emotional prosodic characteristics of sentences presented in a foreign language (thus neutralizing the contents), although they found this difficult to do for sentences presented in their native tongue (where semantic contents could be a distracter). In contrast, 10 year olds and older performed equally well in the two experimental conditions.

Along the lines of this explanation, we might postulate that our participants found it hard to attend to the prosodic auditory cues due to their attention to the semantic meaning of the sentence. It might well be that while the children listened to the sentence "I am going and I'll be right back" they thought about the contents (e.g., "Where is she going?") and found it difficult to concurrently focus on how the sentence was uttered. It is possible that the use of Hebrew sentences containing neutral but understandable content distracted the children from the prosodic characteristics. Hence, we recommend that future studies include foreign language or nonsense words in the task stimuli, in addition to Hebrew sentences, to test the hypothesis that neutralizing contents improves awareness of an utterance's prosodic characteristics.

Regarding the other modalities, in contrast to the vast majority of previous research on adults and adolescents (e.g., Most & Greenbank, 2000; Rosenthal, Hall, DiMatteo, Rogers, & Archer, 1979), which consistently reported better emotion identification via the combined modality (where participants could both see and hear the expression of emotion), our findings for the current sample showed no significant difference between the visual and AV modalities. In other words, in the present cohort of children the addition of auditory cues to visual cues did not always improve identification accuracy.

Future studies should explore when children benefit from the additional auditory cues in emotion perception in a sample ranging from toddlers to adults. In addition, in view of the predominance of available data acquired from typical populations, we recommend that the issue of identifying emotions via nonverbal cues be studied in young children with developmental disorders such as autism, whose disabilities include difficulties both in understanding emotions and in interpreting facial expressions (Dawson, Webb, Carver, Panagiotides, & McPartland, 2004).

In summary, the findings of the present study indicated that nursery school children in the age range of 30 to 54 months attended much more to visual cues than to auditory cues when they were asked to identify emotions. The additional auditory information in the combined AV mode did not make it easier for them to identify the emotion in comparison to the visual mode alone. This finding points to the predominance of visual cues in the development of understanding emotional messages.


The hypothesis that differences might emerge in children's ability to identify different emotions was supported by the findings. Distinct differences were found in the ability to identify the four emotions (happiness, sadness, anger, and fear), with the same pattern emerging for the visual modality alone and for the combined modality. Happiness was identified significantly better than fear and anger, and, in addition, sadness was identified significantly better than fear and anger. The lack of significant differences between the various emotions when stimuli were presented via the auditory modality alone was probably related to a "floor effect." In other words, the poor performance in perception through this modality did not enable detection of any differences among the various emotions.

Previous studies likewise reported that not all emotions are perceived with the same accuracy (Camras & Allison, 1985; Denham, 1986; Fridlund et al., 1987; Gross & Ballif, 1991; Johnstone & Scherer, 2000; Most & Greenbank, 2000; Most et al., 1993; Mumme & Fernald, 1996; Sincoff & Rosenthal, 1985; Smiley & Huttenlocher, 1989). The current findings supported prior developmental evidence that the first emotions identified by children are positive emotions such as happiness and that only subsequently do children identify negative emotions such as sadness (Denham, 1986; Wellman et al., 1995). The present findings that happiness was most easily identifiable but fear was very difficult to identify are comparable to prior studies that compared perception of the various emotions via facial expressions (visual stimulus) and via the combined AV presentation (Fridlund et al., 1987; Gross & Ballif, 1991; Most & Greenbank, 2000; Most et al., 1993).

In the present study, fear and anger were less successfully recognized in comparison with happiness and sadness. This finding corresponds to studies reporting that anger is perceived relatively later, by 4 to 5 year olds, whereas children succeed in recognizing happiness and sadness earlier (Camras & Allison, 1985; Denham, 1986; Gross & Ballif, 1991). Other studies, however, have reported that happiness and anger were the easiest to identify (Fridlund et al., 1987; Most & Greenbank, 2000; Most et al., 1993; Sincoff & Rosenthal, 1985; Smiley & Huttenlocher, 1989). The inconsistencies among the studies possibly stem from diverse research methods and tasks (Sincoff & Rosenthal, 1985; Smiley & Huttenlocher, 1989).

Examination of the types of mistakes made by the children in their attempts to identify the nonverbally cued emotions revealed consistent switching within the domain of negative emotions. For example, across all of the modalities, children often named fear instead of the targeted sadness, and in the visual and AV modalities, children sometimes mistakenly thought that fear was anger. Gross and Ballif (1991) explained that such errors generally stem from confusion between similar visual characteristics in certain negative facial expressions. For example, the facial expressions depicting anger and sadness both involve knitted brows and narrowed eyes (Camras et al., 1991). Thus, the two expressions may be confused with each other, especially in view of findings attesting to children's heavy reliance on the mouth and the eyes when identifying facial expressions (Brooks & Meltzoff, 2002; Gross & Ballif, 1991). In other words, young children pay more attention to certain features than to others and therefore are more easily confused when those features are shared by multiple emotions.

Another source of confusion between similar emotions could relate to the current mode of stimulus presentation. A full-screen presentation mainly of the face, without any other nonverbal cues, provides limited emotional information in comparison with the wealth of data sources available from real social interactions, including body language and gestures. Previous studies proposed that when children are exposed to a multitude of information sources during natural interactions, they can appropriately interpret the rich nonverbal clues and thereby comprehend their friends' emotions and respond accordingly (Denham, 1986; Gross & Ballif, 1991).

The types of errors found in the present study may also be explained by very young children's tendency to blur the boundaries between emotions and to identify emotions based on features such as their extent of "arousal" and pleasure (Bullock & Russell, 1985; Etcoff & Magee, 1992). If children link emotions to their degree of arousal, they may confuse anger and fear, and even anger and happiness, which are all characterized by a high arousal level. Another explanation for replacing anger with fear, and vice versa, might be that these two emotions stem from a similar experience of the child: for example, when someone is angry at a young child, the child becomes afraid.

It should be noted that, in the auditory modality, there were bidirectional exchanges between anger and happiness. This outcome could relate to acoustic characteristics common to both of these emotions (high intonation, loud voice, and rapid rhythm), which convey a charged emotional tone and create a rich auditory stimulus (Grossmann et al., 2005; Johnstone & Scherer, 2000; Kappas, Hess, & Scherer, 1991; Scherer, 1986; Siegman, 1987).

Altogether, the present results call for an array of future studies in this area. More detailed information on how different factors affect the social and affective judgments of young children and lead to the identification of emotions is crucial for understanding both the course of typical development and the educational applications for young children who experience socioemotional difficulties.


Participation in social interaction depends on one's ability to identify and interpret the verbal content of others' utterances, but likewise relies on one's capacity to recognize and process the nonverbal portions of such messages. Perception of nonverbal cues during interaction, such as facial expressions, intonation, and rapidity of speech, adds to the emotional content and facilitates better understanding of the speaker's intentions. Children who have difficulties in identifying and understanding nonverbal cues exhibit clear-cut social difficulties; they are unable to understand others' social behaviors and emotions, do not reliably grasp others' relationships to themselves, and are unable to predict the results of their own social behavior (Grossman, 1986; Meltzoff & Brooks, 2001; Most & Greenbank, 2000).

Based on the results of the present study, professionals and parents should be aware that young children rely primarily on visual cues in the process of emotion perception. Thus, during social interactions it is important to reinforce children's attention by emphasizing facial expressions and drawing the children's attention to their own facial mimicry. We further propose that parents and professionals can help young children attune their attention to nonverbal auditory cues about emotion; thus, we recommend that toddlers be increasingly exposed to auditory cues in games, music classes, and other interpersonal activities (e.g., imitating different types of intonation contours).

Future studies would do well to continue developing innovative methodologies for testing very young children's ability to identify and interpret emotional messages, in addition to the nonverbal task that was used in the present study. In view of the fact that the ability to recognize emotions begins so early in life and has such a central role in social understanding and in social development, it is important to further explore the natural course of its long-term development. Further evidence on the impact of children's awareness of nonverbal cues in various age groups and among participants with typical and atypical early development will increase the feasibility of identifying children in need and of fostering the social development of all preschool children.


Bartsch, K., & Wellman, H. M. (1995). Children talk about the mind. New York: Oxford University Press.

Brooks, R., & Meltzoff, A. N. (2002). The importance of eyes: How infants interpret adult looking behavior. Developmental Psychology, 38, 958-966.

Bullock, M., & Russell, J. A. (1985). Further evidence on preschoolers' interpretation of facial expressions. International Journal of Behavioral Development, 8, 15-38.

Camras, L., & Allison, K. (1985). Children's understanding of emotional facial expressions and verbal labels. Journal of Nonverbal Behavior, 9, 84-94.

Camras, L. A., Malatesta, C., & Izard, C. E. (1991). The development of facial expressions in infancy. In R. S. Feldman, & B. Rime (Eds.), Fundamentals of nonverbal behavior (pp. 73-105). New York: Cambridge University Press.

Dawson, G., Webb, S. J., Carver, L., Panagiotides, H., & McPartland, J. (2004). Young children with autism show atypical brain responses to fearful versus neutral facial expressions of emotion. Developmental Science, 7(3), 340-359.

Denham, S. A. (1986). Social cognition, prosocial behavior and emotion in preschoolers: Contextual validation. Child Development, 57, 194-201.

Denham, S. A., Zoller, D., & Couchhoud, E. A. (1994). Socialization of preschoolers' emotion understanding. Developmental Psychology, 30, 928-936.

Dunn, J. (1995). Children as psychologists: The later correlates of individual differences in understanding of emotions and other minds. Cognition and Emotion, 9, 187-201.

Dunn, J. (1999). Making sense of the social world: Mindreading, emotion, and relationships. In P. D. Zelazo, J. W. Astington, & D. R. Olson (Eds.), Developing theories of intention (pp. 229-242). Mahwah, NJ: Lawrence Erlbaum.

Ekman, P. (1982). Emotion in the human face (2nd ed). Cambridge, England: Cambridge University Press.

Etcoff, N. L., & Magee, J. J. (1992). Categorical perception of facial expressions. Cognition, 44, 227-240.

Feldman, R. S., Philippot, P., & Custrini, R. J. (1991). Social competence and non-verbal behavior. In R. S. Feldman, & B. Rime (Eds.), Fundamentals of nonverbal behavior (pp. 329-350). New York: Cambridge University Press.

Fernald, A. (1989). Intonation and communicative intent in mothers' speech to infant: Is the melody the message? Child Development, 60, 1497-1510.

Flavell, J. H. (1999). Cognitive development: Children's knowledge about the mind. Annual Review in Psychology, 50, 21-45.

Fridlund, A. J., Ekman, P., & Oster, H. (1987). Facial expressions of emotions. In A. W. Siegman, & S. Feldstein (Eds.), Nonverbal behavior and communication (pp. 143-224). Hillsdale, NJ: Lawrence Erlbaum.

Gross, A. L., & Ballif, B. (1991). Children's understanding of emotion from facial expressions and situations: A review. Developmental Review, 11, 368-398.

Grossman, G. (1986). Problematic children in regular classrooms. Tel Aviv: Cherikover. (Hebrew).

Grossmann, T., Striano, T., & Friederici, A. D. (2005). Infants' electric brain responses to emotional prosody. Neuroreport, 16, 1825-1828.

Hammermeister, F., & Timms, M. (1989). Nonverbal communication: Perspectives for teachers of hearing-impaired students. The Volta Review, 91(3), 133-142.

Harris, P. L., Donnelly, K., Guz, G. R., & Pitt-Watson, R. (1986). Children's understanding of the distinction between apparent and real emotion. Child Development, 57, 895-909.

Hess, U., Kappas, A. & Scherer, K. R. (1988). Multichannel communication of emotion: Synthetic signal processing. In K. R. Scherer (Ed.), Facets of emotion: Recent research (pp. 161-182). Hillsdale, NJ: Lawrence Erlbaum.

Johnstone, T., & Scherer, K. R. (2000). Vocal communication of emotion. In M. Lewis, & J. M. Haviland-Jones (Eds.), Handbook of emotions (2nd ed., pp. 220-234). New York: Guilford Press.

Kappas, A., Hess, U., & Scherer, K. R. (1991). Voice and emotion. In R. S. Feldman, & B. Rime (Eds.), Fundamentals of nonverbal behavior (pp. 200-238). New York: Cambridge University Press.

Meltzoff, A. N. (1995). Understanding the intention of others: Re-enactment of intended acts by 18-month-old children. Developmental Psychology, 31, 838-850.

Meltzoff, A. N. (1996). Understanding intentions in infancy. In A. Leslie (Chair), Children's theory of mind. Symposium conducted at the XXVI International Congress of Psychology, Montreal, Canada.

Meltzoff, A. N., & Brooks, R. (2001). "Like me" as a building block for understanding other minds: Bodily acts, attention, and intention. In B. F. Malle, L. J. Moses, & D. A. Baldwin (Eds.), Intentions and intentionality: Foundations of social cognition (pp. 171-191). Cambridge, MA: MIT Press.

Meltzoff, A. N., Gopnik, A., & Repacholi, B. M. (1999). Toddlers' understanding of intentions, desires, and emotions: Explorations of the dark ages. In P. D. Zelazo, J. W. Astington, & D. R. Olson (Eds.), Developing theories of intention: Social under-standing and self-control (pp. 17-41). Mahwah, NJ: Lawrence Erlbaum.

Meltzoff, A. N., & Moore, M. K. (1983). Newborn infants imitate adult facial gestures. Child Development, 54, 702-709.

Morton, J. B., & Trehub, S. E. (2001). Children's understanding of emotion in speech. Child Development, 72, 834-843.

Most, T., & Greenbank, A. (2000). Auditory, visual, and auditory-visual perception of emotions by adolescents with and without learning disabilities and its relation to their social skills. Learning Disabilities Research & Practice, 15, 171-178.

Most, T., Weisel, A., & Zaychik, A. (1993). Auditory, visual and auditory-visual identification of emotions by hearing and hearing impaired adolescents. British Journal of Audiology, 27, 247-253.

Mumme, D. L., & Fernald, A. (1996). Infants' responses to facial and vocal emotional signals in a social referencing paradigm. Child Development, 67, 3219-3237.

Repacholi, B. M. (1998). Infants' use of attentional cues to identify the referent of another person's emotional expression. Developmental Psychology, 34, 1017-1025.

Repacholi, B. M., & Gopnik, A. (1997). Early reasoning about desires: Evidence from 14- and 18-month olds. Developmental Psychology, 33, 12-21.

Rosenthal, R., Hall, J. A., DiMatteo, M. R., Rogers, P. L., & Archer, D. (1979). Sensitivity to nonverbal communication: The PONS test. Baltimore, MD: Johns Hopkins University Press.

Russell, J. A., & Fernandez-Dols, J. M. (1997). What does a facial expression mean? In J. A. Russell, & J. M. Fernandez-Dols (Eds.), The psychology of facial expression (pp. 3-30). Cambridge, England: Cambridge University Press.

Scherer, K. R. (1982). Methods of research on vocal communication: Paradigms and parameters. In K. R. Scherer, & P. Ekman (Eds.), Handbook of methods in nonverbal behavior research (pp. 136-198). Cambridge, England: Cambridge University Press.

Scherer, K. R. (1986). Vocal affect expression: A review and a model for future research. Psychological Bulletin, 99, 143-165.

Siegman, A. W. (1987). The telltale voice: Nonverbal messages of verbal communication. In A.W. Siegman, & S. Feldstein (Eds.), Nonverbal behavior and communication (pp. 351-434). Hillsdale, NJ: Lawrence Erlbaum.

Sincoff, J. B., & Rosenthal, R. (1985). Content-masking methods as determinants of results of studies of nonverbal communication. Journal of Nonverbal Behavior, 9, 121-129.

Smiley, P., & Huttenlocher, J. (1989). Young children's acquisition of emotion concepts. In C. Saarni, & P. L. Harris (Eds.), Children's understanding of emotion (pp. 27-49). Cambridge, England: Cambridge University Press.

Wallbott, H. G., & Scherer, K. R. (1986). Cues and channels in emotion identification. Journal of Personality and Social Psychology, 51(4), 690-699.

Wellman, H. M., Harris, P. L., Banerjee, M., & Sinclair, A. (1995). Early understanding of emotion: Evidence from natural language. Cognition and Emotion, 9, 117-149.

Williams, C., & Stevens, K. (1972). Emotions and speech: Some acoustical correlates. Journal of the Acoustical Society of America, 52(4), 1238-1250.

(1.) Please address correspondence to the first author.

The authors would like to express their appreciation to Dee B. Ankonina for her editorial contribution. Thanks are also extended to the nursery school teacher, Mrs. B. Aran, the parents and the children who participated in this study.



School of Education

Tel Aviv University

Israel 69978.

Phone: 972-3-640-8472

Fax: 972-3-640-9477





Tova Most, Dorit Bachar, and Esther Dromi, Tel Aviv University
COPYRIGHT 2012 Behavior Analyst Online

Article Details
Author:Most, Tova; Bachar, Dorit; Dromi, Esther
Publication:The Journal of Speech-Language Pathology and Applied Behavior Analysis
Article Type:Report
Date:Aug 1, 2012
