
Taste interacts with sound symbolism.

Sound symbolism holds that the vocal sounds in words have meanings in and of themselves. As early as 1929, Kohler found that adults tend to associate nonsense words containing rounded vowels (bouba) with curved shapes and words with nonrounded vowels (kiki) with spiked shapes (Kohler, 1947). This effect has been replicated consistently since that time and has also been found to occur in toddlers (Maurer, Pathman, & Mondloch, 2006). Imai, Kita, Nagumo, and Okada (2008) found that sound symbolism could facilitate verb learning in 3-year-old children. Nygaard, Cook, and Namy (2009) presented evidence that sound symbolism aids cross-lingual word learning in adults. These studies and the many others like them indicate that perceptual cues influence features of language and that naturally occurring biases, such as sound-shape associations, can facilitate language learning. These cross-modal associations could also be put to practical use in creating meaningful, memorable brand names. Additionally, product packaging could benefit by conveying harmonious sensory information about the product inside (Klink, 2000), such as taste or fragrance.

A foundation for sound symbolism and cross-modal matching comes from the study of synesthesia. Research suggests that all adults possess, in a muted form, the connections between senses seen in synesthetes (Ramachandran & Hubbard, 2001). Ramachandran and Hubbard also propose that the representations of certain lip and tongue movements in motor brain maps may be symbolically associated with certain sounds and phonemic representations. Supporting this idea is the discovery of mirror neurons, which fire when an individual views another person performing an action. These mirror neurons could be the link between sound and the motor movements of the lips and tongue. Ramachandran and Hubbard hold that these key factors, symbolic associations between sounds and shapes and between sounds and oral movements, combined to produce language. These innate associations among visual, auditory, and motor function could be an important evolutionary phenomenon constraining language development.

Evidence now exists that cross-modality matching can occur in the chemical senses, such as odor and taste, which suggests olfaction and gustation might also be constrained by innate biases. Seo and colleagues (2010) found that odors were systematically associated with certain shapes; that is, pleasant odors (e.g., vanilla, banana, violet, honey melon, and mint) were paired with rounded symbols while unpleasant odors (e.g., parmesan cheese, truffle, and pepper) were paired with angular or square symbols. Additionally, Seo and colleagues recorded olfactory event-related potentials (ERPs) to determine whether congruent symbols would modify olfactory perception. The olfactory ERPs indicated an association between abstract symbols and odors that depended on congruency. When visual and olfactory stimuli were congruent, responses to the stimuli were faster. Results also suggested that these associations occurred at early levels of processing.

Investigations have expanded on this research from olfaction to gustation. Spence and Gallace (2011) found that certain foods and beverages (sparkling water, cranberry juice, and chocolate-covered malt honeycomb candies) were more likely to be associated with angular shapes and words with nonrounded vowels (kiki and takete) while still water, Brie, and chocolate-covered caramel candies were better associated with rounded shapes and words with rounded vowels (bouba and maluma). These results clearly show that the occurrence of sound symbolism goes beyond vision to include the sense of taste. Additionally, the different associations between the two types of chocolate indicate that the sense of sound may affect perception of the food product. The authors report that the chocolate-covered malt honeycomb candies create more noise when eaten as compared to the chocolate-covered caramel candies and that this difference in sound and texture may have altered the responses of the participants.

Research also demonstrates that the effects of a single flavor alone can depend on its constituents. Ngo, Misra, and Spence (2011) explored the differences seen in associations of different types of chocolate by testing chocolates with varying cocoa content. They used commercially available products including milk chocolate with 30% cocoa content and two pieces of dark chocolate, one with a cocoa content of 70% and the other with 90% cocoa. Participants matched the milk chocolate with 30% cocoa content with the rounded shape and the words containing rounded vowels (lula and maluma), while the dark chocolates with 70% and 90% cocoa content were matched with the angular shape and words containing nonrounded vowels (tuki and takete). Ngo and colleagues suggested that the chocolate samples' bitterness is most likely the basis of these associations, as chocolates with higher cocoa content are perceived as more bitter, resulting in more sharp sound and visual connotations.

Data are lacking on whether a chemical sensation such as taste would influence the bouba/kiki effect to such a degree as to alter the expected associations between sound and shape when participants hear as well as see the text of the nonsense word. To address this, we developed the present study to investigate whether an incongruent taste would influence the manifestation of sound symbolism. We hypothesized that a smooth beverage (chocolate milk) would affect matches to a nonrounded vowel and that a tart beverage (cranberry juice) would affect matches to the rounded vowel depending on the visual stimulus that was likely in short-term memory, or what we term visual recency. This follows from previous research in which cranberry juice was matched with nonrounded vowels and spiked shapes (Spence & Gallace, 2011) whereas milk chocolate (with lower cocoa content) was matched to rounded vowels and rounded shapes (Ngo et al., 2011). Bottled water was used as a control beverage to provide participants the experience of sampling a beverage but without an associated taste. All beverages were readily available at local grocery stores, providing the opportunity to assess real-world food products.



Undergraduate students (N = 120) at the University of Alabama in Huntsville participated in this study in return for one research credit for introductory psychology classes. The average age of participants was 21 years; 92 were women and 26 were men. All APA ethical guidelines were followed and the study was approved by the IRB. In accordance with laws in the state of Alabama, volunteers under the age of 19 obtained parental consent before participating.


We used a 2 x 3 (Form: spiked or curved, by Beverage: water, cranberry juice, chocolate milk) between-subjects design. Form refers to the image (spiked or curved) that was presented on the right side of the screen. We measured ratings (1-10) of confidence that the picture matched the word and of the likelihood that someone else would choose the same picture.

We also recorded the choice of the form to determine the presence of the sound symbolism effect by deriving a match score. A match score for each subject was calculated based on selecting the rounded form for the words with rounded vowels and the angular form for the words with nonrounded vowels. Thus, a score of 1 indicated choosing in the expected direction on the trial (i.e., choosing the round shape when the word with rounded vowels was presented) and a score of 0 represented the unexpected choice (i.e., choosing the round shape when the word with non-rounded vowels was presented and vice versa). Because each participant had two trials, the upper limit of the score was 2 and the lower limit was 0.
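The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text; the names `EXPECTED_SHAPE`, `trial_score`, and `match_score` are hypothetical and do not come from the study materials.

```python
# Hypothetical names throughout; this only illustrates the scoring rule in the text.
# Each nonsense word is mapped to the shape it is expected to be matched with.
EXPECTED_SHAPE = {"bouba": "curved", "kiki": "spiked"}


def trial_score(word: str, chosen_shape: str) -> int:
    """Return 1 for the expected (sound-symbolic) choice, 0 for the unexpected one."""
    return int(EXPECTED_SHAPE[word] == chosen_shape)


def match_score(trials: list[tuple[str, str]]) -> int:
    """Sum the trial scores; with two trials per participant the range is 0 to 2."""
    return sum(trial_score(word, shape) for word, shape in trials)


# A participant who chose the curved shape on both trials scores 1:
# the bouba trial is an expected match, the kiki trial is not.
example = match_score([("bouba", "curved"), ("kiki", "curved")])
```

Summing binary trial scores keeps the dependent measure on the 0 to 2 scale described in the text.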


PowerPoint® was used to present bouba or kiki as auditory stimuli through speakers while simultaneously presenting the graphemes of the words on the screen in Arial font for 2 s. The figures were then set to appear for 5 s as adjacent images in left-right counterbalanced order, with one rounded shape with convex curves and one spiked shape with acute angles jutting from the central area of the figure (see Figure 1). We created a rating scale to assess the confidence the participants had that the picture matched the word and, with the same scale, their expectation of the likelihood that someone else would choose the same picture (1 = highly unlikely, 10 = highly likely). A video (3 min 18 s) on study skills retrieved from was administered as a distracter task between the two trials of the sound symbolism task. Participants answered two short questions pertaining to the video.


Commercially available beverages (bottled water, cranberry juice, and low-fat chocolate milk) were given to the participants in 3 oz disposable cups of white plastic. Manipulation checks consisted of questions regarding the extent of art background; consistent association of color with words, music, or sounds; and languages spoken. These questions were used to assess the possibility that the participants had previously been exposed to the sound symbolism paradigm or had perceived cross-modal associations. Additional questions included hunger ratings and beverage preference. Demographic questions included age, sex, and major.


Participants were individually scheduled for sessions of 30 min and randomly assigned to beverage and order conditions. After providing consent, participants were seated at a computer and told that they would taste a sample of liquid and then match a name to a picture. Beverages were kept in a small refrigerator against the wall opposite the computer terminal, which ensured that all beverages were out of sight of the participant. The technician poured a sample of the assigned beverage into the 3 oz cup while facing away from the participant and then handed the cup to the participant.

After tasting the sample beverage, the first word with either rounded or nonrounded vowels was presented as a grapheme displayed on the computer screen simultaneously with auditory presentation via speakers for 2 s. The images of the curved and spiked figure then appeared simultaneously as adjacent images on the screen for 5 s in counterbalanced order between subjects. The participant selected the figure that matched the word by pointing to the image on the screen and the technician noted their choice. The participants then rated their confidence that their choice matched the word (1-10) and the likelihood that someone else would make the same selection (1-10) by circling their response on an answer sheet on the table in front of them. The participant then watched a short video on study skills as a distracter task. The procedure was repeated once more for a total of two trials with the participant tasting the same liquid followed by selecting the image that matched the other word and providing the same ratings. Demographic information and manipulation check questions were distributed and participants were debriefed and released.


The Goodness of Fit Chi Square supported the presence of the bouba/kiki effect by establishing the number of times bouba was associated with a rounded figure and kiki was associated with the spiked figure. The number of expected matches was compared to the number of unexpected matches to reveal significantly more expected matches (approximately 77% overall), χ²(1, N = 250) = 71.824, p < .001, confirming the presence of sound symbolism in general.
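As a rough check on the arithmetic, the goodness-of-fit statistic can be recomputed by hand. The sketch below assumes equal expected frequencies (125 per category) and uses counts of 192 expected and 58 unexpected matches. These counts are not reported in the text; they are inferred here as the split of N = 250 that is consistent with "approximately 77%", and they should be read as an assumption.

```python
import math


def chi_square_gof(observed, expected=None):
    """Pearson goodness-of-fit statistic; defaults to equal expected frequencies."""
    n = sum(observed)
    if expected is None:
        expected = [n / len(observed)] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Assumed counts (not reported in the article): 192 expected, 58 unexpected matches.
observed = [192, 58]
chi2 = chi_square_gof(observed)
p = p_value_df1(chi2)
print(round(chi2, 3), p < 0.001)
```

Under these assumed counts the statistic comes out at 71.824, the value given in the text, and 192/250 is 76.8%.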

A 2 x 3 (Form by Beverage) between-subjects ANOVA was performed on the match score and revealed a significant interaction for Beverage and Form, F(2, 119) = 3.34, p = .04, η² = .053. Figure 2 shows this cross-modal interference with the sound symbolism effect. As expected, sound symbolic matches decreased when the taste presented was incongruent with the visual form that was last seen because it was presented on the right (i.e., visual recency). The spiked image on the right decreased matches after tasting chocolate milk while the rounded image on the right decreased matches after tasting cranberry juice. There was no significant main effect for Beverage, F(2, 119) = 7.69, p = .19, or for Form, F(1, 119) = .01, p = .94. Note that the expected matches were approximately equal after tasting only water. The results indicated that the association between the visual stimulus and taste was more influential than the sound of the word.
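The reported effect size for the interaction can be related to its F ratio with the standard conversion for partial eta squared, F x df_effect / (F x df_effect + df_error). The text labels the value simply as eta squared, so treating it as partial eta squared is an assumption made for this sketch; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Convert an F ratio and its degrees of freedom to partial eta squared."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Interaction term as reported in the text: F(2, 119) = 3.34.
effect_size = partial_eta_squared(3.34, 2, 119)
print(round(effect_size, 3))
```

With the reported values this gives 6.68 / 125.68, which rounds to .053 and agrees with the effect size in the text.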


Confidence ratings did not reveal a main effect of Beverage, F(2, 119) = .925, p = .399; for form, F(1, 119) < 1; or interaction, F(2, 119) < 1. Likewise, there was no main effect or interactions for the likelihood ratings that others would make the same selection, F < 1, although the data suggest the convergence of the spiked figure with the cranberry juice was related to slightly higher ratings. Tables 1 and 2 present the M and SE for these data.

Only 4 participants were language majors, which makes it unlikely that the majority of participants were familiar with sound symbolism. Few participants reported an extensive background in art (i.e., major, minor, or several formal classes), indicating minimal exposure to theories in art that may include synesthetic or cross-modal instruction. The majority of participants reported that they consistently associated color with words, sounds, or music. Although this question was intended to assess perceptual phenomena, it did not distinguish them from common learned associations (such as those used in product design: red = hot, blue = cold). Almost all of the participants spoke English as their first language; differences in those who spoke other languages were not assessed. Overall, participants rated their hunger level as neutral. The beverages were given a slightly above average rating.


The data provided evidence of cross-modal interference with the bouba/kiki effect via the chemical senses. This adds further support to previous research in which cranberry juice was matched with nonrounded vowels and spiked shapes (Spence & Gallace, 2011) while milk chocolate was matched to rounded vowels and rounded shapes (Ngo et al., 2011). After tasting water, the number of expected matches was approximately equal regardless of the form on the right side of the screen. The flavorless water did not alter the bouba/kiki effect, possibly because there was no competing gustatory sensation to interfere with the expression of innate biases.

The expected matches decreased when the taste presented was incongruent with the image on the right of the screen (i.e., chocolate milk incongruent with spiked shape and cranberry juice incongruent with rounded shape). The images were presented contiguously and simultaneously on the computer screen, but would be subject to a left-right bias that would have promoted a recency effect for the form on the right side of the screen. The majority of participants were native English speakers and readers, who in previous studies have been shown to exhibit a left-right bias when scanning attentional displays (Spalek & Hammad, 2005). This indicates that participants would initially focus on the image on the left before moving on to focus on the image on the right (i.e., the last image seen). This explains why visual recency was the important factor in producing the interference.

Additionally, these results suggest that the visual form of the shape was more important in modulating the effects than the sound of the word. The recency effect potentially mediates this interference, as the last image seen is presumably still in working memory (Talmi & Goshen-Gottstein, 2006). When incongruent information in memory conflicted with the standard bouba/kiki effect, the number of expected matches was reduced, which supports the existence of cross-modal associations between vision and taste. This is not unexpected given the importance of visual appearance in food selection and in forming expectations for food products. Hurling and Shepherd (2003) demonstrated that expectations of liking based on appearance influenced the final assessment of the product during consumption. External cues, such as brand name and labeling, can also influence expectations and perceived product quality. These expectations about a product before its use can influence the post-use assessment of that product, further emphasizing the importance of creating cohesive brand names and labels. This suggests that cross-modal associations and factors that interfere with them could have applications in marketing. Maximizing the convergence of the visual cues and associated sounds with the taste of a product should promote product satisfaction because product expectations would be in accord. It is possible that cross-modality interference as demonstrated in the present study could diminish product satisfaction.

The present data added to the growing body of evidence supporting cross-modal associations in normal humans. These associations could be used to create more meaningful, memorable brand names and product packaging to convey additional harmonious sensory information about the product inside. This study differed from others reported here in that participants heard the sound of the nonsense words over speakers in addition to seeing the text of the word. The previous studies primarily presented participants with the text of the words with rounded vowels and those with non-rounded vowels on opposite ends of Likert scales. Consequently, the physical appearance of the word could have been responsible for the presence of the effect in those instances. Hearing the sound of the word in addition to seeing the text could mediate the effect. Future studies should disentangle the effect of hearing the phonemes while seeing the grapheme. Additionally, cross-modality matching or interference could potentially be influenced when an observer sees a person's mouth movements as the word is pronounced, because presumably mirror neurons would fire upon seeing a person pronounce the word. This activation would not occur when only hearing the sound of the word or seeing the text of the word. This additional kinesthetic feedback would be another avenue for exploring how innate biases promote or interfere with cross-modal matching as was demonstrated with taste in this study.


Hurling, R., & Shepherd, R. (2003). Eating with your eyes: Effect of appearance on expectations of liking. Appetite, 41, 167-174. doi:10.1016/S0195-6663(03)00058-8

Imai, M., Kita, S., Nagumo, M., & Okada, H. (2008). Sound symbolism facilitates early verb learning. Cognition, 109, 54-65. doi:10.1016/j.cognition.2008.07.015

Klink, R. R. (2000). Creating brand names with meaning: The use of sound symbolism. Marketing Letters, 11, 5-20.

Kohler, W. (1947). Gestalt psychology. New York: Liveright Publishing Corporation.

Maurer, D., Pathman, T., & Mondloch, C. J. (2006). The shape of boubas: Sound-shape correspondences in toddlers and adults. Developmental Science, 9, 316-322. doi:10.1111/j.1467-7687.2006.00495.x

Ngo, M. K., Misra, R., & Spence, C. (2011). Assessing the shapes and speech sounds that people associate with chocolate samples varying in cocoa content. Food Quality and Preference, 22, 567-572. doi:10.1016/j.foodqual.2011.03.009

Nygaard, L. C., Cook, A. E., & Namy, L. L. (2009). Sound to meaning correspondences facilitate word learning. Cognition, 112, 181-186. doi:10.1016/j.cognition.2009.04.001

Ramachandran, V. S., & Hubbard, E. M. (2001). Synaesthesia--A window into perception, thought, and language. Journal of Consciousness Studies, 8, 3-34.

Seo, H. S., Arshamian, A., Schemmer, K., Scheer, I., Sander, T., Ritter, G., & Hummel, T. (2010). Cross-modal integration between odors and abstract symbols. Neuroscience Letters, 478, 175-178. doi:10.1016/j.neulet.2010.05.011

Spalek, T. M., & Hammad, S. (2005). Inhibition of return is due to the direction of reading. Psychological Science, 16, 15-18. doi:10.1111/j.0956-7976.2005.00774.x

Spence, C., & Gallace, A. (2011). Tasting shapes and words. Food Quality and Preference, 22, 290-295. doi:10.1016/j.foodqual.2010.11.005

Talmi, D., & Goshen-Gottstein, Y. (2006). The long-term recency effect in recognition memory. Memory, 14, 424-436. doi:10.1080/09658210500426623

Cassie A. Stutts and Aurora Torres

The University of Alabama in Huntsville

Author info: Correspondence should be sent to: Dr. Aurora Torres, Department of Psychology, The University of Alabama in Huntsville, 301 Sparkman, Huntsville, AL 35899. E-mail:
TABLE 1 Mean and SE for Confidence Ratings

                   Spike Last                 Curve Last

                 M            SE           M            SE

Water           5.3          .52          5.7          .58
Choc Milk       5.4          .56          5.0          .61
Cran Juice      6.0          .56          6.0          .49

TABLE 2 Mean and SE for Likelihood Ratings

                  Spike Last            Curve Last

                M          SE         M          SE

Water          6.3        .34        6.3        .43
Choc Milk      6.2        .46        6.2        .51
Cran Juice     7.0        .33        6.4        .40
COPYRIGHT 2012 North American Journal of Psychology
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2012 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Stutts, Cassie A.; Torres, Aurora
Publication:North American Journal of Psychology
Article Type:Report
Geographic Code:1USA
Date:Mar 1, 2012