Apprehending the live voice: hearing and prehearing.

TOO OFTEN, IN A WORLD OF RECORDED VOICE and cognitive sciences, theoretic writers address voice on every level but that of material, physical voice. Even many professional voice users have only an instinctive sense concerning the importance of live, human, vocal sound. These oversights traditionally have been seen as fairly inconsequential, and would be still were it not for recent research in neurophysiology. Through that research, however, we can now see that some things teachers of singing have said for hundreds of years--such as "stay in the moment," "mean something," and "reach yourself to reach your audience"--are neither quaint pedagogic idealizations nor unfounded philosophic excesses.

To that end, a solid (albeit minimal) overview is given here of interrelations between hearing, what can be called prehearing, and the somatosensory effect that well produced live human voice has on the human nervous system. This effect occurs simultaneously, and mutually, in the nervous systems of both speaker/singer and hearer.

Short descriptions of basic hearing principles and Acoustic Startle Reflex (ASR) are given to illustrate the immediacy of effect of sound (vocal and nonvocal) on the autonomic nervous system. These are followed by a look at auditory cues, voiced laughter, tonal relations between laughter and song, and the known psychological effects of human vocal sounds in general. Finally (the new part), basic descriptions of aspects of voice production are referenced, in terms of the psychophysiological properties of the vagus nerve. Here, sound apprehension and voice production are discussed in relation to Polyvagal Theory as presented by neurophysiologist Stephen W. Porges. (1)

Sound is an energy form transmitted as a longitudinal pressure wave from a source (instrument, voice, etc.) through a medium such as air. The human ear's purpose in the area of hearing is to convert sound waves into nerve impulses. (2) When we hear, sound waves are collected by soft cartilage and nerves in the skin of the outer ear and directed through the outer ear canal. They make the eardrum vibrate, causing three tiny bones in the middle ear to vibrate, and that vibration travels to hairlike structures in the cochlea of the inner ear. This triggers the generation of nerve signals that are sent to the brain. Once the nerve signals reach the cortex, cognitive processing begins.

But several sensory things happen before vibration reaches the eardrum, much less the cortex. Sound apprehension in its earliest stage directly influences unconscious, precognitive, prelanguage behaviors and reactions. Cognitive areas of the brain are found, in fact, to be little concerned with sounds per se; sound vibration alone does not provide the patterning necessary for a sound to be "noticed" by the language processing areas of the brain. (3)

The vibrations of vocables (voiced vocal sounds, vowels) are the early actors in vocal sound apprehension, activating the somatic nervous system (consonants are more readily recognized by the cognitive areas of the brain). (4) This makes singing very interesting in terms of sustained voice sound--given that singing is, in its most basic sense, simply speech in which vowels and their inflections are expanded to last longer. And in that expanded time frame, volume, timbre, and inflection can ebb and flow.

People have always known that voice in singing does something in excess of voice in speech. Whereas singing, like speech, is for the most part a series of cognitive processes, singing cannot be described as only an intricate series of cognitive processes involving language, music, and performance. This is not to say that the many neural associations having to do with language after initial sound apprehension are not crucial to the effectiveness of either speech or song; it is merely necessary to point out that voiced sound in itself is not chastely tied to cognition, but also to precognition. Jakobson and Santilli established in 1980 that spoken/sung words cannot be distinguished by the mind as informational until after the sound itself has been "sorted out" by the nervous system. (5)

Language and cognition are, of course, tightly interrelated, (6) and music acquisition, in itself, stresses pattern over sound qualities. In this context, music must be construed as language. (7) Even the dynamics of performance itself rely on cognition, and according to Palmer, the "parallels with other domains support the conclusion that music performance is not unique in its underlying cognitive mechanisms." (8) But as we have seen, cognition is the "late bird," even to the extent that, as both Gazzaniga and Bickle have pointed out, cognition alone is inadequate to the task of either fully representing, or accurately interpreting, human experience. (9) Cognitive neuroscientist Michael Gazzaniga states: "Reconstruction of events starts with perception and goes all the way up to human reasoning. The mind is the last to know things." (10)

In addition to temporal considerations, another important excess is part of the sounds produced by an individual, live, human voice. This excess is sensory-affective, and shares its immediacy of affect with its live human hearer.

When we hear, the nervous system registers something (sound vibration), sets something else in motion (nerve stimulus), which in turn prompts a physical action (muscle response), all within milliseconds. Changes in electrical activity in neck muscles happen within nine milliseconds after the onset of auditory stimulus. (11) Not only does the body know something before we do, but it also does things in response before we become consciously aware that anything has happened at all. (12) For example, Acoustic Startle Reflex is the trait, present in mammals, of exhibiting a physical reaction in response to sharp, sudden sound. As most of us have personally experienced, ASR is both rapid and involuntary. Davis found that the acoustic startle (nerve impulse) pathway is mediated by a simple, three synapse neural pathway to the pontis caudalis, a "traditionally non-auditory part of the brainstem." Because the pontis caudalis is in the reticular formation, one of the oldest (phylogenetically) areas of the brainstem and essential to the basic functions of life, any sudden, unexpected, sharp sound (even sudden, unexpected laughter) will immediately command the startle reflex.

Unexpected, sudden, loud laughter attracts immediate attention and causes startle. (13) Laughter is also, sometimes, an "in-turn" response to startle.

The startle reflex triggers our fight-or-flight mechanism. We gasp in a breath of air and tense our muscles when we are startled. If it becomes immediately clear that we are in no danger, our fight-or-flight system shuts down, the adrenalin level subsides, and we have to get rid of the tension and the air we took on board. The tension is what holds the air in, and once that is gone the air comes back out as involuntarily as it went in. In other words, we laugh. (14)

Most researchers regard startle reaction as pre-emotional rather than cognitively emotional, (15) with associated motor response. Meyer, Zysset, von Cramon, and Alter write: "Passive perception of human laughter activates brain regions which control motor (larynx) functions. This observation may speak to the issue of a dense intertwining of expressive and receptive mechanisms in the auditory domain." (16) The voice, in this case, is a type of "happy accident" that produces one of two primary categories of laugh sounds: 1) unvoiced laugh (grunts, snorts, etc.); 2) voiced laugh (tonal, vowel-like), with regular vibration of the vocal folds during production giving the sound a tonal, vowel-like quality. "Voiced laughs are the versions that are commonly thought of as typical laughter, and can have a song-like quality if F0 happens to fluctuate in a melodic way over the course of several bursts." (17) Further, unlike voiced speech, voiced laugh sounds are not word-associated. "[An] important difference from speech was that voiced laughter typically occurred as an unarticulated vowel" (18)--sound unassociated with "meaning."

Auditory cues stimulate many experiences in the listener that are both rapid-fire and very individual. Prevost has shown that in response to vocalization, auditory cues actually expand upon deixis (established associations), stimulating a broadening spectrum of synaptic activities.

Auditory cues are not solely noises but also stimulate synaptic structures involving other perceptual cues. The dominance of auditory cues occur because they can be shared in the absence of indexical cues. Though they are initially shaped from indexical cues, the process of competition endows auditory cues with independent capacities. Competition leads to a generalization in which perceptual structures are synaptically stimulated despite the absence of indexical cues. (19)

Thus, we have seen so far that the seemingly very ordinary excess that is the nonverbal materiality of voice expresses cognitive association, activates the nervous system in precognitive and corporeal factors such as "gut" reaction, fear, and startle response, and actually helps to expand mental capacity. In this, live voice acts as a form of touch, one not delayed by any form of external instrumentation.

In terms of personal and shared experience, a neuroanatomic model of the functioning larynx shows that "both the motor and sensory neural networks are coordinated during any phonatory activity." (20) This means that at exactly the same time we are speaking, we are also prehearing our own voices--experiencing the same sensory-somatic phenomena that the listener experiences: fight/flight, "gut" reaction, fear/startle, and so forth.

Cranial nerve involvement in precognitive hearing works as follows: Sensory signals from the outer ear are carried to nerve centers by the greater auricular (cervical) nerve branches of cranial nerves five, seven, and--most critical to this study--the auricular branch of cranial nerve ten (CN X), known as the vagus nerve.

The vagus is a mixed nerve, containing both efferent fibers (carrying nerve impulse to motor areas) and afferent fibers (carrying nerve impulse back to the sensory areas--hence the term "affect"). Among its many tasks, the vagus is directly involved in the production of sound in the larynx, and also in the simultaneous delivery of sound sensation to the inner ear of the speaker/singer.

The vagus is the longest of the cranial nerves, performing many functions related to stomach, lungs, heart, and voice. Two of its efferent branches innervate the muscles of the larynx. Afferent fibers simultaneously deliver nerve stimulus back to the surface of the inner ear, and to the acoustic nuclei of the brain stem--the initial sound sensory region of the brain.

Polyvagal Theory, which (among other things) links the evolution of the autonomic nervous system to experiential affect, emotional expression, facial gestures, vocal communication, and contingent social behavior, was first introduced by Stephen Porges in 1994. Porges's interests at the time of that writing primarily involved sound apprehension in terms of interpersonal relations, but his work can also be appreciated in terms of interior vocal reception (intrapersonal relations). His article, "Love: An Emergent Property of the Mammalian Autonomic Nervous System," implicates voice in ways that are new to those outside the area of vocal performance practice, yet ways that seem blessedly old and familiar to those who are within it. In fact, Porges's Polyvagal Theory may begin to scientifically explain some things that voice teachers have taught for centuries, but which have been considered purely "artistic" and not scientifically viable. The theory involves three phylogenetic emotion subsystems:

1) Emotion subsystem I is associated with the Ventral Vagal Complex (VVC). Neural pathways of the VVC regulate the muscles that control facial expression, sucking, swallowing, breathing, vocalization, and listening, and are "intimately involved in the communication of affect." (21)

2) Emotion subsystem II involves the Sympathetic Nervous System (SNS), and is associated with intense emotion. The SNS not only contributes to fight and flight behaviors associated with protection of self and significant others, but also promotes the general physiological activation associated with sexual arousal. "The SNS provides a mechanism for mobilization and has long been associated with intense emotion." (22)

3) The third and oldest emotion subsystem is dependent on the Dorsal Vagal Complex (DVC), which provides the primary neural control of subdiaphragmatic visceral organs (responsible for "gut" reaction).

What all this amounts to is that the sound vibration of an utterance hits the external ear of whatever human being might be listening, and goes immediately to precognitive areas of the brain. At the same time, sensory nerve fibers of the vagus leading from the larynx and outer ear of the speaker/singer carry the same impulse to his or her own precognitive brain areas. The systems affected in the medulla of both people are not simply relay stations, as was always assumed, but also--as Polyvagal Theory shows--direct connections with somatosensory affect, intense emotion, and "gut" reaction. Thus, individual, live vocal sound can be seen to provide a profound excess in mutual, simultaneous, shared experience, over and above the contributions and experience inherent in either instrumentation or technology.

When it comes to vocal performance, this explains much about the connection that can exist between singer and audience.


(1.) Dr. Porges is Professor of Psychiatry and Director of the Brain-Body Center at the University of Illinois at Chicago.

(2.) Joe Steinmetz and Glen Lee, "Auditory System," (accessed April 2010).

(3.) Yury Shtyrov, Elina Pihko, and Friedemann Pulvermüller, "Determinants of Dominance: Is Language Laterality Explained by Physical or Linguistic Features of Speech?" NeuroImage 27, no. 1 (August 2005): 37-47.

(4.) G. Cardillo and M. J. Owren, "Relative Roles of Consonants and Vowels in Perceiving Phonetic versus Talker Cues," Journal of the Acoustical Society of America 111, no. 5 (May 2002): 2433.

(5.) Roman Jakobson and Kathy Santilli, Brain and Language: Cerebral Hemispheres and Linguistic Structure in Mutual Light (Columbus, OH: Slavica, 1980).

(6.) See Roger W. Brown and Eric H. Lenneberg, "A Study in Language and Cognition," The Journal of Abnormal and Social Psychology 49, no. 3 (July 1954): 454-462.

(7.) M. L. Serafine, Music as Cognition: The Development of Thought in Sound (New York: Columbia University Press, 1988).

(8.) Caroline Palmer, "Music Performance," Annual Review of Psychology 48 (February 1997): 115-138.

(9.) Michael S. Gazzaniga, Nature's Mind: The Biological Roots of Thinking, Emotions, Sexuality, Language, and Intelligence (New York: HarperCollins Basic, 1992), 118; John Bickle, "Empirical Evidence for a Narrative Concept of Self," in Gary D. Fireman, Ted E. McVay Jr., and Owen J. Flanagan, eds., Narrative and Consciousness: Literature, Psychology, and the Brain (New York and London: Oxford University Press, 2003), 195-204.

(10.) Michael S. Gazzaniga, The Mind's Past (Berkeley: University of California Press, 2000), 1.

(11.) Michael Davis, "Neural Circuitry and Neurotransmitters that Mediate the Acoustic Startle Reflex," Acoustical Society of America, October 1999, (accessed January 2011).

(12.) Paul Ekman, Wallace V. Friesen, and Ronald C. Simons, "Is the Startle Reaction an Emotion?" in Paul Ekman and Erika L. Rosenberg, eds., What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS) (New York: Oxford University Press, 2005), 21-39.

(13.) Jo-Anne Bachorowski and Michael J. Owren, "Laughing Matters," American Psychological Association 18, no. 9 (September 2004).

(14.) Max Maven, "Exceptions to Gravity," Genii: The International Conjurors' Magazine 59, no. 3, January 1996, (accessed January 2011).

(15.) Ekman et al.

(16.) M. Meyer, S. Zysset, D.Y. von Cramon, and K. Alter, "Distinct fMRI Responses to Laughter, Speech, and Sounds Along the Human Peri-Sylvian Cortex," Cognitive Brain Research (July 2005): 291-306.

(17.) Jo-Anne Bachorowski and Michael J. Owren, "Laugh Sounds Differ in their Elicitation of Positive Emotional Responses in Listeners," Acoustical Society of America, October 1999, (accessed January 2011).

(18.) Bachorowski and Owren, "Laughing Matters."

(19.) Nathalie Prevost, "The Physics of Language: Towards a Phase Transition of Language Change" (PhD dissertation, Simon Fraser University, 2003).

(20.) Harry Hollien and W. J. Gould, "Neuroanatomical Model for Laryngeal Operation," Journal of Voice 4, no. 4 (December 1990).

(21.) Stephen W. Porges, "Love: An Emergent Property of the Mammalian Autonomic Nervous System," Psychoneuroendocrinology 23, no. 8 (November 1998): 837-861.

(22.) Ibid.

Carolyn Timmsen Amory, BA (Music), MA (Italian Letters and Literature), PhD (Comparative Literature), rediscovered an interest in voice quite late, and studied with Duane Skrablalak, Carmen Savoca, and Peyton Hibbitt of Tri-Cities Opera in Binghamton, NY. She also attended the 1984 Bel Canto Foundation Seminar in Busseto, Italy, taught by Carlo Bergonzi and Renata Tebaldi. Her specific interest is teaching. Dr. Amory's students have been accepted to Boston Conservatory, Carnegie Mellon, Crane School of Music, Peabody Conservatory, and AMDA. Many of her students sing and/or speak professionally.

Dr. Amory taught for the Binghamton Community Music Center from 1984 to 2001, adjuncted in voice at Broome Community College, and maintains a private studio. She served as Translations Coordinator for Tri-Cities Opera from 2001 until 2010, producing original translations of Faust, Tosca, Le nozze di Figaro, and L'elisir d'amore. Presentations include "Silence and Sound in Dante's Inferno" (South Atlantic MLA); "Monody: A Chance Encounter of the Material Voice" (Rutgers); "Monody: Ancient Literary Form and Ancestor of the Contemporary Performer's Authentic Voice" (Indiana); and "Hearing, Pre-Hearing, Voice" (Physiology and Acoustics of Singing, UTSA).

Dr. Amory's dissertation, "Human Vocality: Monody, Magic, and Mind," traces the history of sensory affect in voice from Homerian epic, opera, and acting through recent neurological study. Current research interests include voice and presence, the effects of material voice on reading/writing production and apprehension, and the role of material voice in intellectual capacity. Since receiving her PhD degree in May of 2009, Dr. Amory has taught Remedial and Freshman English at Broome Community College in Binghamton, giving her an opportunity to observe the effects of voice use on general literacy and to investigate ways in which vocal presence (or lack thereof) may influence literacy education.
COPYRIGHT 2011 National Association of Teachers of Singing
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2011 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Amory, Carolyn Timmsen
Publication:Journal of Singing
Date:May 1, 2011