
Effect of auditory feedback on singing.


Singers love to sing in the shower. It provides a warm and moist environment, as well as a live acoustic. Many singers have commented on how wonderful a performance space is when it has a live acoustic, or how it enables singers to hear themselves, and frequently express how easy it is to sing in such an environment. Given this observation, one would expect much research investigating the effect of auditory feedback on singing; that does not prove to be the case. There are, however, plenty of studies of speech and language development; these show that interruptions or alterations in auditory feedback will inhibit vocal learning, and change the way in which a person speaks.


When children begin the process of learning to speak, they initially distinguish and produce speech sounds regardless of the sound environment around them; however, by six months they quickly begin to recognize the sounds in their home environment, and those of their native language. By nine months they are aware of language stresses and rhythms, and by twelve months will no longer recognize speech sounds outside their native language. Additionally, studies conducted on avian vocal learning, often analogous to human vocal learning, show that birds will learn the sounds of their environment, meaning their birdsong will mimic sounds they hear, and not necessarily be the song of their parents. (1) Investigations involving deaf children show that even though they are given extensive therapy, they will not develop normal speech sounds. Even preadolescents and adults who lose hearing later develop speech abnormalities, although adults are more successful at continuing to speak normally. (2)

Houde and Jordan suggest that although auditory feedback plays a lesser role in adults than in children, changing the feedback will result in a change of articulation. The authors had eight male subjects perform a vocal task, for example speaking /ɛ/, as the vowel formant frequencies were altered to provide feedback consistent with a different vowel, like /æ/. They observed that the subjects would alter their articulation of the intended vowel, in this case /ɛ/, to the point of articulating a different vowel, like /i/. All of this suggests that humans learn speech via auditory feedback, are influenced by their sound environment, and need auditory feedback to learn and maintain speech. (3)

Since singing can be considered an exaggerated form of speech, can the sung tone be affected by changes to auditory feedback? The aim of the present study was to begin to quantify or identify what changes would occur to the sung tone when auditory feedback is changed, specifically:

* Will there be a phonation change as measured by an electroglottograph and a closed quotient?

* Will pitch accuracy change?

* Will intensity change?

* Will the singer's resonance as measured by the vowel formants and the singer's formant change?

This article discusses the findings of a study that was conducted at the University of Nebraska-Lincoln in relation to the above aims and its possible impact on learning to sing.


Eight voice students from the University of Nebraska-Lincoln were recruited to participate in this project as subjects. The majority of the subjects were in the fourth year of college or in a graduate degree program. One subject was in the second year of college, and one subject was in the first year of a doctoral degree program. All subjects were enrolled in applied voice at the University of Nebraska-Lincoln. All four major voice types were represented: soprano, mezzo soprano, tenor, and baritone. The average age of the subjects was twenty-four years, with a maximum and minimum age of thirty and twenty, respectively. All subjects reported that they were in good vocal health.

All data was collected in a secluded room in order to provide the most acoustically quiet environment available. The subjects were fitted with electroglottograph electrodes connected to the Glottal Enterprises Electroglottograph (EGG), model EG-2, and a head mounted microphone, AKG model C420III, connected to a KayPentax Computerized Speech Laboratory (CSL), Model 4500. Subjects sang a vocal exercise, a five note ascending scale with the fifth note held followed by a descending major triad to tonic, using an /a/ vowel at a tempo of approximately sixty beats per minute. The starting frequency of the exercise was played on the computer and produced by a sound analysis program, PRAAT version 4.4.33. The starting frequencies were 139 Hz (C♯3) and 278 Hz (C♯4) for males and females, respectively, yielding a target frequency of 208 Hz (G♯3) and 415 Hz (G♯4) for males and females, respectively. The subjects practiced the exercise for a total of approximately ten trials.
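
As a quick check on the pitch names and frequencies given for the exercise, the following minimal sketch converts note names to frequencies. It assumes twelve-tone equal temperament with A4 = 440 Hz, which the article does not state; `note_to_hz` and `NOTE_OFFSETS` are illustrative names, not part of PRAAT or any software used in the study.

```python
# Semitone offsets of the natural notes within an octave, counted from C.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_hz(name: str, a4_hz: float = 440.0) -> float:
    """Return the equal-tempered frequency of a note such as 'C#3' or 'G#4'.

    Assumes A4 = 440 Hz unless another reference is passed in.
    """
    letter, rest = name[0].upper(), name[1:]
    accidental = 0
    if rest.startswith("#"):
        accidental, rest = 1, rest[1:]
    elif rest.startswith("b"):
        accidental, rest = -1, rest[1:]
    octave = int(rest)
    # MIDI-style note number, where C4 = 60 and A4 = 69.
    midi = 12 * (octave + 1) + NOTE_OFFSETS[letter] + accidental
    return a4_hz * 2 ** ((midi - 69) / 12)


if __name__ == "__main__":
    for note in ("C#3", "G#3", "C#4", "G#4"):
        print(note, round(note_to_hz(note), 1), "Hz")
```

Under these assumptions C♯3 and G♯3 come out near 139 Hz and 208 Hz, and G♯4 near 415 Hz. C♯4 comes out near 277 Hz, slightly below the reported 278 Hz, which is exactly twice the male starting frequency.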

Six total protocols were performed seven times each. The protocols were as follows: Room (R), Headphones (HP), Lows (L), Mids (M), Singer's Formant (SF), and Highs (H) (Table 1). Each subject performed the Room protocol first, and the remaining protocols were performed in a predetermined random order. Sennheiser headphones, model 457, were worn by the subjects for all the protocols except the Room protocol. To achieve the altered feedback, the microphone cable was split with one end plugged into the CSL, and the other to equipment that would alter the feedback (Figure 1A). Sound levels were maintained by adjusting the pre-amplifier, equalizer, and mixers' main outputs and gains, and monitored based on the equipment's indicators. An audiologist determined the decibel level of the feedback coming out of the headphones to be between 95 dB and 106 dB. The equalizer was used to amplify or suppress the various frequencies with the intensity set to ±12 dB. All protocols were approved by the Institutional Review Board at the University of Nebraska-Lincoln. Prior to each performance of the exercise, the subject was given the starting pitch.
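
To give a sense of scale for the 12 dB equalizer setting, the sketch below converts a decibel gain to linear ratios using the standard definitions (20 log10 for amplitude, 10 log10 for power). The function names are illustrative, and the article does not say how the equalizer applies its gain internally.

```python
def db_to_amplitude_ratio(gain_db: float) -> float:
    """Linear amplitude (sound pressure) ratio for a gain given in decibels."""
    return 10 ** (gain_db / 20)


def db_to_power_ratio(gain_db: float) -> float:
    """Linear power (intensity) ratio for a gain given in decibels."""
    return 10 ** (gain_db / 10)


if __name__ == "__main__":
    for gain in (12.0, -12.0):
        print(
            f"{gain:+.0f} dB: amplitude x{db_to_amplitude_ratio(gain):.2f}, "
            f"power x{db_to_power_ratio(gain):.2f}"
        )
```

A boost of 12 dB roughly quadruples the amplitude of the affected band, and a cut of 12 dB reduces it to about a quarter.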

From the recorded files, an estimate of vocal fold closure, the closed quotient (CQ) as measured by the EGG, was calculated by the KayPentax Real-time EGG Analysis Software, version 3.1.6. It should be noted that the signal was captured using direct current (DC) coupling to limit distortion of the EGG signal for each glottal cycle. To calculate the CQ, a decision must be made as to what parts of the signal to compare: the opening phase (AB), the closing phase (BC), a combination of the two (AC), or the complete cycle (AD) (Figure 1B). Furthermore, one must decide where in the signal to place the points of reference: at the base of each cycle, at the peak of each cycle, or at some consistent portion of the cycle. Because the signal varied in amplitude as well as distance, due to laryngeal movement, it was decided to calculate the CQ using the method AC/AD at 25% of peak, where AC is the closed phase and AD is the cycle. (4) The average CQ was calculated for each subject at each protocol. Acoustic analysis of frequency, intensity, and formant frequencies was done using the Wavesurfer analysis program, version 1.8.5. Mean formant frequencies were determined by Linear Predictive Coding (LPC), (5) and the Singer's Formant was calculated as the average of the third and fourth formants. From the analysis the average fundamental frequency (pitch) and average intensity were calculated for each subject and protocol. Average percentage change was calculated for all data in comparison with the Room protocol and the Headphones protocol, as well as average magnitude of change (the average of the absolute value of the percentage change).
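
The AC/AD calculation at 25% of peak can be illustrated with a minimal sketch. It is a simple sample-counting approximation, not the algorithm of the KayPentax software: it assumes that one glottal cycle has already been isolated, that larger sample values mean greater vocal fold contact, and that "25% of peak" means 25% of the cycle's peak-to-peak amplitude. The function name and the sample values are hypothetical.

```python
def closed_quotient(cycle, level=0.25):
    """Estimate the closed quotient (AC/AD) of one EGG cycle.

    `cycle` holds the samples of a single glottal cycle (AD). The closed
    phase (AC) is taken as the samples lying above a threshold placed at
    `level` of the cycle's peak-to-peak amplitude.
    """
    lo, hi = min(cycle), max(cycle)
    if hi == lo:
        raise ValueError("flat signal: no glottal cycle to measure")
    threshold = lo + level * (hi - lo)
    closed_samples = sum(1 for sample in cycle if sample > threshold)
    return closed_samples / len(cycle)


if __name__ == "__main__":
    # Hypothetical cycle, for illustration only (not study data).
    cycle = [0.0, 0.0, 0.0, 0.0, 0.1, 0.6, 1.0, 0.9, 0.5, 0.2]
    louder = [3 * s + 5 for s in cycle]  # same shape, larger amplitude and offset
    print(closed_quotient(cycle), closed_quotient(louder))
```

Because the threshold is relative to each cycle's own amplitude, scaling or offsetting the signal leaves the estimate unchanged, which is the practical reason for choosing a consistent portion of the cycle when the signal amplitude varies.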



Changes were observed in all subjects and variables throughout all protocols in comparison to the Room protocol or to the Headphones protocol. Intensity varied by subject with a trend toward decreasing intensity as the feedback frequencies centered near the Singer's Formant (FS) or during the Singer's Formant protocol, when compared to both Room and Headphones protocols. This may have resulted from the higher frequencies being more present in the feedback, so the subjects may have decreased intensity because they perceived themselves as being too loud. The Lows protocol appeared to cause an average increase in intensity relative to the Headphones protocol. The greatest average decrease in intensity was observed in the higher frequency protocols (Mids, Highs, and Singer's Formant) with Singer's Formant presenting the greatest reduction (Figure 2A). Finally, the Lows protocol demonstrated the greatest magnitude of change in comparison to the Headphones protocol (Figure 2B).
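
The two summary measures used in these comparisons, average percentage change and average magnitude of change, differ in how they treat subjects who move in opposite directions. The sketch below uses made-up numbers, not study data, and hypothetical function names.

```python
from statistics import mean


def percent_change(value: float, baseline: float) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return 100.0 * (value - baseline) / baseline


def summarize(protocol_values, baseline_values):
    """Return (average percentage change, average magnitude of change).

    Both lists hold one value per subject, in the same subject order.
    """
    changes = [percent_change(v, b) for v, b in zip(protocol_values, baseline_values)]
    return mean(changes), mean(abs(c) for c in changes)


if __name__ == "__main__":
    # Hypothetical per-subject values, for illustration only.
    baseline = [80.0, 75.0, 90.0, 85.0]  # e.g., the Room protocol
    altered = [76.0, 78.0, 81.0, 85.0]   # e.g., one altered-feedback protocol
    average_change, average_magnitude = summarize(altered, baseline)
    print(average_change, average_magnitude)
```

In this example the individual changes are -5%, +4%, -10%, and 0%, so the average change is -2.75% while the average magnitude is 4.75%. Increases and decreases partly cancel in the first measure but not in the second.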



The individual subject data for the closed quotient (CQ) variable varied among subjects, with two subjects, Subject 2 and Subject 6, showing great variability. This variability can be explained by placement issues with the EGG electrodes, such as adipose tissue and laryngeal movement, both of which affect the EGG signal and resulting data. Therefore, those subjects' data were removed from the analysis. The remaining data suggest that half of the group demonstrated increased CQ values, while the other half demonstrated decreased CQ values. Adjusted for error, the overall greatest average increase in CQ values was observed in the Lows protocol when compared to both the Room and Headphones protocols. CQ values then consistently decreased through the remaining protocols, with the Singer's Formant protocol demonstrating the greatest reduction in value and greatest overall magnitude of change. This would suggest that the Singer's Formant protocol had the greatest effect on phonation. The observation of the decreased CQ values corresponds with the decreased intensity discussed previously, and therefore the decreased CQ values most likely result from the decreased intensity, since intensity is often achieved by an increase in closed quotient or adduction.

Pitch accuracy, defined as sung pitch compared to the target pitch, varied by subject, with Subject 10 demonstrating great variability. Consequently, Subject 10 will be discussed separately from the other subjects. The remaining subjects demonstrated on average the greatest accuracy to the target pitch during the Singer's Formant protocol, as well as the greatest magnitude of change. This may suggest the importance of this frequency area in securing pitch. Pitch was most stable during the Headphones protocol. The subjects tended to sing sharp during the Lows protocol. This may be due to the high sound volume at the level of the fundamental pitch, which perhaps caused interference.

Subject 10 demonstrated difficulty in achieving the target pitch. The subject's pitch was severely low during the Room protocol, and accuracy improved as higher frequencies were amplified (protocols Mids, Highs, and Singer's Formant). This, again, may suggest that feedback in the higher frequencies plays an important role in pitch accuracy (Figure 3).
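
One common way to express how far a sung pitch lies from its target is the deviation in cents (hundredths of an equal-tempered semitone). The article does not state which unit was used for pitch accuracy, so the following is only an illustrative sketch with a hypothetical function name and a made-up sung frequency.

```python
import math


def cents_from_target(sung_hz: float, target_hz: float) -> float:
    """Deviation of a sung pitch from the target, in cents.

    Negative values are flat, positive values are sharp; 100 cents equal
    one equal-tempered semitone.
    """
    return 1200.0 * math.log2(sung_hz / target_hz)


if __name__ == "__main__":
    # Hypothetical example: a singer produces 204 Hz against the 208 Hz target.
    print(round(cents_from_target(204.0, 208.0), 1), "cents")
```

A ratio-based measure of this kind lets deviations at the 208 Hz and 415 Hz targets be compared on the same scale, since equal frequency ratios give equal cent values.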

Formant data show some subjects demonstrating variability throughout the protocols, while others show little variability. Separating the subjects by sex provided further insight. The data for the male subjects suggest that in comparison to the Headphones protocol, the vowel formants increased in frequency with the largest average increase during the Mids protocol. In both comparisons, the Singer's Formant appears unaffected (Figure 4A).


In comparison to the Headphones protocol, formant frequencies for female subjects changed uniformly, with Lows having the greatest average increase and Mids having the greatest decrease. As the feedback frequencies increased or centered on the area of the singer's formant, the change diminished. The uniformity most likely resulted from the decreasing reliability of the formant analysis method as the fundamental frequency increased. However, it does suggest that the tone quality did change during the protocols, and therefore was affected by the feedback.


The basic question asked by the researcher was: Would singers change the way they sang if the auditory feedback were altered? The answer is yes. Pitch, formant frequencies, vocal fold contact, and intensity all exhibited changes as feedback changed. As suggested previously, feedback limited to higher frequencies most likely caused subjects to decrease intensity, and therefore decreased vocal fold contact as measured by the closed quotient. Pitch was most problematic during the Lows and Singer's Formant protocols. Even though the pitch was most accurate on average during the Singer's Formant protocol, the subjects were all either sharp or flat, with no individual subjects demonstrating accuracy. This may suggest that these frequency areas, frequencies below 400 Hz (800 Hz for females) for the Lows protocol and between 2500 Hz and 3500 Hz for the Singer's Formant protocol, may play a role in securing and establishing pitch. It may be that a combination of feedback areas may be needed to secure pitch. Brancucci et al. suggest that the vowel formants or vowel quality seem important in determining pitch. (6) It may be that the suppression of portions of the vowel formants caused the pitch inaccuracies.

The formant data presented some interesting findings. Females being most affected by the Lows protocol (near F1), and males being most affected by the Mids protocol (near F2) may be a result of training methods or strategies used by females and males, respectively. Sundberg reported that sopranos tend to focus on tuning F1 near to the fundamental, and Donald Miller has suggested that males tune F2 to as high a harmonic as possible. If these strategies were being used, consciously or subconsciously, this may explain why those protocols affected those groups. It is also possible that both male and female subjects elevated the formant frequencies so as to better distinguish them in the respective protocols. Furthermore, the vowel formants were altered in both males and females suggesting that vowel definition and quality did change. These alterations may have improved vowel clarity or sound quality; however, these were not a focus of the current study. Additional study is required to provide further insight into whether manipulation of the formant frequencies in males versus females produces a specific change in the manner of singing, or affects vowel clarity or tone quality.

With the exception of the formant data, the protocols that amplified frequencies near F1 (Lows protocol) and FS (2500-3500 Hz) or higher (Singer's Formant and Highs protocols) appeared to affect the subjects the most, suggesting that feedback in these areas is important to singing. Considering the increased intensity and CQ values during the Lows protocol and the decreased intensity and CQ values during the Singer's Formant and Highs protocols, it is possible that the subjects tended to use a heavier mechanism or more vocal effort, that is, more chest register, during the Lows protocol, and a lighter mechanism or less vocal effort, that is, more head register, during the higher protocols. A possible reason is that during the Lows protocol the elimination of the higher frequency harmonics, where the "quality of the voice" resides, caused the subjects to sing with a heavier mechanism so as to provide more harmonic content in this area, whereas in the higher protocols the feedback of the higher frequency harmonics was more than sufficient, which caused the subjects to relax their vocal production. Therefore, increased feedback or a greater sense of awareness of higher frequencies may bring about a more relaxed vocal function, particularly if the student has a tendency to sing with a chest register dominated or pressed phonation sound. This is similar to the observation made by Alfred Tomatis, a French ENT, who recounted several cases in which opera singers experienced vocal troubles and, after they underwent a listening regimen in which high frequencies were amplified, their voices miraculously returned. Tomatis attributed this to a reconditioning of the singers' ear muscles. (7)

Whether the muscles of the ear can be reconditioned or whether Tomatis's theories are correct is a debate beyond the scope of this article. However, there are some pedagogic and health implications. The pedagogic implication is that the type of auditory feedback students receive may help or hinder their singing. This feedback may be limited to the acoustics of the studio environment, a greater awareness of certain frequency areas, or direct feedback of certain frequencies. For example, a studio environment that suppresses higher frequency harmonics, a "dead" room, would provide relatively more feedback of lower frequency harmonics, and result in a more tense tone and more difficult vocal production. Furthermore, if a student sings with a pressed tone, then having that student listen to music on his/her iPod, such as a Mozart string quartet, using the High Booster setting of the graphic equalizer function, for ten minutes a day would help develop a greater awareness of higher frequencies and result in a more relaxed tone and easier vocal production. Many consonants have frequencies in the upper harmonics, and so diction would also improve. Additionally, providing students with direct feedback in lessons using a pair of headphones, a microphone, and a graphic equalizer would help facilitate vocal changes that, once sufficiently experienced, can be made automatic. Caution should be used when providing direct feedback. An audiologist should be consulted before attempting this type of feedback, so as to protect the student against hearing loss.

The health implication is obvious. Any loss of hearing will have a major impact on how singers use their instrument. When watching a movie or television program, one is often overcome by a full, rich, and loud bass sound. Perhaps due to the physical sensation one receives when listening to music or a movie with an enhanced bass, amplification of low frequencies is dominating our society. Student singers are listening to music more than ever on iPods and MP3 players, often with the bass booster turned in the "on" position. This is alarming for two reasons: not only is this practice harmful to the student's hearing because repeated exposure to loud low sounds can cause hearing loss; it is also harmful to the student's voice, because it can affect intonation, tone production, and tone quality. Therefore, student singers--all singers--must be careful to protect their hearing, and by doing so will protect their voices.


Altering the auditory feedback singers receive while singing does change the manner in which they sing. There are measurable changes to intensity, pitch accuracy, formant frequencies, and phonation as measured by electroglottograph signals. The greatest impact occurred when feedback was limited to the frequencies near F1 and Singer's Formant and above. There were differences between males and females in the formant frequencies, suggesting a change to vowel clarity and tone quality. Further research will need to focus on altering the formant frequencies during feedback to determine if there are any direct correlations between the altered feedback and the changes the singer makes, and will need a juried panel to assess any quality changes as positive or negative. Increased feedback or awareness of frequency areas may be used to help improve vocal instruction; however, care must be taken to protect the student's hearing.


The author would like to express his heartfelt gratitude to Dr. Jerry Doan, Professor of Voice at Arizona State University, for his help in developing this project, and Dr. Tom Carrell, an audiologist at the Barkley Center at the University of Nebraska-Lincoln, for his assistance in study design and implementation.


(1.) Allison J. Doupe and Patricia K. Kuhl, "Birdsong and Human Speech: Common Themes and Mechanisms," Annual Review of Neuroscience 22 (1999): 567-631; Michael S. Brainard and Allison J. Doupe, "Auditory Feedback in Learning and Maintenance of Vocal Behavior," Nature Reviews: Neuroscience 1 (October 2000): 31-40; Jon T. Sakata and Michael S. Brainard, "Real-Time Contributions of Auditory Feedback to Avian Vocal Motor Control," Journal of Neuroscience 26, no. 38 (September 20, 2006): 9619-9628.

(2.) Doupe and Kuhl; D. Ward and E. Burns, "Singing without Auditory Feedback," Journal of Research in Singing 1 (1978): 24-44.

(3.) John F. Houde and Michael I. Jordan, "Sensorimotor Adaptation of Speech I: Compensation and Adaptation," Journal of Speech, Language, and Hearing Research 45 (April 2002): 295-310.

(4.) R. F. Orlikoff, "Assessment of the Dynamics of Vocal Fold Contact from the Electroglottogram: Data from Normal Male Subjects," Journal of Speech and Hearing Research 34, no. 5 (October 1991): 1066-1072.

(5.) Parameters for LPC are Number of Formants: 4, Pre-emphasis factor: 0.7, LPC order: 14, LPC type: 0, Down-sampling frequency: 10,000 Hz, and Nominal F1 frequency: -10.0 Hz. Formant values for the female data are likely to be somewhat less accurate than the male data due to the increased spacing of the harmonics.

(6.) Alfredo Brancucci et al., "Vowel Identity between Note Labels Confuses Pitch Identification in Non-Absolute Pitch Processors," PLoS ONE 4, no. 7 (July 2009): e6327.

(7.) Alfred A. Tomatis, The Ear and Voice, trans. Roberta Prada and Pierre Sollier (Toronto: Scarecrow Press, 2005), 11-20.


Brancucci, Alfredo, R. Dipinto, I. Mosesso, and L. Tommasi. "Vowel Identity between Note Labels Confuses Pitch Identification in Non-Absolute Pitch Processors." PLoS ONE 4, no. 7 (July 2009): e6327.

Brainard, Michael S., and Allison J. Doupe. "Auditory Feedback in Learning and Maintenance of Vocal Behavior." Nature Reviews: Neuroscience 1 (October 2000): 31-40.

Doupe, Allison J., and Patricia K. Kuhl. "Birdsong and Human Speech: Common Themes and Mechanisms." Annual Review of Neuroscience 22 (1999): 567-631.

Houde, John F., and Michael I. Jordan. "Sensorimotor Adaptation of Speech I: Compensation and Adaptation." Journal of Speech, Language, and Hearing Research 45 (April 2002): 295-310.

Joudry, Rafaele. "Wired For Sound Therapy." Open Ear 3 (1997): 6-8.

Konishi, Masakazu. "The Role of Auditory Feedback in Birdsong." Annals of the New York Academy of Sciences 1016 (2004): 463-475.

Lander, John. "What Role Does the Ear Play in Singing?" Australian Voice 2 (1996): 57-65.

Laukkanen, Anne-Maria, et al. "Effects of HearFones on Speaking and Singing Voice Quality." Journal of Voice 18, no. 4 (December 2004): 475-487.

Madaule, Paul. When Listening Comes Alive: A Guide to Effective Learning and Communication. Norval, ON: Moulin Publishing, 1994.

______. "Listening and Singing." South African Music Teacher, no. 139 (January 2002): 10-13.

______. "Music: An Invitation to Listening, Language, and Learning." Early Childhood Connections (Spring 1997): 30-34.

Orlikoff, R. F. "Assessment of the Dynamics of Vocal Fold Contact from the Electroglottogram: Data from Normal Male Subjects." Journal of Speech and Hearing Research 34, no. 5 (October 1991): 1066-1072.

Sakata, Jon T., and Michael S. Brainard. "Real-Time Contributions of Auditory Feedback to Avian Vocal Motor Control." Journal of Neuroscience 26, no. 38 (September 20, 2006): 9619-9628.

Tomatis, Alfred A. The Conscious Ear: My Life of Transformation Through Listening, ed. and trans. by Billie M. Thompson and Stephen Lushington. New York: Station Hill Press, 1991.

______. The Ear and Language. ed. and trans. by Billie M. Thompson. Phoenix, AZ: Sound Listening and Learning Center, 1996.

______. The Ear and Voice, trans. by Roberta Prada and Pierre Sollier. Toronto: Scarecrow Press, 2005.

Ward, D., and E. Burns. "Singing without Auditory Feedback." Journal of Research in Singing 1 (1978): 24-44.

Ware, Clifton. "The Singer as Processor." Opera Journal 35 (March, 2002): 27-37.

Kevin Hanrahan has performed nationally and internationally in opera, oratorio, and recital performances. In January of 2007, Dr. Hanrahan along with pianist Roberta Swedien performed Schubert's Die schöne Müllerin in Pune and Mumbai, India, and were the first to perform Schubert's masterpiece there in over 50 years, with the last performance having been given by Peter Pears and Benjamin Britten. He has worked with numerous influential conductors, including Robert Page, Charles Bruffy, David Stocker, and Gunther Schuller, as well as several esteemed directors, such as Elizabeth Bachman, Rhoda Levine, Gregory Lehane, and Graham Whitehead. Dr. Hanrahan has held teaching positions at Arizona State University, Scottsdale Community College, and Grand Canyon University. As a researcher and teacher, Dr. Hanrahan has presented at national and international conferences. He has given master classes and workshop sessions in the United States and abroad, as well as via Internet2 technologies. Dr. Hanrahan currently holds the position of Associate Professor of Voice and Voice Pedagogy at the University of Nebraska-Lincoln, and is the founder of the UNL School of Music Voice Lab.
Table 1. Protocol descriptions. Values in parentheses apply to female subjects.
Protocol            Abbreviation    Frequencies Enhanced
Room                R               Feedback from the room only
Headphones          HP              No enhancement
Lows                L               400 (800) Hz and below
Mids                M               1600 to 5000 Hz
Singer's Formant    SF              2500 to 4000 Hz
Highs               H               5000 Hz and above (max. 20,000 Hz)
COPYRIGHT 2012 National Association of Teachers of Singing
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2012 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Hanrahan, Kevin
Publication:Journal of Singing
Date:Nov 1, 2012