
Comparison of plosive sounds in monolingual and bilingual children, using the voice onset time acoustic parameter: cases report.


With respect to voicing, Brazilian Portuguese (BP) has the non-voiced sounds [p, t, k], which are produced without any mode of vocal fold vibration. These sounds contrast with the voiced sounds [b, d, g], which are produced with some vocal fold vibration (voice source) and with coordinated actions of the cartilages and muscles of the larynx [1-5].

Voicing is a property that children acquire gradually during development. They are known to use different strategies to control and synchronize the adjustments required to produce the various articulation patterns [6-10]. In other languages, such as English, Japanese and Korean, plosive sounds are distinguished by the presence or absence of aspiration in addition to voicing. Each language has its own voicing characteristics, as well as its own pattern of aspiration in certain plosive sounds [11,12].

The voice onset time (VOT) is defined as the time interval between the release of the oral occlusion of a plosive sound, identified by the burst, and the beginning of vocal fold vibration, identified on a broadband spectrogram by vertical striations [13-15].

It is a simple duration measure that helps in examining the relationship between speech sound production and perception [9,15-18]. The VOT also makes it possible to evaluate the synchronization between articulatory gestures [6,18-22]. In this way, the VOT makes it possible to establish whether there is a continuum in the development of laryngeal adjustments according to age and whether the acoustic specificities are related to the gradient of the articulatory gesture [7,8,14,15].

VOT measures of [p, t, k] were compared in the speech production of bilingual adults (Portuguese and English). The results showed that the VOT values were lower in English than in BP [23].

Another study, also conducted with adults [21], analyzed the differences between the VOT duration measures of the plosive consonants [p, t, k] in the speech production of 14 bilingual individuals (Spanish-English), aged 18 to 24 years. The group consisted of 11 women and 3 men; 11 already knew both languages before the age of 6 years, and 3 learned the second language later. Participants performed spontaneous conversation tasks about important dates in Mexico City and talked about the same topic while they were also assembling puzzles. The researchers found lower VOT values among subjects who learned both languages before the age of 6 years, and the difference was more significant for English.

Another study analyzed the values of VOT measurements in non-voiced plosive sounds [p, t, k] of BP and English in five bilingual children from 8 to 9 years old. The authors reported the following VOT values in the first study: for BP, [p] = 48ms, [t] = 60ms, and [k] = 70ms and for English, [p] = 40ms, [t] = 56ms, and [k] = 65ms [24].

Also with respect to children's speech, another study included 40 children aged between 8 and 10 years, of whom 20 were monolingual (BP) and 20 bilingual (BP/English). Differences were reported between the averages obtained by monolingual and bilingual children in producing aspirated plosive sounds in English. According to the author, this may suggest a strong influence of the participants' first language (BP) on the production of the aspirated plosive sounds in English [25].

Another study on infant speech during the phase of language acquisition examined plosive sounds in bilingual participants who could speak BP and a German dialect. The material consisted of 12 oral interviews, and acoustic analyses were performed with VOT measurement in plosive segments in the two groups. The authors found higher VOT values for the bilingual group than for the monolingual group and suggested that this difference may lead infants to confuse BP non-voiced plosives with German voiced plosives [26].

Given that the VOT makes it possible to investigate whether or not children and adults produce voiced and non-voiced sounds according to the standards of their language [3,8,15,17,27], the interest generated by the topic is based on the possibility of using a robust cue to assist in the evaluations and therapies of speech-language pathology [5-7,21,22]. The data from some studies conducted with BP monolingual children [6,7], with bilingual children [24,25], and with bilingual adults [23,26,28] were considered as a parameter for standardization.

Based on the above, this research proposes the following hypothesis: there is a difference in the voicing trait between children who are exposed to a single language and bilingual children, and VOT may therefore present specific characteristics for each language [11,29]. In addition, the amount and form of each individual's language exposure will affect sound production [23,30].

There has been a noticeable increase in the number of bilingual schools in society and therefore in the number of bilingual children. Considering the lack of research on voiced and non-voiced sounds that compares monolingual and bilingual children, this research will be of great importance to speech-language pathology.

To address this question, the objective of this study was to compare, by means of VOT, plosive sounds produced by Brazilian Portuguese (BP) monolingual children and by bilingual (BP/English) children.


This study was approved by the Research Ethics Committee of the Pontifical Catholic University of Sao Paulo, under the process number 019/2007. Parents and/or guardians signed the Free and Clarified Consent Term (FCCT) and children signed the Term of Assent (TA).

The sample consisted of six children, of whom three were monolingual native speakers of BP and three were BP-English bilingual. The ages of the three monolingual children (CM), all girls, were 7y3m (CM1), 7y6m (CM2), and 7y9m (CM3). The ages of the bilingual children (CB) were 7y5m for CB1 and CB2 and 7y6m for CB3. The three CM children studied in Brazilian schools, and their parents spoke BP as their first and only language. The three bilingual children learned both languages simultaneously because they lived in the United States until they were 6 years old, and each of them had one Brazilian and one American parent.

For the speech sample collection, the children were placed in a quiet room of the school, where a decibel meter was used to confirm that the noise level was lower than 30 dB. For the recordings, the girls sat in an armless chair with their feet on the floor, and a Shure SM7A, a dynamic unidirectional low-impedance microphone, was positioned 10 cm away from the mouth [6]. Each of the girls recorded three repetitions of the following sentences: "Diga 'papa' baixinho" ("Say 'papa' quietly"); "Diga 'baba' baixinho"; "Diga 'tata' baixinho"; "Diga 'dada' baixinho"; "Diga 'caca' baixinho"; and "Diga 'gaga' baixinho" [6]. Each child repeated the sentences on her own. The researcher, seated on the right side of the child, confirmed perceptually that the sentence had been reproduced, while the recording technician, on the left side, was responsible for observing the sound wave on the computer screen. Both the researcher and the technician were using headphones, so if they identified something that sounded unusual, they would repeat the recording of that child on another day.

Before making the measurements, the researcher also listened to a couple of sentences with "pata" and "bata" and conducted an auditory assessment to verify whether the production was in accordance with the stimulus [3]. The audio files were converted into WAVE files, and the acoustic signal was inspected manually, considering the waveform and the broadband spectrogram, with PRAAT v5.2 software [6].

The VOT acoustic parameter, expressed in milliseconds (ms), is regarded as a decisive time interval in the accurate perception of non-voiced [p, t, k] and voiced [b, d, g] plosive sounds. Because VOT is a measure of duration, it may be affected by external factors and therefore usually requires some standardization procedure [4,17]. To avoid such interference, we decided to "control" the speech rate of the carrier sentences, among other factors [3,16].

We calculated the positive VOT (ms) measurements of the non-voiced plosive sounds [p, t, k], taken from the burst to the onset of the vowel, and the negative VOT measurements of the voiced plosive sounds [b, d, g], taken from the onset of pre-voicing to the burst [3,13,16].
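The sign convention described above can be sketched in a few lines of Python. This is only an illustration: the class name StopLandmarks, the function vot_ms, and the time points in the usage note are hypothetical and do not come from the study, where the landmarks were marked manually in PRAAT.

```python
from dataclasses import dataclass


@dataclass
class StopLandmarks:
    """Hand-annotated time points for one plosive token, in seconds (hypothetical structure)."""
    burst: float          # release burst of the plosive
    voicing_onset: float  # start of vocal fold vibration (pre-voicing or vowel onset)


def vot_ms(token: StopLandmarks) -> float:
    """Return VOT in milliseconds as voicing onset minus burst.

    The value is positive when voicing starts after the burst (non-voiced plosives)
    and negative when pre-voicing starts before the burst (voiced plosives).
    """
    return (token.voicing_onset - token.burst) * 1000.0
```

With made-up landmarks, a token whose burst is at 0.500 s and whose vowel starts at 0.513 s yields about +13 ms, while a token whose pre-voicing starts at 0.410 s before a burst at 0.500 s yields about -90 ms.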


Table 1 describes and compares the measurements of the VOT (ms) values obtained from the speech samples of the three monolingual children (CM1, CM2, and CM3) and the three bilingual children (CB1, CB2, and CB3). The measures for non-voiced plosives [p, t, k] for monolingual children had higher values than those of bilingual children, with the exception of the plosive /k/, for which CM3 and CB3 presented equivalent values.

In Figure 1, we present a comparison of the averages of the three monolingual vs the three bilingual children for the VOT (ms) parameter for the non-voiced plosive sounds [p, t, k]. The monolingual children showed higher values when compared to the bilingual group.
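As a check on the arithmetic, the group averages for the non-voiced plosives can be recomputed from the individual values in Table 1. The sketch below is not part of the original analysis; the dictionary layout and the function name group_means are assumptions, and the averages are rounded to the nearest millisecond.

```python
from statistics import mean

# VOT values (ms) copied from Table 1; each list is ordered as child 1, 2, 3 of the group.
TABLE_1 = {
    "p": {"monolingual": [14, 12, 13], "bilingual": [13, 9, 11]},
    "t": {"monolingual": [13, 19, 20], "bilingual": [11, 10, 17]},
    "k": {"monolingual": [31, 39, 39], "bilingual": [28, 25, 39]},
}


def group_means(table):
    """Return the mean VOT (ms) per phoneme and group, rounded to the nearest ms."""
    return {
        phoneme: {group: round(mean(values)) for group, values in groups.items()}
        for phoneme, groups in table.items()
    }


for phoneme, means in group_means(TABLE_1).items():
    print(f"/{phoneme}/  monolingual = {means['monolingual']} ms  bilingual = {means['bilingual']} ms")
```

The rounded means are 13 and 11 ms for /p/, 17 and 13 ms for /t/, and 36 and 31 ms for /k/ (monolingual and bilingual, respectively).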

Table 2 describes and compares VOT (ms) values of the voiced plosive sounds for the three monolingual and three bilingual children. The monolingual children generally had lower absolute values (shorter pre-voicing) for the voiced plosive sounds /b/, /d/ and /g/ than the bilingual children.

Figure 2 shows a comparison between the VOT (ms) averages for the monolingual and bilingual children for the voiced plosive sounds [b, d, g]. Monolingual children showed lower absolute values ([b] = -84 ms, [d] = -90 ms, [g] = -75 ms) compared to the averages for the bilingual children ([b] = -92 ms, [d] = -95 ms, [g] = -94 ms).


Languages rely on sounds that result from a combination of many mechanisms involving the use of airflow and the chambers associated with it, such as the lungs, larynx, and the soft palate [1-3]. The larynx is not solely responsible for modulating airflow and producing sounds, which are chained and articulated in sequences characteristic of each language [3,4]. The production of speech sounds also depends on the plasticity of certain organs of the speech apparatus, which create numerous configurations in the vocal tract, including the vocal folds [1-4].

Voicing characteristics arise from synchronization between the adjustment of laryngeal activity and oral articulation [1-3,10] and can be detected by means of perceptual-auditory assessment or analysis of various acoustic cues [6,8,23,27]. These issues are essential for learning a language, as well as for differentiating between languages [25].

Studies indicate that some particularities, such as aspiration, breathiness, and voicing interruptions or absence, may be found in the study of speech production in monolingual and bilingual children of BP, English, Spanish and German [6,15,17,24,29].

We found differences in VOT values when comparing speech production data from monolingual and bilingual children, as shown in Figures 1 and 2 and as observed by researchers of children's speech [8,25] and of adults' speech [23,26,28].

Some authors have reported that voiced plosive sounds are produced with some mode of vocal fold vibration and that certain adjustments are often required to allow voicing. This voicing can be identified as the voice bar before the burst event. Non-voiced plosive sounds, on the other hand, have no mode of vocal fold vibration and are produced at short intervals and with no aspiration because, after the burst, there is only a slight air release.

In the VOT results for the three monolingual girls, we found VOT values (Table 1) that are compatible with those reported in the literature [2,3,13] and also in some studies by other researchers [23,25].

In English, the presence or absence of aspiration usually determines the voicing contrast and the VOT values of the plosives [p, t, k]. Sounds without aspiration present a shorter VOT, while sounds with aspiration present a longer VOT, ranging from 25 ms to 100 ms. The VOT is negative for the voiced sounds [b, d, g] but presents shorter values, with a slight variation around zero, because the release of the plosive occlusion and the beginning of voicing are almost simultaneous [2,3,11,13]. Considering these characteristics, the VOT values we obtained for our bilingual children were consistent with those previously reported by other researchers [23,24], as shown in Tables 1 and 2.
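The ranges mentioned in this paragraph can be turned into a rough labelling rule, sketched below in Python. This is an illustration only and was not a procedure used in this study: the function name vot_category is hypothetical, the 25 ms default is simply the lower bound of the aspirated range quoted above, and real category boundaries vary with language and place of articulation. The labels "voicing lead", "short lag", and "long lag" are the conventional terms for negative, short positive, and long positive VOT.

```python
def vot_category(vot_ms: float, aspiration_threshold_ms: float = 25.0) -> str:
    """Label a VOT value (ms) with a conventional category name.

    The threshold is an assumption taken from the 25 ms lower bound quoted in the
    text; it is not a boundary established by this study.
    """
    if vot_ms < 0:
        return "voicing lead"  # voicing starts before the burst (pre-voicing)
    if vot_ms < aspiration_threshold_ms:
        return "short lag"     # voicing starts shortly after the burst, no aspiration
    return "long lag"          # delayed voicing, compatible with aspiration
```

For example, a value of -90 ms would be labelled "voicing lead", 13 ms "short lag", and 60 ms "long lag".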

The data from the bilingual children (Figure 2) are in accordance with results from a study conducted with bilingual adults, who presented higher VOT values owing to the presence of aspiration [23]. For children's speech, other earlier findings were also similar; unlike the values for the monolingual children, this previous study uncovered the presence of a slight aspiration in the VOTs of bilingual children between the ages of 8 and 9 years [24].

The VOT values of the monolingual children in this research (Figure 1) are also in line with data from the children in a study that analyzed differences in VOT patterns of non-voiced sounds produced by monolingual (BP) and bilingual (BP/English) children [25]. The authors of that study, as in the present one, found differences between the averages obtained by monolingual and bilingual children in the production of plosive sounds in English, and suggested that this could indicate an influence of the participants' first language on the production of the aspirated plosive sounds in English [25].

When compared to findings in bilingual children, we found higher VOT values (Table 1) for non-voiced plosive sounds in the monolingual children group studied and lower VOT values (Table 2) for the voiced plosive sounds.

Based on the results obtained, even with a small number of subjects, it is clear that when children are exposed to a single language, voicing becomes more evident and yields a higher value. For the bilingual children, the value was lower, but the identity of the plosive sound was maintained.

Thus, based on the literature, it is worth noting that the distinct VOT values observed in the results (Tables 1 and 2 and Figures 1 and 2) are probably due to different laryngeal adjustments and also to subtle changes in the phonetic implementation of a phonological contrast in monolingual and bilingual individuals since childhood [3,10,22].

Despite the lack of statistical tests due to the small number of subjects, the acoustic analysis of VOT chosen for this study proved valid, since it showed differences in the values of voiced and non-voiced plosive sounds in children exposed to another language. Based on the literature, aspiration might have been responsible for the differences in the VOT values, mainly for the voiced plosive sounds.

Therefore, VOT may be a complementary tool in the speech-language clinic with monolingual and bilingual children, useful in evaluations of and therapies for language disorders. It can also help children exposed to more than one language to reduce mistakes when speaking and difficulties when writing, and thus assist in treatment planning.


Compared with the three bilingual children in this study, the three monolingual children produced higher voice onset time values for non-voiced plosive sounds and lower values for voiced plosive sounds.

doi: 10.1590/1982-021620182052118


[1.] Ladefoged P, Maddieson I. Stops. In: Blackwell publishers Inc. The sounds of the world's languages. Massachusetts (USA); 1996. p. 47-101.

[2.] Kent RD, Read C. As caracteristicas acusticas das consoantes. Correlatos acusticos das caracteristicas do falante. In: Ed Cortez Editora. Analise Acustica da Fala. Traducao: Meireles AR. 1. Sao Paulo; 2015. p. 229-390.

[3.] Barbosa PA, Madureira S. Elementos de producao da fala. Oclusivas e africadas In: Cortez Editora. Manual de fonetica acustica experimental. Aplicacoes a dados do portugues. Sao Paulo; 2015. p. 314-80.

[4.] Sundberg J. O que e voz? O sistema fonador. Respiracao. A fonte glotica. A diversidade na voz. In: Editora da Universidade de Sao Paulo. Ciencia da Voz Fatos sobre a voz na fala e no canto. Traducao: Salomao GL. Sao Paulo; 2015. p. 249-83.

[5.] Gregio FN, Queiroz RM, Sacco ABF, Camargo Z. O uso da eletroglotografia na investigacao do vozeamento em adultos sem queixa de fala. Rev Intercambio. 2011;23:88-105.

[6.] Lofredo-Bonatto MTR. Vozes infantis: a caracterizacao do contraste de vozeamento das consoantes plosivas no portugues brasileiro na fala de criancas de 3 a 12 anos. In: XVI Congresso Brasileiro de Fonoaudiologia Campos do Jordao, Sao Paulo; 2008.

[7.] Melo RM, Mota HB, Mezzomo CL, Brasil BC, Lovatto L, Arzeno L. Desvio fonologico e a dificuldade com a distincao do traco [voz] dos fonemas plosivos: dados de producao e percepcao do contraste de sonoridade. Rev. CEFAC. 2012;14(1):18-29.

[8.] Cristofolini C. Gradiencia na fala infantil: caracterizacao acustica de segmentos plosivos e fricativos e evidencias de um periodo de "refinamento articulatorio" [tese]. Florianopolis (SC): Universidade Federal de Santa Catarina Programa de Pos-Graduacao em Linguistica; 2013.

[9.] McCarthy KM, Macon M, Rosen S, Evans BG. Speech perception and production by sequential bilingual children: a longitudinal study of voice onset time acquisition. Child Development. 2014;85(5):1965-80.

[10.] Kenstowicz M. Phonology in generative grammar. Oxford: Blackwell; 1994. p. 195-249.

[11.] Shimizu K. A cross language study of voicing contrasts of stops consonants in asian languages. Seibido Publishing Co. Ltd. Japan, 1996.

[12.] Kang Y. Voice onset time merger and development of tonal contrast in Seoul Korean stops: a corpus study. Phonetics. 2014;45:76-90.

[13.] Lisker L, Abramson A. A cross language study of voicing in initial stop: acoustical measurements. Word j. Linguistic Circle. 1964;20(3):384-422.

[14.] Balukas C, Koops C. Spanish-English bilingual voice onset time in spontaneous code switching. International Journal of Bilingualism. 2015;19(4):423-43.

[15.] Lofredo-Bonatto MTR. A producao de plosivas por criancas de 3 anos falantes do portugues brasileiro. Rev. CEFAC. 2007;9(2):199-206.

[16.] Camargo ZA, Madureira S. Dimensoes perceptivas das alteracoes de qualidade vocal e suas correlacoes aos planos da acustica e da fisiologia. DELTA. 2009;25(2):285-317.

[17.] Lofredo-Bonatto MTR, Madureira S. Estudo sobre a percepcao e a producao do contraste de vozeamento da fala de criancas de 3 anos. Rev. CEFAC. 2009;11(1):67-77.

[18.] Fabiano-Smith L, Bunta F. Voice onset time of voiceless bilabial and velar stops in 3-year-old bilingual children and their age-matched monolingual peers. Clinical Linguistics and Phonetics. 2011;26(2):148-63.

[19.] Melo RM, Mota HB, Mezzomo CL, Brasil BC, Lovatto L, Arzeno L. Acoustic characterization of the voicing of stop phones in brazilian portuguese. Rev. CEFAC. 2014;16(2):487-99.

[20.] Stolten K, Abrahamsson N, Hyltenstam K. Effects of age of learning on voice onset time: categorical perception of swedish stops by near-native L2 speakers. Language and Speech. 2014; 57(4):425-50.

[21.] Piccinini P, Arvaniti A. Voice onset time in Spanish-English spontaneous code-switching. J Phonetics. 2015;52:121-37.

[22.] Camargo Z, Madureira S. Analise Acustica: aplicacoes na Fonoaudiologia In: Fernandes FDM, Mendes BCA, Naves ALPGP (org.). Tratado de Fonoaudiologia, 2a edicao, Sao Paulo, ROCA. 2015. p.695-9.

[23.] Rocca PA. O desempenho de falantes bilingue: evidencias advindas da investigacao do VOT de oclusivas surdas do ingles e do portugues. DELTA. 2003;19(2):303-28.

[24.] Zimmer MC, Bandeira MHT. A dinamica do multilinguismo na transferencia de padroes de aspiracao de obstruintes iniciais entre o pomerano (L1), o portugues (L2) e o ingles (L3). In: X Congresso Nacional de Fonetica e Fonologia Niteroi; 2008.

[25.] Bandeira MHT. Diferencas entre criancas monolingues e multilingues no desempenho de tarefas de funcoes executivas e na transferencia de padroes de VOT (Voice Onset Time) entre as plosivas surdas do pomerano, do portugues e do ingles [Dissertacao]. Pelotas (RS): Universidade Catolica de Pelotas; 2010.

[26.] Bandeira MH, Zimmer MA. Transferencia dos padroes de VOT de plosivas surdas no multilinguismo. Letras de Hoje. 2011;46(2):87-95.

[27.] Gregio FN. Analise fonetico-acustica do contraste fonico de vozeamento em criancas [tese]. Sao Paulo (SP): Pontificia Universidade Catolica de Sao Paulo; 2013.

[28.] Schaeffer SCB, Meireles AR. Padroes de vozeamento de consoantes plosivas em falantes de pomerano (L1) e de portugues (L2) Anais do VII Congresso Internacional da ABRALIM Curitiba; 2011.

[29.] Lamy DS. A variationist account of voice onset time (VOT) among bilingual West Indians in Panama. Studies in Hispanic and Lusophone Linguistics. 2016;9(1):113-41.

[30.] Dmitrieva O, Llanos F, Shultz AA, Francis AL. Phonological status, not voice onset time, determines the acoustic realization of onset f0 as a secondary voicing cue in Spanish and English. J Phonetics. 2015;49:77-95.

Maria Teresa R. Lofredo-Bonatto (1)

Marta A. Andrada e Silva (2)

(1) Pontificia Universidade Catolica de Sao Paulo--PUC-SP Sao Paulo, Sao Paulo, Brasil.

(2) Faculdade de Ciencias Medicas da Santa Casa de Sao Paulo, Sao Paulo, Sao Paulo, Brasil.

Conflict of interests: Nonexistent

Received on: March 5, 2018

Approved on: August 31, 2018

Corresponding address: Maria Teresa Rosangela Lofredo-Bonatto Avenida Paulista 509, 4 andar cjto 410 Cerqueira Cesar

CEP: 01311-910--Sao Paulo, Sao Paulo, Brasil

Table 1. Voice onset time (ms) values for the non-voiced plosive phonemes
/p/, /t/, and /k/ for the three monolingual children (CM1, CM2,
CM3) and the three bilingual children (CB1, CB2, CB3)

Phonemes   CM1   CM2   CM3   CB1   CB2   CB3

/p/        14    12    13    13     9    11
/t/        13    19    20    11    10    17
/k/        31    39    39    28    25    39

Table 2. Voice onset time (ms) values for the voiced plosive phonemes /b/,
/d/, and /g/ for the three monolingual children (CM1, CM2, CM3) and
the three bilingual children (CB1, CB2, CB3)

Phonemes   CM1    CM2   CM3   CB1    CB2    CB3

/b/        -91    -89   -71   -92    -113   -92
/d/        -101   -86   -84   -111   -83    -93
/g/        -94    -70   -62   -120   -77    -84

Figure 1. Comparison between averages obtained for monolingual and
bilingual children for the voice onset time (ms) parameter of the
non-voiced plosive sounds /p/, /t/, and /k/

Average (ms)   Monolingual   Bilingual

/p/            13            11
/t/            17            13
/k/            36            31

Figure 2. Comparison between averages obtained for monolingual and
bilingual children for the voice onset time (ms) parameter of the
voiced plosive sounds /b/, /d/, and /g/

Average (ms)   Monolingual   Bilingual

/b/            -84           -92
/d/            -90           -95
/g/            -75           -94
COPYRIGHT 2018 CEFAC - Associacao Institucional em Saude e Educacao

Article Details
Title Annotation: Case reports; text in English
Author:Lofredo-Bonatto, Maria Teresa R.; Silva, Marta A. Andrada
Publication:Revista CEFAC: Atualizacao Cientifica em Fonoaudiologia e Educacao
Article Type: Case study
Date:Sep 1, 2018