Linguistics and lyric diction - a personal retrospective and a selective glossary.

In the fourteen years of managing the "Language and Diction" column of the Journal of Singing, I have kept two overriding aims in the forefront of consideration. The first is to supplement existing literature for performers and pedagogues on lyric diction, particularly for those languages less well represented, or not represented at all. The second is to expand the vocabulary of lyric diction to include terminology and concepts well entrenched in the field of linguistics, but not normally encountered in the musical literature, that have practical value for performers. (1)

The former aim has resulted in more than fifteen articles specific to languages other than English, French, German, Italian, and Latin. No one person can possibly fulfil this mission. For this, gratitude must be extended to all those guest contributors who have the expertise to write with authority for musicians in these essential languages. There is still much work to be done in this area, and several other languages (European and non-European) with significant vocal repertory remain largely unrepresented in the musical literature. (2)

The latter aim has involved the mainstreaming of terminology and nomenclature that formerly remained outside lyric diction reporting--terms such as phoneme/allophone, and the proper use of square brackets [ ], slashes / /, and angled brackets <>. The glossary presented in this article is designed to put many of these terms and symbols conveniently in one place, for ease of reference if an unfamiliar item appears in the course of reading articles in the Journal and elsewhere. Although some terms will be described in detail elsewhere in Journal of Singing articles, this article should provide a convenient reference point. (3)

Several fine dictionaries of linguistic terminology exist, and are listed at the end of the article. The entries in this article are highly selective by comparison, and are limited to terms that may be less familiar to the musician and that require explanation as to their relevance to musical performance. Terms already in general use, such as liaison and vocalic harmonization, will be assumed to be already familiar.



Affix

A morpheme attached to a word, providing additional meaning. These include the prefix ("un-"), suffix ("-ment," "-ly"), circumfix (Ger. "ge...t," as in "gegrüßt"), and less commonly, superfix and infix.


Affricate

An articulation that involves a stop consonant released with a fricative. Considerable controversy has long existed over whether affricates are single or double articulations. They are generally considered single, although they do not appear on the official IPA chart of consonants. Affricates can be found in both onset and coda position (Ger. /ts/ zusammen, Herz) and are often given phonemic status. /pf/, for instance, is phonemic in German (Pferd), but can also occur across syllable boundaries [p.f] (Hupfeld). The same is true of [t.s] (itself). Affricates are not to be confused with coarticulation, which involves constriction of the vocal tract simultaneously at two (or more) distinct points.


Aggiustamento

This is really a voice pedagogy term, rather than a linguistic one. Aggiustamento is a time-honored deliberate adjustment of the normal quality of an articulation in speech--usually a vowel--for the sake of vocal production. It often involves a greater openness of a close vowel in very high registers, for the sake of beauty of tone and technical well-being. Aggiustamento of the vocal tract does not necessarily result in acoustic difference, and tests have shown that a lack of adjustment can indeed alter the perception of a vowel color across the singer's range. Thus, this technique can be employed to maintain perceptual consistency across range, as well as facilitate production of the sound itself.


Allophone

One of the possible phonetically distinct articulations of a phoneme within the sound pattern of a language. Allophones can be thought of as "fine tunings" of phonemes. Allophones are language-specific. Two allophones of a phoneme in one language may be separate phonemes in another. In Spanish, [d] and [ð] are allophones of /d/, but both /d/ and /ð/ exist as separate phonemes in English. Further discussion is found under phoneme.

Angled bracket

The standard symbol for orthographic text, whether letter, syllable, word, or phrase. One might say for instance that English <-ough> can be realized as [ʌf], [au], [ou], [u[??]], and [[??]f]. Or that French [ɛ̃] is employed for <-aim>, <-ain>, <-eim>, <-ein>, <-im>, <-in>, <-ym>, <-yn>, <-ien>, <-oin>, <-uin>, and <-ient>.


Approximant

An articulation involving enough constriction of the vocal tract for it not to be categorized as a vowel, yet not enough to create friction. In English, [ɹ], [j], and [w] function as consonants, and are thus approximants. There is some controversy still as to the extent of articulations to be considered in this category, such as [l].


Assimilation

IPA transcription gives the visual impression that speech is an orderly succession of discrete articulations, which may be combined in a variety of ways to create words and sentences. The rapid enunciation of speech, however, demands that expeditious short-cuts be taken, in order to facilitate quick movement between articulations. The Italian velar [ŋ] of sangue is an allophone of /n/, altered from dental [n] because of the ensuing velar plosive. Similarly, /n/ becomes [m] in San Pietro because of the bilabial [p]. As a general rule, the more physiologically distant adjacent articulations are, the more likely some form of assimilation will occur. One of the functions of narrow (phonetic) transcription is to indicate assimilation patterns, usually with the help of diacritics.
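The nasal assimilation described above can be sketched as a small lookup. This is only an illustration: the tables `PLACE` and `NASAL_AT` and the function `assimilate_n` are hypothetical names, the consonant inventory is deliberately tiny, and words are assumed to be given as lists of one-symbol segments.

```python
# Place of articulation for a handful of consonants (illustrative, not complete).
PLACE = {
    "p": "bilabial", "b": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "t": "dental", "d": "dental",
    "k": "velar", "g": "velar",
}

# Allophone of /n/ assumed for each place of the following consonant.
NASAL_AT = {"bilabial": "m", "labiodental": "ɱ", "dental": "n", "velar": "ŋ"}


def assimilate_n(segments):
    """Return a copy of `segments` in which each /n/ takes the place
    of articulation of the consonant that immediately follows it."""
    out = list(segments)
    for i, seg in enumerate(segments[:-1]):
        if seg == "n":
            place = PLACE.get(segments[i + 1])
            if place is not None:  # unknown or vocalic context: leave [n] as is
                out[i] = NASAL_AT[place]
    return out


sangue = assimilate_n(["s", "a", "n", "g", "w", "e"])
san_pietro = assimilate_n(["s", "a", "n", "p", "j", "e", "t", "r", "o"])
```

Here `sangue` comes out with a velar nasal and `san_pietro` with a bilabial one, while an /n/ before a vowel is left untouched.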

Broad transcription

Alternate name for phonemic transcription.

Central / Centralized

Articulated with the tongue arch toward the center of the oral cavity, in reference to vowels. Centralized implies a shift in position toward the center, from some more peripheral position. In most languages, the vowels employing cardinal vowel symbols ([i], [e], [u], [[??]], [a], etc.) are usually centralized to a degree from the "extreme" cardinal positions. In English, [ʌ] is strongly centralized compared to cardinal vowel [ʌ], for instance. A vowel can also be decentralized, as in the tendency for French [ə] to marginalize toward [œ].

Citation form

A form of a lexical item considered "standard," for use in discourse or in dictionary entry. The infinitive of a verb is usually the citation form of verbs for lexicographic purposes. In phonology, the citation form of a word (as in a dictionary of pronunciation) is the most commonly encountered pronunciation as employed in the standard dialect, or the unreduced ("strong") pronunciation of the word. The citation form for "either" is [TEXT NOT REPRODUCIBLE IN ASCII] in British English, with [TEXT NOT REPRODUCIBLE IN ASCII] as an alternate pronunciation, while in American English it is the reverse, [TEXT NOT REPRODUCIBLE IN ASCII] / [TEXT NOT REPRODUCIBLE IN ASCII]. Monosyllabic prepositions and articles ("that," "the") often have an unreduced (strong) and a reduced (weak) form, in which case the unreduced is the citation form. The choice of symbol to represent a phoneme in a language operates on a similar principle. In English, /t/ is chosen, rather arbitrarily, from the allophone [t] (as in "stop"), as the "citation form" of the phoneme, with other allophones indicated by diacritics.


Clitic

A word--usually a short monosyllable--which is unstressed and phonologically bound to a word adjacent to it. Thus, it behaves like an affix, even though it is an independent word, such as "the" and "an." Some consider English contractions, such as the "-n't" in "wouldn't," to be clitics, based on the citation form being an independent word.

Close (High) / Open (Low)--vowels Closed / Open--syllables

These two pairs are often confused with one another. Vowels are close or open, or somewhere in between. Syllables are closed (consonantal coda) or open (vocalic coda). High vowels are close, not closed. "Closed vowel" implies that an action has been performed on it.


Coda

The final constituent of a syllable, after the nucleus, typically the consonant(s) following a syllabic vowel or diphthong. The nucleus together with the coda form the rhyme.

Complementary distribution

In phonemic theory, allophones are said to be in complementary distribution when their environments do not overlap. For instance, clear <l> [l] and dark <l> [ɫ] are never interchangeable in English. [l] always occurs before a vowel, but [ɫ] never does. These allophones of English /l/ are thus in complementary distribution. A good test of this is to try substituting one for the other for any given /l/ in any word. The result will sound odd, especially when replacing [l] with [ɫ], which is reminiscent of a Slavic accent in English. Opposite: free variation.

Two sounds must also exhibit phonetic similarity in order to be considered allophones of the same phoneme. Thus, English [h] and [ŋ], which never occur in the same environment, are nevertheless separate phonemes, even though no minimal pairs can be found, because they lack phonetic similarity, qv.
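The distribution test can be illustrated with a toy check on a hand-made corpus. Everything here is an assumption made for the sketch: the corpus is a few invented transcriptions, the environment is reduced to what follows the phone (vowel, consonant, or word end), and the names `environments` and `in_complementary_distribution` are hypothetical. A real analysis would use far richer environments and would also have to address phonetic similarity, which this sketch ignores.

```python
VOWELS = set("aeiouɪɛæʌə")

# Toy corpus of segment lists: leaf, slip, feel, milk, fat, vat (illustrative).
CORPUS = [
    ["l", "i", "f"],
    ["s", "l", "ɪ", "p"],
    ["f", "i", "ɫ"],
    ["m", "ɪ", "ɫ", "k"],
    ["f", "æ", "t"],
    ["v", "æ", "t"],
]


def following_class(word, i):
    """Classify what follows position i: 'V' vowel, 'C' consonant, '#' word end."""
    if i + 1 == len(word):
        return "#"
    return "V" if word[i + 1] in VOWELS else "C"


def environments(phone, corpus):
    """Collect every following-context class in which `phone` is attested."""
    return {
        following_class(word, i)
        for word in corpus
        for i, seg in enumerate(word)
        if seg == phone
    }


def in_complementary_distribution(a, b, corpus):
    """True if both phones are attested and their environments never overlap."""
    env_a, env_b = environments(a, corpus), environments(b, corpus)
    return bool(env_a) and bool(env_b) and env_a.isdisjoint(env_b)


l_pair = in_complementary_distribution("l", "ɫ", CORPUS)
fv_pair = in_complementary_distribution("f", "v", CORPUS)
```

In this corpus the clear and dark <l> never share an environment, whereas [f] and [v] both occur before a vowel and so are not in complementary distribution.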


Continuant

Loosely, a consonant that can be sustained, that is, in which there is no stoppage of the airstream. This feature has been problematic, and linguists do not always agree on which groups to include. Some include fricatives, liquids, nasals, and vowels; others include any consonant except plosives and affricates; others include any consonant except plosives, affricates, nasals, and semivowels.

Devoicing / Devoiced / Voicing

Unlike the terms unvoiced and voiceless, which are synonymous with one another, devoicing usually refers to the loss of a segment's normal voicing. In German, voiced plosives are typically devoiced in coda position--[g] → [k], [d] → [t], [b] → [p]. In English, this can occur in similar position inadvertently, especially at the ends of phrases. Some speakers will make use of devoicing as an unconscious speech habit in more formal situations.

In English, the allophone [tˡ] of <bottle>, which rhymes with <model> [dˡ], is sometimes transcribed as [t̬ˡ], the subscript wedge indicating voicing. [t̬ˡ] is a more intuitive symbol for an allophone of /t/ than the phonetically correct [dˡ] in such instances. It indicates that the allophone of /t/ is both laterally released and voiced. (4) Devoicing is indicated with an under-ring [d̥] or over-ring [g̊], as in <Jagd>, [jakt], or [TEXT NOT REPRODUCIBLE IN ASCII]. Again, [kt] is the usual transcription, but [TEXT NOT REPRODUCIBLE IN ASCII] reflects the phonemic derivation more precisely, [g̊] and [d̥] being devoiced allophones of /g/ and /d/.
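German coda devoicing lends itself to a short rule-based sketch. It assumes a syllable is given as a list of segments, treats everything after the last vowel as the coda, and uses the plain voiceless symbols rather than ring diacritics. The names `DEVOICE` and `devoice_coda` are hypothetical, and the vowel set is a small illustrative sample.

```python
# Voiced plosive -> voiceless counterpart.
DEVOICE = {"b": "p", "d": "t", "g": "k"}

# First characters that mark a segment as a vowel (illustrative subset).
VOWELS = set("aeiouyøœɛɪɔʊə")


def devoice_coda(syllable):
    """Devoice every voiced plosive that follows the last vowel of the syllable."""
    last_vowel = max(
        (i for i, seg in enumerate(syllable) if seg[0] in VOWELS),
        default=-1,  # no vowel found: the whole syllable is treated as coda
    )
    return [
        DEVOICE.get(seg, seg) if i > last_vowel else seg
        for i, seg in enumerate(syllable)
    ]


jagd = devoice_coda(["j", "a", "g", "d"])
du = devoice_coda(["d", "u"])
```

With this rule <Jagd> surfaces with [kt], while the onset [d] of "du" keeps its voicing because it precedes the vowel.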

Epenthesis / Epenthetic

The insertion of a segment into a word that formerly did not contain it. Such changes usually occur as a result of assimilation, enabling a more convenient transition from one articulation to the next. The [p] in "empty" is epenthetic, as is the [d] in "thunder," when viewed in terms of the etymology of the words. An epenthetic vowel is sometimes inserted by some speakers, to clarify otherwise adjacent consonants, as in "film" with an inserted schwa, [fil.əm]. The loss of a vowel due to the truncation of identical or similar consonants, as in [ˈlaɪ.bɹɪ] for "library," is a form of syncope (or syncopation) known as haplology. Strictly, epenthesis is a form of intrusion, occurring medially.

Fortis / Lenis

A fortis articulation involves more energetic muscular tension to execute than a lenis one. With vowels, those on the periphery of the vowel quadrilateral are fortis, and those more central are lenis. Voiceless consonants are fortis; voiced are lenis. Lenition is thus the process of becoming less strong or more sonorous, as [k] → [g]. The devoicing of final voiced plosives in German is a process of fortition.

Free variation

This occurs when two articulations can be freely exchanged without any change in meaning. For example, word-final /p/, as in <stop>, may be unaspirated [p], aspirated [pʰ], or unreleased [p̚] without any effect on meaning. Such phones will have allophonic status in a language, unless minimal pairs can be found.

Intrusion / Intrusive

Intrusion is the insertion of an articulation into a word without etymological justification. The intrusive "r" of British English is a well-known example. Intrusion is divided into prothesis (word-initial, as in the <e> of "establish," from Latin "stabilire"), epenthesis (qv.--medial), and paragoge (final, as in the <st> of "amongst"). The opposite of paragoge is common in Italian, and known as apocope or apocopation ("cor," "esser," "or").


Juncture

A grammatical boundary, or frontier. In phonetic/phonological terms, this usually means syllable and lexical (word) boundaries, but could also refer to segmental boundaries (from one articulation to the next). The correct temporal handling of juncture can be crucial to meaning, as in <white shoes> vs. <why choose>. In vocal music, intelligibility involves sensitivity to the precise placement of juncture articulations on the rhythmic/melodic line, as with the gemination of consonants in Italian, French, and German.


Marked

This term usually refers to a less prevalent or unexpected characteristic of an articulation. Features found in only a few languages, rather than more universally, are considered marked. An unrounded back vowel is more marked than the more predictable rounded counterpart. Within the major languages of sung diction, interdental fricatives are marked articulations, found only in English and Spanish. [ɱ] could be considered a more marked allophone of Italian /n/ than the others, in that it occurs only before [f]. A marked feature will thus be less common lexically as well as cross-linguistically, will be less diachronically stable, and implies the existence of a more common unmarked counterpart.

Minimal pair

The traditional gold standard by which articulations are granted (or denied) phonemic status in a language. If two different words in the lexis of a language differ by only a single segment, the articulations that differentiate them are phonemically contrastive, as in English <fat> and <vat>, German <bitte> and <biete>, Italian <caro> and <carro>, etc. Sometimes known as the commutation test.
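As a rough sketch of the test, two transcriptions can be compared segment by segment. The function name `is_minimal_pair` is hypothetical, the transcriptions are simplified for illustration, and a geminate or long sound is assumed to count as a single segment (so Italian <rr> is written here as one long "rː").

```python
def is_minimal_pair(a, b):
    """True if two segment lists have equal length and differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


english = is_minimal_pair(["f", "æ", "t"], ["v", "æ", "t"])            # fat / vat
german = is_minimal_pair(["b", "ɪ", "t", "ə"], ["b", "iː", "t", "ə"])  # bitte / biete
italian = is_minimal_pair(["k", "a", "r", "o"], ["k", "a", "rː", "o"])  # caro / carro
not_pair = is_minimal_pair(["f", "æ", "t"], ["f", "æ", "s", "t"])      # fat / fast
```

The first three comparisons differ in one segment only; the last fails because the words differ in length.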

Mixed vowel

A vowel that combines one of its features with that of a different vowel. Vowels that combine the lip rounding of a back vowel with the tongue placement of a front vowel are the ones that are most common in lyric diction ([y], [ʏ], [œ], etc.). By this definition, nasal vowels are a form of mixed vowel. A term encountered more in lyric diction literature than in general linguistics.

Morpheme / Morphology

A morpheme is the smallest grammatical unit capable of having its own meaning, or altering the meaning of a word to which it is affixed. Morphemes cannot be further split, except into phonetic segments. Morpheme is a good alternative to "root stem" in discussions of German vowel quality and syllabification. Affixes (prefix, suffix) are bound morphemes that can only be attached to a word or stem. Morphology is thus the branch of grammar that deals with the structure of words.

Narrow transcription

Alternate name for a phonetic transcription.


Nucleus

The central and most sonorous portion of a syllable, between onset and coda, usually a vowel or diphthong, but occasionally also a syllabic consonant ([[??]], [[??]], [[??]], [[??]]).


Obstruent

As the word implies, any segment whose production involves an obstruction that creates friction or a complete stop. In other words, all fricatives and plosives, together as a group. Opposite: sonorant.


Onset

The first constituent of a syllable, typically the consonant(s) preceding a syllabic vowel.


Open

Open has two primary meanings, one related to syllables, the other to vowels. These are easily confused. An open syllable ends with a vowel--the opposite, a closed syllable, has a consonantal coda. An open vowel is any vowel below the horizontal midpoint of the vowel quadrilateral, and involving a lower jaw placement and more space in the oral cavity. The open vowels [a] and [ɑ] are sometimes called low, while [ɛ], [œ], [ɔ], and [ʌ] are mid-low. The IPA terminology, from bottom to top, is open, open-mid, close-mid, and close. Probably to avoid confusion, many linguists employ low and high for vowels, reserving open and closed for syllables. Note that non-open syllables are closed, but non-low vowels are close.


Phone

Not the one invented by A. G. Bell. A phone is a single phonetic segment (articulation), independent of any status it might have in a given language. Field linguists studying an unfamiliar language, for instance, will first compile an inventory of all phones known to occur, then examine each to determine its status (i.e., phonemic or allophonic) within that language.


Phoneme

There is no more problematic term than this one to convey precisely. It is easier to describe what it is not: a phoneme is not an articulation, nor an alphabetic letter, but rather an idea. I have found it helpful to use a Platonic analogy: a "tree" is an idea, of which there are innumerable individual specimens, all unique. But we intuitively recognize that they are all trees, and that our minds understand a "tree" to be a mental construct derived from the sum of all specimens that we collectively agree have something in common. (5) In other words, the written word <tree> symbolizes a received idea, "tree," of which the world contains innumerable examples or specimens.

In an identical fashion, we all comprehend the idea of a sound component of a language, perhaps without knowing that there is a name for it: phoneme. Everyone agrees, for instance, that German contains /r/s. All instances of /r/ in German are indicated in the spelling of words by <r>. But the actual pronunciation varies widely according to environment. There are consonantal /r/s, as in [r], [[??]], [[??]], and [R], and vocalic /r/s, as in [ɐ] ("Mutter"). These specific articulative realizations are known as allophones of the phoneme /r/. Thus, a phoneme may be considered to be the sum of all possible articulations within a language's sound pattern that are collectively recognized as being "the same sound."

This notion of "the same sound" lies at the heart of the confusion for many students of lyric diction. The various allophones are not "the same sound," but articulations distinct from one another, whether appreciated as such or not. Native speakers of a language are often quite unaware that they use differing articulations for what they naturally assume are "the same sound," until it is pointed out. The /t/ of English is a commonly cited example:

/t/  →  unaspirated [t]          <stop>
        aspirated [tʰ]           <top>
        unreleased [t̚]           <pot>
        voiced [[??]]            <bitter>
        nasally released [tⁿ]    <button>
        laterally released [tˡ]  <bottle>

These six distinct articulations are subsumed under one "idea," the English phoneme /t/. As native speakers of a language, we all subscribe to the fuzzy intuition that several distinct articulations can all belong to "the same sound." The internal multiplicity is due to assimilation, or the influence that adjacent articulations have upon one another, without our realizing it. Note that the reverse also occurs. Identical articulations in different environments are sometimes understood to be "different sounds," as in "bitter" / "bidder."

One might imagine a proto-language in which all the allophones were pronounced identically, or at least more similarly than they are today. When such a language takes on written form, the choice of spelling is based upon the pronunciation then in currency. Orthography outlives pronunciation, and is far more resistant to change, thanks to the printed word. Thus, in German, "Rast," "brechen," "zerrissen," and "Vater" all employ the letter <r>, but will be realized with different articulations.

The most common misunderstanding about phonemes in my experience with students is that they easily confuse the "idea" of phoneme with the orthographic letter representing it. This conflation is very understandable. In the /t/ chart above, <t> (i.e., the written letter) could be substituted for the phoneme /t/, and the statement would be equally true. But they are different statements, and the distinction is recondite. One states that "the phoneme /t/ can be realized as...," and the other states that "the letter <t> can be realized as..." The difference is crucial, since it is by no means inevitable that an articulation will always be represented by the same letter. One can state that, in English,

[ʃ]  →  <ti>   nation
        <sh>   fish
        <sch>  fuschia
        <ss>   mission, etc.

or that

<sch>  →  [sk]   school
          [ʃ]    fuschia
<ti>   →  [ʃ]    nation
          [ti]   tiara
          [taɪ]  time
          [tɪ]   until

But these are simply two ways of comparing articulation to orthography. They are opposites of one another, and very useful charts for illustrating the degree to which a language operates phonetically (1-to-1 correspondence between sound and letter) or otherwise. The relationship is sometimes uneasy and intricate, perhaps arduous to commit to memory for a foreign learner, but not intrinsically difficult to comprehend. Note, however, that the phoneme plays no part in these charts. A phoneme is a third dimension, independent of orthography and articulation, with no concrete reality. This is the stumbling block for many students.
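The two charts are inverses of one another, and that relationship can be shown with a small sketch. The dictionary below holds only the spelling-to-sound examples given above; the names `SPELLING_TO_SOUNDS`, `invert`, and `is_one_to_one` are hypothetical and chosen for illustration. Note that no phoneme appears anywhere in the data, only letters and articulations.

```python
from collections import defaultdict

# Spelling -> possible articulations, limited to the examples in the charts above.
SPELLING_TO_SOUNDS = {
    "sch": ["sk", "ʃ"],
    "ti": ["ʃ", "ti", "taɪ", "tɪ"],
    "sh": ["ʃ"],
    "ss": ["ʃ"],
}


def invert(chart):
    """Turn a key -> [values] chart into the opposite value -> [keys] chart."""
    inverse = defaultdict(list)
    for key, values in chart.items():
        for value in values:
            inverse[value].append(key)
    return dict(inverse)


def is_one_to_one(chart):
    """True only if every key has one value and every value has one key."""
    one_value_each = all(len(v) == 1 for v in chart.values())
    one_key_each = all(len(k) == 1 for k in invert(chart).values())
    return one_value_each and one_key_each


SOUND_TO_SPELLINGS = invert(SPELLING_TO_SOUNDS)
english_is_phonetic = is_one_to_one(SPELLING_TO_SOUNDS)
```

Inverting the chart collects four spellings under [ʃ], and the one-to-one check fails, which is one way of expressing how far this corner of English departs from a strictly phonetic spelling.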

On occasion, an articulation is not represented in orthography, as in the British intrusive [r], <law and order> [TEXT NOT REPRODUCIBLE IN ASCII]. In such a case, the phoneme exists, realized by its intervocalic allophone (a voiced alveolar approximant), but not represented in the orthography of the phrase.

Phoneme     Allophone  Orthography
/r/      →  [ɹ]        <-->

Phonetic similarity

Strictly, a condition to be satisfied before establishing two phones in complementary distribution as allophones of the same phoneme. In English, [pʰ], [p], and [p̚] are allophones of /p/, not only because they are in complementary distribution, but also because they exhibit phonetic similarity. English [ŋ] and [h] are also in complementary distribution, but are phonetically not similar, acoustically or intuitively. Therefore, no phoneme encompassing these two sounds exists. A rigorous definition of "similarity" is elusive, however, involving (for consonants) a blend of manner of articulation, point of articulation, and other features.

Phonetics / Phonology

Phonetics is the study of human utterance, independent of language. Phonetics can include all possible articulations, whether known components of languages or not, such as the experimental babbling of infants, before they begin to appreciate which sounds get results, and which do not. Traditionally, phonetics has been subdivided into articulatory phonetics, dealing with the physiological bases of speech; acoustic phonetics, investigating the physical characteristics of sound waves and their propagation (employing the sound spectrograph); and auditory phonetics, concerned with the mechanisms of ear and brain that process speech sounds.

Phonology adapts the principles of phonetics to actual languages. Thus, one speaks of the phonology of Spanish, Swahili, Tagalog, etc. No two languages are alike in their phonologies, and the literature of lyric diction is directed toward the description of the sound patterns and processes of each language in the first instance, and then toward contrastive studies of two or more languages. The course of phonological investigation, up to 1960, was centered on the phoneme, and upon matters of distribution and contrast. Since that time, the principles of various forms of generative phonology have been developed, which concentrate on processes, underlying forms and rules, and regroup the articulations themselves in new patterns based upon distinctive features. Almost all the investigations of generative phonology lie outside the domain of lyric diction, which remains stoutly grounded in earlier approaches. The jargon of generative phonology soon gets complex and unwieldy for the practical purposes of lyric diction, largely because its aims are analytic, theoretical, and descriptive.

Reduced / Reduction

Phonologically, a reduced form is a "weak" form of pronunciation of a word, often affecting monosyllables, and stemming from assimilation processes in rapid speech. Almost all unstressed vowels in English are reduced, or centralized to [ə] or intermediate positions such as [ɪ] and [ʌ]. Many monosyllables ("of," "for," "a") have strong and weak forms, depending on the grammatical context, part of speech, and speed of delivery. The "e-muet" pervades French speech in all but the most formal manner of delivery: "est-c(e) que je n(e) sais pas?" (6)


Retroflex

Employed in reference to an articulation in which the tongue tip is curled up and back to a degree. The IPA recognizes several retroflex articulations, with varying degrees of retroflexion, some of which involve the tip or base of the tongue coming in contact with the palate or the back of the alveolar ridge. It is most commonly encountered in lyric diction in reference to the English [[??]]. The [t], [[??]], [[??]], and [[??]] of Hindi, and the [[??]], [[??]], [[??]], and [[integral]] of other languages are also retroflex articulations.

Rhotic / Rhoticity / Rhotacized (7)

A rhotic or rhotacized vowel is "r-colored," as in SAE <fur> [fɝ], <earth> [ɝθ], <mother> [TEXT NOT REPRODUCIBLE IN ASCII]. (8) Mandarin Chinese also has rhotic vowels. The right hook [˞] is the accepted symbol for rhoticity. While retroflex [ɝ] and [ɚ] are rhotic vowels, most rhotic articulations are not retroflex. Lingual [r], flap [ɾ], and uvular [R] and [ʁ] are rhotic (i.e., "r-colored"), but not retroflex.

Segmental / Suprasegmental

A segment is an articulative unit, either phonetic or phonological, during which a relative stasis in articulators can be perceived. In simple terms, an articulation represented by a single IPA symbol. The segment is in reality a convenient fiction, in that the speech organs are in a constant state of motion during the course of speech utterance. But varied evidence indicates that a segment is real in a linguistic and grammatical sense, in spite of this. Suprasegmental thus refers to any grammatical unit larger than a segment, from syllable to word to phrase to sentence. Phonotactics, assimilation, vowel harmonization, liaison, elision, stress, intonation, and morphemes are all suprasegmental features. Although strictly not identical, the term prosodic is often employed in the same sense.

Semiconsonant / Semivowel

As implied by their names, such phones behave both as consonants and as vowels. The term semiconsonant is usually employed in onset, when a glide [j], [w], or [ɥ] begins a syllable, as in "year," "want" / "Jahr" / "ouest," or is the last element in an onset cluster: "nuit" / "più," "quando," etc. In other words, they are phonetically vowels of very small duration, but phonologically they function as consonants. These are sometimes referred to as on-glides. Semivowel is usually reserved for the nonsyllabic latter element (or off-glide) in a falling diphthong, as in "buy," "boy," "how" / "frei," "treu," "blau." English onset [ɹ], "rain," can be thought of as a semiconsonant, or vocalic-r. Sometimes the terms semiconsonant and semivowel are used loosely, or interchangeably. Some consider semiconsonant an obsolete term, semivowel being employed for both on-glides and off-glides.

Slash / /

The standard symbol for phonemic notation. IPA transcriptions between slashes are phonemic (or broad) transcriptions, and only symbols recognized as phonemes in the language are permitted.


Sonorant

Any consonant other than obstruents (i.e., plosives and fricatives). Sonorants include liquids, nasals, and approximants.


Sonority

An elusive concept, but generally accepted as the level of acoustic output, or loudness, of an articulation. Linguists employ the notion of sonority in technical discussions of the syllable, placing sonority peaks as syllable nuclei. In general, vowels have greater sonority than consonants, and open vowels greater sonority than close. A wide variation in sonority levels among consonants (especially the high-sonority [s]) muddies the discourse on syllabification. A Sonority Hierarchy has been suggested, ranking articulations from lowest to highest sonority as follows: oral stops / fricatives / nasals / liquids / glides / vowels. Some linguists associate sonority in speech primarily with the size of the air stream resonance chamber. The voice pedagogue is well familiar with this, in terms of developing vocal technique. Maximizing sonority in this sense involves matters of stage projection, about which there exists a vast literature, not only for the singer but also for the stage actor and orator.

Square bracket []

The standard symbol for phonetic notation. IPA transcriptions between square brackets are phonetic (or narrow) transcriptions, that typically indicate a level of allophonic detail not found in phonemic transcriptions. The amount of detail is variable, and contingent upon the context of the discussion. No IPA transcription is capable of indicating a comprehensive level of articulative detail. Such detail can only be rendered in the feature matrices employed in generative phonology.

Earlier literature on lyric diction always employed square brackets, without distinguishing between phonemic and phonetic transcription. This was frequently problematic. Phonemic symbols were expected to be interpreted phonetically, based upon the related prose descriptions. Compare, for instance,

German [ˈbru-dər] [TEXT NOT REPRODUCIBLE IN ASCII]

The left transcription implies that the [d] is followed by a syllabic schwa, whose syllable coda is a consonantal [r], with a (presumably short) roll. The right transcription captures the darker schwa, with a vocalic <r> that is subsumed into the vowel. It also indicates vowel

The left transcription similarly prescribes an apical, consonantal [r] in the unstressed first syllable, rather than the dark-schwa off-glide (or semivowel) of a diphthong. The right transcription also includes the recognized symbol for syllable break.

The left transcriptions (from earlier lyric diction sources) seem to be guilty of prescribing consonantal <r>s, to the exclusion of vocalic. In reality, they were simply employing the symbol [r] in a phonemic sense, leaving the precise allophone open to interpretation. In other words, phonemic symbols invaded phonetic transcriptions; or, put differently, square brackets were the sole convention, with the symbols therein to be loosely thought of as either phonemes or precise articulations, as appropriate. This conflation would not pass muster in an introductory college course in linguistics. Conversely, the right transcriptions could be criticized for over-specifying the articulation of <r> in cases where latitude prevails in sung text.


Syllabic

Occupying the full nucleus of a syllable. Usually applied to pure vowels, but can also apply to consonants. Only one vowel in a diphthong or triphthong may be syllabic in a single syllable. In Italian <aiuto>, [a] and [u] are syllabic nuclei of successive syllables, and [j] is a semiconsonant in onset (sometimes called the on-glide in a rising diphthong).


Syllable

A syllable is universally understood intuitively, in a loose sense. But a rigorous definition is remarkably elusive, and much has been written on the topic. The most accepted approach ("prominence theory") is to observe sonority peaks in the course of an utterance, and treat these as syllable nuclei. The main problem here is that a few consonants (especially [s]) have a much higher sonority rating than most others, yet are not thought of as separate syllables.
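
The difficulty with prominence theory can be made concrete with a minimal sketch in Python. The sonority ratings below are hypothetical values chosen only for illustration (published sonority scales differ in detail), ordinary letters stand in for phonetic segments, and the function name sonority_peaks is likewise an assumption rather than an established tool. The sketch simply marks every segment that is more sonorous than both of its neighbors.

```python
# Hypothetical sonority ratings, for illustration only.
# Letters stand in for phonetic segments; real scales differ in detail.
SONORITY = {
    "p": 1, "t": 1, "k": 1,   # voiceless plosives
    "b": 2, "d": 2, "g": 2,   # voiced plosives
    "f": 3, "s": 3,           # voiceless fricatives
    "m": 5, "n": 5,           # nasals
    "l": 6, "r": 6,           # liquids
    "i": 8, "u": 8,           # close vowels
    "e": 9, "o": 9,           # mid vowels
    "a": 10,                  # open vowel
}


def sonority_peaks(segments):
    """Return the indices of segments more sonorous than both neighbors.

    Silence before and after the utterance is treated as sonority 0.
    """
    values = [SONORITY[s] for s in segments]
    peaks = []
    for i, value in enumerate(values):
        left = values[i - 1] if i > 0 else 0
        right = values[i + 1] if i < len(values) - 1 else 0
        if value > left and value > right:
            peaks.append(i)
    return peaks


print(sonority_peaks(list("plant")))  # [2]
print(sonority_peaks(list("stops")))  # [0, 2, 4]
```

Under these assumed ratings, "plant" yields a single peak on its vowel, while "stops" yields three, since each [s] outranks the adjacent plosive; yet the word is felt to be one syllable.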


Syllabification

Issues of syllabification generally involve consonant clusters, and the manner of deciding which ones belong where. As with the syllable itself, intuition often borrows inconsistently from more than one method of viewing word structure. For instance, "fishing" is usually hyphenated "fish-ing," but there is no phonotactic constraint on making the syllable boundary [TEXT NOT REPRODUCIBLE IN ASCII]. The sense that [ʃ] belongs with [ˈfɪ] is morphologically based. One visually recognizes the word more readily with the hyphen after <sh>. In French, the phonological syllabification will always be [pɛ.ʃœ[??]r], even if the orthographic syllabification is "pêch-eurs." One must be on a continuous vigil in French lyric texts for hyphenation that does not comply with the CV pattern of the phonology. Syllabification in prose and poetry follows a different set of rules from that of phonetic syllabification. There are many words in English whose hyphenation is contentious, and ambisyllabic consonants (those belonging to both syllables) are common.

Voice onset time (VOT)

The elapsed time, if any, between the release of a consonant (most often a plosive) and the commencement of voicing for the following vowel. VOT can be negative, zero, or positive. If voicing is initiated at the very moment of stop release, the VOT is zero. If voicing occurs prior to the release, VOT is negative, as in voiced plosives. If it occurs sometime after the release, it is positive, as in [kʰ]. Positive VOT equates with aspiration. In some non-European languages, the length of VOT is phonemic, in that different words can result. English has positive VOTs for single unvoiced plosives, while Italian and French have zero VOT--a distinction crucial for good lyric diction. Also in English, voiced consonants are often unvoiced for the first part of their duration in speech--a tendency to be eliminated in singing. Such partial devoicing is absent from French and Italian, and one should be aware of this tendency in English speech if one is to eliminate it in those languages. It is a particularly difficult hurdle to surmount with Italian voiced double plosives ("addio," "babbino").
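
The three-way distinction can be sketched as a small Python function. The timings used are hypothetical figures in milliseconds, not measurements, and the name classify_vot and its tolerance parameter are assumptions introduced purely for illustration.

```python
def classify_vot(release_ms, voicing_onset_ms, tolerance_ms=0):
    """Classify voice onset time relative to the release of the stop.

    VOT is the voicing onset time minus the release time, so voicing that
    begins before the release gives a negative value.
    """
    vot = voicing_onset_ms - release_ms
    if vot > tolerance_ms:
        return vot, "positive"   # voicing lags the release: aspiration
    if vot < -tolerance_ms:
        return vot, "negative"   # voicing leads the release
    return vot, "zero"           # voicing coincides with the release


# Hypothetical timings in milliseconds, for illustration only.
print(classify_vot(100, 160))  # (60, 'positive')
print(classify_vot(100, 100))  # (0, 'zero')
print(classify_vot(100, 20))   # (-80, 'negative')
```

The tolerance parameter is there only because measured timings rarely coincide exactly; with the default of 0, only an exact coincidence counts as zero VOT.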

Voiced / Voiceless (Unvoiced)

The accepted term for an articulation that involves vibration of the vocal folds is voiced. Voiceless is thus the opposite--vocal folds at rest. Unvoiced is synonymous and interchangeable with voiceless. Contrast devoicing / devoiced.


(1.) This has created the impression--one that I am uneasy with--that I am a "theoretical" writer on lyric diction. I entered the field as a voice and opera coach with language background, not as a linguist with musical background. My aim has always been to include only those concepts and nomenclature that have direct relevance to lyric text and music. While I cannot speak for the many valuable guest contributors to the column, I imagine that they would feel similarly. The "true" theoretical literature on linguistics is vast, extremely technical, demanding, and largely irrelevant to the immediate concerns of practical musicians.

(2.) The Journal of Singing welcomes submissions of articles from guest contributors who have the particular expertise to provide such guidance. Languages that remain especially in need of essential lyric diction materials include Catalan, Danish, Dutch, Finnish, Greek, Polish, Portuguese, Rumanian, Serbo-Croat, Slovak--and of course any non-Western languages with a vocal music tradition, as well as dialects of English, French, German, Spanish, and many others with a significant musical repertory. Extensive musical repertories also exist with texts in Basque, Hebrew, Yiddish, Romany, Sanskrit, classical Latin, and Koine Greek.

(3.) Some terms are of such importance that separate articles have been devoted to them by the author, as an ongoing series under the rubric "Linguistic Lingo and Lyric Diction." These can be found as follows:

I--"Phoneme," Journal of Singing 67, no. 2 (November/December 2010): 187-193.

II--"Syllabification," Journal of Singing 67, no. 4 (March/April 2011): 427-435.

III--"Pronunciation Contrasts," Journal of Singing 68, no. 2 (November/December 2011): 179-190.

IV--"Lexical Juncture," Journal of Singing 69, no. 2 (November/December 2012): 179-190.

V--"Phrasal Juncture," Journal of Singing 69, no. 5 (May/June 2013): 577-588.

VI--"Assimilation," Journal of Singing 71, no. 5 (May/June 2015): 603-612.

VII--"Phonotactics," Journal of Singing 72, no. 3 (January/February 2016): 331-343.

(4.) Notating syllabic consonants in English as [dˡ] and [tⁿ] conveniently circumvents the matter of syllable count. A word such as "button" is normally thought of as a two-syllable word, and arguably should be transcribed [ˈbʌ.tn̩], or more precisely [TEXT NOT REPRODUCIBLE IN ASCII], when the /t/ releases directly into the syllabic consonant without an intervening [ə]. A superscript consonant can only indicate type of release, not the existence of a new syllable. Similarly, "model" would be [TEXT NOT REPRODUCIBLE IN ASCII].

(5.) This is Plato's doctrine of "forms," with which the reader may be familiar. A phoneme is a "form" in this sense. It is similar to the model of meaning known as the "semantic triangle," introduced in the 1920s by Ogden and Richards, that relates symbol directly to concept, and concept directly to thing. The relationship between symbol and thing derives from those direct relationships.

(6.) Of course, this does not occur in style soutenu, but cannot be ignored by singers, as it is expected in spoken delivery in opéra comique, opéra bouffe, opérette, and other lighter genres. The rules for dropping "e-muet" are involved, and unfortunately do not often form a component of French lyric diction discourse. For those wishing access to a thorough treatment of this fundamental feature of speech, the following are recommended: Pierre Fouché, Traité de prononciation française (Paris: C. Klincksieck, 1959), 91-139; Glanville Price, An Introduction to French Pronunciation (Oxford: Basil Blackwell, 1991), 76-87; Bernard Tranel, The Sounds of French: An Introduction (Cambridge: Cambridge University Press, 1987), 86-107.

(7.) The accepted spelling of rhotacized is with an <a>, not an <i> as is sometimes encountered.

(8.) Although the rhotic [ɚ] is recognized by the IPA, the symbol [ɝ]--often seen (as employed here) to distinguish stressed SAE rhotic schwas from unstressed ones--has been superseded by [ɚ] in both stressed and unstressed position, since stress is not a segmental phonetic feature.

Leslie De'Ath, Associate Editor
