Parallels and incongruities between musical and verbal behaviors.


Musical performance and interpretation constitute minimally explored areas of research in behavior analysis, and a paucity of adequate accounts of these behaviors may deprive the field of a potential frontier for applications. Additionally, analysis of complex musical behavior may spur conceptual improvements in the behavioral lexicon through the analysis of transposition and other examples of musical behavior for which functional operant classes have not yet been constructed. As in other areas of psychological investigation, where adequate functional accounts of phenomena are lacking, nonfunctional accounts quickly take their place.

Nonfunctional interpretations pervade descriptions among musicians concerning the education of students, as in the lament of Benedict and Schmidt (2015) that a "leap of faith into [music] teaching--once taken with little pause--is losing its affirming and qualitative aspects, allowing teaching to be flattened against checks and balances, oversimplified against measurements of effectiveness and efficiency." It might be asked what aims beyond effectively and efficiently teaching students music skills are relevant to a music education program. Indeed, without some quantifiable measures of progress by music students, teaching strategies defy evaluation or comparison, and blaming learners for the deficiencies of teaching strategies may be an implicit result of such logic. Or consider the possibly unwitting mentalistic position taken by Kenney (2009) in describing how, in the process of a child learning musical games, "The brain has had a feast, extracting patterns from the experience: searching for meaning from the words, solving the problems of game actions, sorting through the rhythm and melody to find patterns, solving socioemotional challenges required in choosing a partner, taking turns, and learning about self in space, to name just a few." Such metaphorical cerebral reductionism, long disavowed by even cognitive psychologists, constitutes an uncontroversial analysis among music educators as recently as 2009. The seemingly arrested progress of psychological accounts in music, though regrettable, is not easily attributed to any fault on the part of music educators or behavior analysts, as neither group has had much in the way of opportunities for mutual contributions.

A Sampling of the Suzuki Method

Of course, the eclectic theoretical foundations of music instruction have not kept music education from 'working' for centuries, just as mentalistic assumptions in literature have not prevented writers from learning to read and write. The Suzuki Method is a popular contemporary approach to musical instruction that implicitly reiterates principles of behavior modification. Considering its prominence in music education, the Suzuki Method works, even if it does not detail exactly how.

The Suzuki website explicitly lists its teaching principles and cites examples. These include:

* "Suzuki teachers believe that musical ability can be developed in all children.

* Students begin at young ages.

* Parents play an active role in the learning process.

* Children become comfortable with the instrument before learning to read music.

* Technique is taught in the context of pieces rather than through dry technical exercises.

* Pieces are refined through constant review.

* Students perform frequently, individually and in groups." (Suzuki Association of the Americas, 2016).

The Suzuki Method teaches musical performance analogously to the acquisition of regular verbal behavior, calling this strategy the "Mother-Tongue Method," which acknowledges, among other principles, that listening to music precedes playing, playing precedes reading, and audience control is critical to shaping performances. This is a direct analogue to how children initially learn separately, and later combinatorically, to listen, then to speak, then to read.

Many aspects of the Suzuki Method can be considered representative of music instruction methods in contemporary and traditional instruction, and these basic methods have been used in one form or another for many centuries and across the world. Indeed, musicians were composing and performing long before the initial formulations of psychology, logical positivism, or Western science. It also follows that accomplished teachers of music have commonly encountered, through the contingencies of instruction, many methods common to psychology and behavior analysis long before the inception of those fields, including operant conditioning, shaping, chaining, stimulus fading, and distributed practice.

Behavior Analysis of Music

In describing how and why musical learning occurs, many conventions of musical behavior can be described functionally by an existing account of language which borrows examples from music: B. F. Skinner's Verbal Behavior (1957). In Verbal Behavior, Skinner seeks to describe verbal behavior in the language of operant contingencies. He coins a novel taxonomy of operants which includes the echoic, mand, tact, intraverbal, textual, and autoclitic, among others. These terms differ from traditional linguistic accounts of verbal behavior in that they are defined by functional relations. For example, the mand is not a request, but a verbal behavior under the control of a motivating operation (like hunger) that specifies its reinforcer (or stimulus that increases the probability of that behavior). The verbal behavior "cake, please" simultaneously states its reinforcer (cake) and becomes more likely in instances of hunger. Such functional definitions have led to practical scientific technologies in language acquisition, especially in the field of autism treatment and research. It follows that similar functional definitions might enable new applications in the field of music as well, given a sufficient understanding of behavior-environment relations underlying music and its interpretation. Particularly, Skinner's analysis of the intraverbal, or verbal behavior under the control of other verbal stimuli, describes nearly all combinations of greater than two notes in music and renders the question of how best to teach note sequences an empirical concern. With all conventions of music described in observable terms, quantification of student progress becomes possible, and music education methods may then become subject to experimental evaluation and improvement.

Behavior analysis as a method of interpretation and indirect consultation concerning music has already occurred for some time, with strong examples in the literature of Greer and colleagues, who examined the effect of music listening as a contingency for vocal pitch acuity and attending behavior (Greer et al. 1971), the impact of music discrimination training on children's allocation of listening time to different music types (Greer, Dorow, & Hanzer 1973), and the effect of adult approval on students' music preferences (Greer, Dorow, Wachhaus, & White 1973) (see also Madsen et al. 1976). Also noteworthy is the interdisciplinary work of Michael Domjan (2016), who recently launched the Tertis/Pavlov Project, a collection of online videos that feature behavior analytic commentary on conventions of music, with performances on a Tertis viola serving as examples. These contributions can be expanded into a synthesis of behavior analytic and musical conventions to study more specific and complex elements of music that are too often the default provinces of mentalistic explanations in areas such as pitch perception, improvisation, and creative songwriting. One example of this kind of synthesis can be found in the work of Snyder (2004) who investigated the important musical behavior of absolute pitch from a behavior analytic perspective.

This paper seeks to clarify and compare the basic definitions of terms used in music and the analysis of verbal behavior in order to draw connections between the domains. Further, the analysis of verbal behavior will be extended to musical behavior, and the adequacy of describing musical fundamentals in terms of Skinner's analysis of verbal behavior will be examined.

Mimicry and the Echoic

Mimicry in music constitutes an initial step of musical education: before a learner can engage in complex musical behavior, they must learn to copy sounds accurately on their instrument, which may include the voice. This is initially a rote process facilitated by educators to teach fundamental motor and discriminative behaviors. It has direct functional significance to the acquisition of musical skills when considered in the context of verbal behavior.

Skinner refers to echoic behavior as "the simplest case in which verbal behavior is under the control of verbal stimuli" wherein "the response generates a sound pattern similar to that of the stimulus. For example, upon hearing the sound Beaver, the speaker says Beaver" (Skinner 1957, p. 55). Considered in the context of music, a teacher might sing 'A' at a frequency of approximately 440 Hz and expect learners to sing back 'A' sounds of the same pitch. Skinner continues, "An echoic repertoire is established in the child through 'educational' reinforcement because it is useful to parents, teachers, and others. It makes possible a short-circuiting of the process of progressive approximation, since it can be used to evoke new units of response upon which other types of reinforcement may then be made contingent" (Skinner 1957, p. 56). The process of rote practice in first learning to sing operates more than analogously to verbal learning; it often operates identically. The verbal community composed of music educators teaches musical mimicry to establish musical component products, or notes, which are the musical equivalent to phonemes and words that are also initially learned as echoic behavior. In both cases, the topography of the learner's responses is modified through reinforcement delivered by the verbal community, except that in vocal singing the topographies of pitch and tempo are also shaped.

Teaching someone to play an instrument also proceeds analogously to teaching verbal behavior, and is distinguishable primarily by topographical differences. Whereas the echoic typically involves the vocal musculature in verbal learning, learning to play an instrument simply involves different musculature. To return to the example of teaching a class to sing 'A' at 440 Hz with their vocal musculature, learning to play the violin would entail teaching violin students to play 'A' at 440 Hz with coordinated movements of the fingers, hands, and arms manipulating bows and strings. This could be done analogously to normal verbal learning by playing a given note and allowing a learner to 'find' it on their instruments until they match the original note which would constitute emulation in the form of an echoed product, or imitation if technical feedback is provided to reinforce the exact process of producing the target note (a distinction explored in Heyes & Galef 1996). Indeed there are instances of musicians who initially acquire musical behavior through this kind of fine-tuned shaping in which topographical correspondence is a means to conditioned social reinforcement (Greer & Speckman 2009), as in singing along with music, or 'self-taught' guitarists who learn by trying to find notes and chords they have heard, also known as 'playing by ear.' Shaping of this kind is common in small ensembles, but such shaping may not be feasible in classical music instruction based around rigid compositions and coordination of a large group of musicians with a variety of parts to play. Large ensemble music instruction more often involves a different operant with special application to reading and writing music: the textual.

Music Reading and the Textual

Skinner describes the textual as an operant in which "a vocal response is under the control of a nonauditory verbal stimulus" (Skinner 1957, p. 66). In traditional music pedagogy, the textual enables much more specific control over musical responses (i.e., singing and playing) than echoic control alone. The echoic requires a greater degree of variability in the course of response shaping than is desirable in an organized music classroom, so the textual is frequently utilized in its place, just as it is used in a conventional classroom to bring verbal behavior under less variable control (i.e., when learning new words through reading alone). As Skinner describes the process, "Early echoic behavior in young children is often very wide of the mark; the parent must reinforce very imperfect matches to keep the behavior in strength at all ... [;] the response is not yet a function of any variable available to the parent" (Skinner 1957, pp. 59-60). Reducing this variability requires an expansion of the minimal repertoire through the implementation of nonauditory verbal stimuli demonstrating a point-to-point correspondence with the target response, which in music consists of notes or chords played within specified dimensions of intensity, duration, and tonal quality under the control of written verbal stimulation such as standard notation and instrument-specific forms of tablature.

However, 'reading' music does not occur identically to the reading of words. Silently rehearsing notes of a specified pitch in the absence of a reference note is much more difficult than repeating a syllable to oneself, which merely involves discrimination of the position of vocal musculature without the additional complex discrimination between pitches. Describing this behaviorally would translate to a matter of the accessibility of discriminative control. It is far easier to teach an association between physical locations on a musical instrument and given sounds than it is to teach discrimination between positions of the vocal musculature and given sounds. As a result, playing notes on an instrument given written standard notation as a discriminative prompt is considered commonplace in music education while singing those same notes under the same discriminative control is considered an advanced skill. Teaching a student to play open 'A' on a violin or to play 'A' on the D or G string by shifting to the 4th and 8th position, respectively, is made more feasible by visual and tactile prompts such as fingerboard tape which can be faded later in instruction. No such visual and tactile prompt fading is usually available in the instruction of accurate singing.

An additional problem in reading words versus music is that syllables do not occur along a continuum as do wavelengths of sound. The syllable 'tah' could not reasonably be described as higher or lower than 'zeh,' and any other syllabic combination would be equally arbitrary in terms of topographical relation. On the other hand, musical notes by merit of their differing pitches topographically embody relational concepts of higher and lower, and combined intervals can evoke emotional reactions through dissonance and consonance which emerge from the complexity or simplicity of resulting sounds. From varying combinations of intervals, relational concepts of tension and resolution emerge as well as emotional relations of happier and sadder. Similar relations may apply to common verbal behavior in the instance of 'beautiful sounding languages,' but perhaps not with the same degree of commonality across listeners.

Similarities between reading standard notation and reading written conventional language cease at the point of silent reading, which is a less effective kind of identification in music than in typical reading. Skinner states, "Many performers or singers never learn to read silently and may find it necessary in spotting a musical text to play a few bars on an instrument or at least to whistle or sing it aloud. Comparable silent activities provide inadequate stimulation for an identifying response" (Skinner 1957, pp. 66-67). He does not discuss the reason why silently reading music may entail more difficulty than simply reading words, but one possible explanation may be the relative inaccessibility of discriminative control in covertly singing to oneself, as opposed to the more overt discrimination between the syllables "go" and "teh" which could not as commonly be mistaken for one another given the presence of distractor syllables. By contrast, the presentation of other musical notes can interfere with discrimination of musical pitch, as demonstrated by Hedger, Heald, and Nusbaum (2013). These authors found that the absolute pitch identification accuracy of so-called possessors of absolute pitch could be incrementally shifted downward through the imperceptible detuning of a piece of music being evaluated by key and pitch of notes. The alleged absolute pitch possessors were convinced that the slowly declining pitch of the musical piece under consideration had not changed or had changed in ways other than pitch. No comparable degradation of discrimination accuracy among conventional verbal units is likely possible, such as by fading 'go' gradually to 'teh,' for example, as these units are not discriminated relative to one another as musical notes usually are.

Transcription, Dictation, and Musical Transposition

The operant that describes written visual depiction of music is transcription, and musical transcription operates analogously to conventional verbal transcription with one clear exception being the topographies of heard verbal stimuli that are transcribed. Drawing musical notes on a scale with a formal correspondence to other notes is analogous to conventional transcription, in the sense that it is a minimal repertoire undergirding the development of more refined musical transcription that exhibits only a point-to-point correspondence between heard note and written note, also known as taking dictation. However, an idiosyncrasy distinguishes musical writing. Notes written on a staff may all be visually identical, yet each might constitute a separate note by virtue of its position relative to the staff, notes, and the key signature specified at the beginning of the piece. The placement of conventional textual stimuli controls no comparable verbal response. Writing 't' higher or lower than another 't' does not control any differing response, for example.

However, the minimal repertoire in the case of musical transcription is similar in its operation to normal transcription. To start, transcribers of both writing and music must be able to describe what is written, or they are merely following verbal rules prescribing similarity of visual forms as in calligraphy. Skinner states, "Copying a text in a familiar alphabet differs from drawing in the size of the 'echoic' unit. The skilled copyist possesses a small number of standard responses (the ways in which he produces the letters of the alphabet) which are under the control of a series of stimuli (the letters in the text)" (Skinner 1957, p. 70). Composers and transcribers of musical scores exhibit similar differences in the size of textual units. Rather than tediously dictating each and every individual note heard, transcribers hear, tact, and transcribe arpeggios, chords, and recognizable note patterns, sometimes of several notes at a time, then write those note sequences in standard notation with respect to the notes' durations, frequencies, and tonal qualities. In this way, proficient writers of standard notation often transcribe and autoclitically tact, or emit verbal responses under the control of musical stimuli. For example, an A minor arpeggio may be transcribed as a distinct intraverbal sequence analogous to a word rather than individually labeling the notes 'A,' 'C,' 'E,' just as a fluent writer might simply write 'cat' as its own minimal unit rather than individually transcribing the letters 'C,' 'A,' and 'T.' Indeed, the scores of skilled composers often feature notes written in a kind of smooth scrawl similar to the characteristic cursive styles of fluent writers. These examples demonstrate how the minimal repertoire of note transcription can expand to accommodate arpeggios, chords, and other musical 'units' composed of several individual notes.

In composing music, a potentially novel operant may be exemplified by transposition. Due to the unique minimal repertoire of musical text in which discriminative control is exerted by the relative position of notes on a staff, it is possible to change the pitches of all notes to an equal degree, resulting in a similar perception of melody on the part of the listener despite a potentially large shift in pitch for all notes. In terms of verbal behavior, such a modification exemplifies a comprehensive, uniform adjustment of point-to-point correspondences for all notes on a staff. Modifying a piece of music in this way is crucial for transcribers and composers of music, who often record music in writing, or devise arrangements of notes for multiple instruments, some of which play music written in distinct keys. Typically, following transposition, composers must change the key signature or add sharps and flats as extra-stimulus prompts to facilitate playing in the new key: a notation which is roughly similar in concept to the dot and dash extra-stimulus prompts which delineate long from short vowel sounds.
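The uniform adjustment described above can be sketched in a few lines of Python. This is only an illustration: the sharp-only note spellings, the semitone numbering (C4 = 60), and the helper names are assumptions made for the example and are not part of the original analysis.

```python
# Illustrative sketch only. Sharp-only spellings and the numbering C4 = 60
# are assumptions made for this example, not conventions from the text.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def to_number(name):
    """Convert a name such as 'A4' or 'C#5' to a semitone number (C4 = 60)."""
    pitch, octave = name[:-1], int(name[-1])
    return NOTE_NAMES.index(pitch) + 12 * (octave + 1)

def to_name(number):
    """Convert a semitone number back to a note name with its octave."""
    return f"{NOTE_NAMES[number % 12]}{number // 12 - 1}"

def transpose(melody, semitones):
    """Shift every note by the same number of semitones."""
    return [to_name(to_number(note) + semitones) for note in melody]

def intervals(melody):
    """Semitone distances between successive notes (the relational pattern)."""
    numbers = [to_number(note) for note in melody]
    return [later - earlier for earlier, later in zip(numbers, numbers[1:])]

melody = ["A4", "C5", "E5", "G5"]
shifted = transpose(melody, 3)

print(shifted)
print(intervals(melody) == intervals(shifted))
```

Every note name changes, yet the list of intervals between successive notes is identical before and after the shift, which is one way of stating why a listener may hear the 'same' melody at a different pitch level.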

Writing heard notes with a point-to-point correspondence across dimensions demonstrates taking dictation. Whereas the correspondence in simple transcription consists of formal similarity between two visual stimuli, the correspondence in dictation consists of a point-to-point correspondence across dimensional systems: auditory stimuli evoke written responses which vary as a function of those auditory stimuli. For example, if the notes A, C, E, and G are heard one after the other, the musical dictation of these notes would yield A, C, E, and G in standard notation as an arpeggio, and further, the rhythm of these notes would also control which note values are assigned.

However, the minimal unit in this case still depends on the correspondence between individual heard and written notes. A new verbal operant emerges when the listener describes the heard notes 'A, C, E, G' as an 'A min 7 arpeggio in 1/8 notes with a 4/4 time signature acting as the resolution to a i, ii, V, i chord progression.' The description of these more complex verbal relations comprises the basis of music theory, which might be described analogously as the grammar of musical composition. To extend the analogy further, a verbal analysis of musical behavior may elucidate the functions between musical behaviors and controlling stimuli just as behavior analysis has described the functions between conventional verbal responses and controlling stimuli. The expansion of the minimal unit shown in the music theory example above corresponds with a different type of operant, as the stimuli involved do not exhibit point-to-point correspondence. Skinner refers to this operant as the intraverbal, and it underscores much of what characterizes music as a unique form of verbal behavior.

The Intraverbal in Musical Chords, Melodies, and Arpeggios

Skinner defines the intraverbal as a verbal operant maintained by generalized conditioned reinforcement in which "verbal responses show no point-to-point correspondence with the verbal stimuli which evoke them" (Skinner 1957, p. 71). In the conventional case, an example of intraverbal control might be the stimulating phrase "3 + 3" reliably evoking the response "6," or "A, B, C, D" reliably evoking "E, F, G ..." In the example of music, standard notation shares much in common with both the alphabet and numbers, albeit with some idiosyncratic features lacking in both numerical and lingual intraverbals. One distinction concerns the cyclical reiteration of the 'same' notes across octaves, such that A3 and A5 on a piano differ in wavelength, and are considered different notes, but are often treated as compositionally interchangeable in music theory due to being octaves; a distinction which finds no parallel in language, where no two different letters can be treated interchangeably except in some specific inaudible dialectal variations as in 'apologize' versus 'apologise.' An argument could be made that this kind of cyclical progression occurs in mathematics as well, such as when '10, 20, 30 ...' or other multiples can assume parallel functions to the intraverbals '1, 2, 3 ...,' yet this parallel ends at the point of finite comparisons, as number lines are not bound to any theoretical cycle and terminus as are musical notes. The musical intraverbal is also distinguished by its capacity for simultaneous intraverbals. Many notes played at the same time are interpretable as a unit, which is not the case when multiple syllables are spoken at once.

In similar form to the alphabet, there are only a finite number of perceptible notes in music though they cyclically repeat through octaves in a range between the lowest and highest frequencies perceptible to human hearing. The 12 steps of the equal tempered chromatic scale are used in almost all music. Variations on this standard include the minute variations of 'just' temperament which seek to more closely approximate pure intervals through the modulation of certain notes of conventional scales, and myriad other discordant schemes of equal temperament, which have their origin more in music theory than conventional utility. Such scale systems are the exceptions which prove the rule of Standard Notation. Comprehensive systems of intervals are analogous to the International Phonetic Alphabet, which replicates all sounds in English while adding a multitude of sounds that are superfluous in everyday writing. In similar form, comprehensiveness of a musical lexicon in accounting for all possible intonations does not necessarily constitute an aesthetic virtue when heard and judged by a conventional audience, though it may provide descriptive precision in the study of intervals.
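The arithmetic behind the 12-step equal-tempered scale and its octave cycle can be made concrete with a short sketch. The reference pitch of 440 Hz follows the examples used earlier in this paper. The function names, and the choice of the 3:2 'pure' fifth as the comparison interval, are assumptions made for illustration.

```python
import math

A4_HZ = 440.0  # reference pitch, as in the 'A' at 440 Hz examples

def equal_tempered_hz(semitones_from_a4):
    """Frequency of the note a given number of equal-tempered steps from A4."""
    return A4_HZ * 2 ** (semitones_from_a4 / 12)

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

# Octaves: twelve steps down or up halves or doubles the frequency.
a3 = equal_tempered_hz(-12)
a5 = equal_tempered_hz(12)

# A fifth in equal temperament (7 steps) versus the pure 3:2 ratio
# that 'just' systems aim to approximate.
tempered_fifth = equal_tempered_hz(7) / A4_HZ
just_fifth = 3 / 2
gap_cents = cents(just_fifth / tempered_fifth)

print(a3, a5)
print(round(gap_cents, 3))
```

Under these assumptions, the tempered fifth falls about two cents short of the pure ratio. This is the kind of minute variation that alternative temperaments trade off against the convenience of twelve equal steps.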

Relative and Absolute Pitch

Some particularly problematic constructs in music include the alleged 'possessions' of mimicry, relative pitch, and absolute, or perfect, pitch, which are commonly treated as the capacities of an exceptional musician rather than discrete operants that are reinforced to a standard of performance. Mimicry training is essentially identical to echoic training, and consists of singing a note that is identical to a heard note. In conventional verbal behavior, this might correspond with saying 'M' upon hearing 'M.' Relative pitch is the production of a note given another heard note, which might correspond conventionally with instantly saying the letter that is five letters after 'H' ('M'). Finally, absolute pitch is "the ability to identify a tone's pitch or to produce a tone at a particular pitch without the use of an external reference pitch" (Takeuchi & Hulse 1993), which might correspond conventionally with instantly identifying the 13th letter of the alphabet ('M').
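The three alphabet analogies in the preceding paragraph can be written out as a small sketch. The function names are hypothetical labels for the three tasks, and the sketch is only a way of checking that each task, given its own controlling stimulus, arrives at the same response.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A' through 'Z'

def echo(heard):
    """Mimicry: the response matches the heard stimulus."""
    return heard

def relative(reference, steps):
    """Relative pitch analogue: respond a fixed distance from a heard reference."""
    return ALPHABET[ALPHABET.index(reference) + steps]

def absolute(position):
    """Absolute pitch analogue: respond from position alone, with no reference."""
    return ALPHABET[position - 1]

print(echo("M"), relative("H", 5), absolute(13))
```

All three calls return the same letter and differ only in the stimulus that controls the response: the target itself, a reference plus a distance, or no external reference at all.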

The covert behaviors involved in these tasks may encourage mentalistic interpretations of their acquisition. Indeed, the view of absolute, or perfect, pitch as an innate possession of gifted savants is traditionally held, as expressed by Ward (1963), who states, "The consensus of the experiments on learning is that 'genuine' AP cannot be taught to adults. Although some improvements with training will occur, no one has yet brought an initially unskilled subject to the level of proficiency ... of nearly perfect semitone discrimination." It remains to be seen what criterion distinguishes 'genuine' from 'false' absolute pitch, though the more parsimonious interpretation is that if improvement is possible in the behavior, then the skill must necessarily be shapeable through reinforcement, even if a reliable method of acquisition has not yet been developed. Further, an experimental demonstration of a method's inadequacy cannot support the interpretation that the skill is unteachable, though it is conceivable that, as with second language acquisition, there may be developmental sensitive periods in which pitch training will be most beneficial to the learner or even incapacities that may hinder acquisition.

With these limitations considered, absolute pitch could be conceived of as a low probability behavior awaiting an effective method of acquisition. Such a method might include discrete trials pairing sounds with specific positions of the vocal musculature until the two are paired with results maintained over increasing latencies. This might be accomplished by presenting echoic prompts with visual feedback of notes sung faded to social feedback on accuracy with increasing delays added between the echoic prompt and the sung response. If the delay between echoic prompt and correct sung response increases to latencies typical of absolute pitch possessors in unskilled participants, it could be said that absolute pitch was acquired. This procedure is similar to that already undertaken by Snyder (2004) in an experiment on modification of absolute pitch behavior with the exception that visual prompts were not used.

The covert stimulation and complex discriminations involved in teaching relative and absolute pitch may be usefully described using the intraverbal operant, though Skinner himself refers to them in the context of the textual class (Skinner 1957, p. 68). As both relative pitch and absolute pitch are musical behaviors under the control of stimuli produced by other musical behaviors, those behaviors in the absence of written stimuli exemplify intraverbal responses to the extent that the sung pitch is produced relative to a reference pitch. In the case of relative pitch, the reference pitch is audible while it is covert in the case of absolute pitch. Interpreting these skills through such a lens suggests a teaching protocol that might explicitly train component behaviors of relative and absolute pitch to fluency in order to facilitate those abilities. In this way, a science of musical behavior might teach important musical skills traditionally considered inaccessible under mentalistic assumptions.

Intraverbals and Music Organization

In combining the intervals of music, direct parallels with intraverbal behavior in general become evident. Skinner states, "Many important characteristics of chained verbal responses, or of intraverbals in general, are clarified by a comparison with musical behavior. In playing from memory, the haplological anticipatory jump to a concluding phrase, the reverse haplology of being unable to find the concluding phrase because an earlier linkage keeps recurring, and the 'running start' frequently needed to begin playing in medias res are all obvious parallels. Music also provides evidence of the importance of self-stimulation in 'intraverbal' chains. The singer who cannot produce notes at the proper pitch may 'loose the melody' [sic] in either sight-reading or singing by ear or from notes" (Skinner 1957, p. 73). The examples of musical phrases as introductory and concluding phrases parallel the tendency of learners to omit portions of memorized scripts, as in the alphabet, where young learners commonly exhibit errors in producing certain letters which are correctable by having them perform a running start. These examples derived from compositional structure hardly demonstrate the simplest intraverbals possible in music, as any melodic combination of notes is necessarily intraverbal, even in cases as simple as two-note implied chords. At the other extreme, even quite complex intraverbal behavior occurs commonly in music, as in the commonality of contiguous usage as a means of discerning predictable patterns in musical 'improvisation.' Skinner refers to this phenomenon in stating, "aside from intraverbal sequences specifically acquired, a verbal stimulus will be an occasion for the reinforcement of a verbal response of different form when, for any reason, the two forms frequently occur together. A common reason is that the nonverbal circumstances under which they are emitted occur together" (Skinner 1957, p. 75). 
An example of contiguous usage in improvisation might include common 'tricks' of the solo instrumentalist, such as predictable endings of phrases with the root note or third, or common sequences of notes colloquially referred to as 'turnarounds,' which can be used to signal the end of an improvisational phrase and have their parallel in the construction of paragraphs with identifiable beginnings and endings.

What makes one soloist 'better' or 'worse' than another is the degree to which they quickly chain these phrases and use intraverbal sequences to signal the beginnings and endings of phrases. A player's acumen at manipulating these conventions in real time indicates they are performing as both player and listener, much as the roles of speaker and listener are exemplified in conventional verbal behavior. Self-referential composition that draws an audience's attention is referred to as 'framing' among artists, and is sometimes the only discernable difference between a commonplace item and a piece in a gallery. In the context of music, proper phrasing, or discriminable beginnings and endings, can make the difference between one perceiving an 'inspired improvisational piece' or a discordant collection of notes. Indeed, it was manipulating these conventions that led to Ornette Coleman's Free Jazz or John Coltrane's Ascension where musical framing is either unconventionally presented or omitted, though the aesthetic value of such manipulations has remained controversial, just as it has in free form poetry.

The Autoclitic and Musical Phrasing, Framing, and Composition

The degree to which a musician exhibits the aforementioned phrasing owes to yet another operant called the autoclitic. Skinner describes the descriptive autoclitic as "a response, that, when associated with other verbal behavior is effective upon the same listener at the same time" (Skinner 1957, p. 315). It constitutes an instance in which, "The speaker may acquire verbal behavior descriptive of his own behavior. Although the community can establish such a repertoire only by basing its reinforcing contingencies upon observable behavior, the speaker eventually exhibits it under the control of private events. The behavior so described may be verbal: the speaker may talk about himself talking. He may describe the responses he has made, is making, or will make" (Skinner 1957, p. 313). This operant may be described in a musical context as tacting occasioned by musical discriminative stimuli. While conventional autoclitics can be said to alter the effects of other operants such as tacts or mands, music is notable for lacking clear correlates to tacts and mands, which may account for the ubiquity of lyrical prosody as a musical accompaniment enabling tact and mand functions to be exhibited in the context of song. For example, Band Aid could not mand listeners to "Feed the World" with melody alone in the absence of supplemental verbal behavior in the form of lyrics. Nor does Rimsky-Korsakov's Flight of the Bumblebee necessarily tact a bumblebee's flight without the title. On the other hand, certain limited musical conventions do seem to exhibit tact and mand functions, such as national anthems, which appear to mand an audience to stand and attend, or musical leitmotifs, which characteristically tact certain television characters or shows.

With respect to the interpretation of musical notes alone, the musical autoclitic refers to verbal behavior under the control of note patterns and the aforementioned framing of musical note sequences. Examples might include referring to song 'introductions,' 'endings,' 'bridges,' 'choruses,' and other terms relevant to Skinner's discussion of the autoclitic in talking about talking, or in this case, talking about performing or composing music with respect to 'frames.'

Regarding musical criticism and appreciation, the autoclitic assumes great importance as a means of explaining why different performances of the same sequences of notes are assigned varying reviews across listening audiences. In order to appreciate a musical piece, one cannot merely perceive a piece of music, but must talk about the performance and tact organizing elements of what is heard. For example, a listener might appreciate a piece by autoclitically tacting the 'bridge' of a song as a particularly enjoyable 'transition' to a 'chorus.' An extensive degree of musical training allows the listener to interpret more minute elements of a heard piece such that one's critique could be said to demonstrate a more 'refined' or 'nuanced' appreciation of a heard piece. A professional music critic might notice how the 'transition' of "You Can Never Hold Back Spring" to a 'diatonic minor chord structure' underlying the final repetition of the 'theme melody' adds poignancy to the composition in the 'finale.' The composition of a minor chord structure can be said to have altered the effect of the theme melody on the critic, which is an autoclitic function with respondent elements of conditioned association.

In another example, the Beatles melody from "You Never Give Me Your Money" on Abbey Road is the refrain of a horn section in a later song "Carry That Weight" which continues into an additional verse of the earlier song followed by an additional refrain of the English nursery rhyme "All Good Children Go To Heaven." The repetition of these melodies could be seen as autoclitics influencing the tacts of listeners concerning that song and the album as a whole. The song "Carry That Weight" listened to in isolation from the album Abbey Road would likely result in a markedly different effect on listeners' impressions than the song in context with all other songs. More blatantly, the Leonard Cohen song "Hallelujah" contains the lyrics "It goes like this/The fourth, the fifth/The minor fall, the major lift/The baffled king composing hallelujah" which are sung simultaneously with, and describe in real time, a I, IV, V, vi, IV, V, iii, vi chord progression as a descriptive autoclitic (Cohen 1984). The minimal unit of the autoclitic is sometimes expanded, as in call and response traditions, in which musical refrains are played and echoed across instruments, for example in Indian and African musical traditions. These examples seem to act as extended intraverbal sequences which may autoclitically tact specific 'parts,' such as 'caller' and 'responder.'

More complex than these examples is the behavior of the virtuoso who can readily exhibit these nuanced variations in real time during improvisation specifically in order to evoke such audience interpretations, and thereby 'communicate through music,' a seemingly autoclitic function in the way that selection of notes or improvisational variations in composition alter the effect of the music on the audience. For example, idiosyncratic usages of tremolo, vibrato, grace notes, improvisational fills, modulations, and other artist-specific musical signatures may serve as autoclitic tacts for specific artists' 'playing styles' such that a listener may be able to accurately discriminate between Stevie Ray Vaughan and George Harrison by guitar playing alone.


Skinner's verbal operants as descriptive categories serve as analogues for many musical behaviors. The echoic, textual, intraverbal, and autoclitic correspond with common musical fundamentals including mimicry, transcription/dictation, relative/absolute pitch, and composition/phrasing. Even the traditional instruction of instrumental music through textual prompts has its parallel in academic settings such as science and history, where students may learn to say words relevant to those disciplines through reading alone in the absence of echoic prompts.

However, in spite of these possibilities, some elements of musical behavior defy ready categorization in Skinner's existing taxonomy of operants. These exceptions include transcription of musical pitch by the relative physical position of written notes, the cyclical intraverbal sequences of musical scales, the concept of octaves, and the distinct process of transposing identical melodies in varying keys such that pitches of all notes are altered despite a high likelihood of evoking very similar or identical appraisals from a listening audience. Also unique to music is the property of group performance, which does not extend to conventional verbal behavior. Many musicians playing different notes in the same key at the same time can form a musical performance which listeners can interpret, yet many speakers saying different words at the same time is a meaningless cacophony. These examples confound the existing verbal behavior approach to music, and suggest that further classification may be necessary to effectively describe some divergent musical behaviors. In these cases, analysis of musical behavior may actually benefit behavior analysis through an expanded taxonomy of operants.
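The transposition case can be made concrete with a small Python sketch. It assumes, purely for illustration, that a melody is a list of MIDI note numbers and that a listener's appraisal depends on the sequence of intervals between successive notes; the function names and the example melody are hypothetical and do not come from the literature cited here.

```python
# Illustrative sketch only. Assumption (not from the article): a melody is a
# list of MIDI note numbers, where 60 is middle C and each step is a semitone.


def intervals(melody):
    """Return the semitone distance between each pair of successive notes."""
    return [b - a for a, b in zip(melody, melody[1:])]


def transpose(melody, semitones):
    """Shift every note of a melody by the same number of semitones."""
    return [pitch + semitones for pitch in melody]


melody_in_c = [60, 62, 64, 60]           # C D E C
melody_in_g = transpose(melody_in_c, 7)  # G A B G

# Every absolute pitch differs, yet the interval pattern is identical.
no_shared_pitches = all(a != b for a, b in zip(melody_in_c, melody_in_g))
same_intervals = intervals(melody_in_c) == intervals(melody_in_g)
```

In this sketch no note of the transposed melody matches the original, while the interval pattern is unchanged, which is one way of stating why a taxonomy keyed to the form of individual responses struggles with transposition.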

Additionally, certain verbal operants exhibit unclear correlates in music absent conventional verbal behavior. For example, without a supplementary title, the notes of Vivaldi's Four Seasons may not serve to tact the four seasons. Without supplementary lyrics, Bob Dylan's protest songs might not implicitly mand, or be a conditioned establishing operation for, the activism of a pacifist counterculture. Further, without supplementary verbal behavior, autoclitic criticism of musical composition would be impossible, as one cannot hum about humming. Hence, music as it relates to the arrangement of notes and rhythms is possibly different from conventional verbal behavior in the verbal operants it can be said to exemplify. Despite these differences and limitations of the existing verbal operants in music, a functional account of the discipline remains both possible and potentially beneficial.

Much as Skinner's analysis of verbal behavior abates much mystery concerning complex verbal behavior, so too might the taxonomy of verbal operants offer an accessible functional account of otherwise elusive musical skills to those who would consider it. Further, investigation of certain musical conventions may serve to expand the scope of verbal operants into the realm of audiology and set a precedent in music education for a behaviorist teaching philosophy. Following from the conclusion that a functional analysis of musical behavior is possible, it is the authors' hope that more experiments in the acquisition of musical repertoires be undertaken using the methods of behavior analysis for the shared benefit of behavior science and the field of music education. Further, through the theoretical examination of musical behavior as a distinct subject matter, behavior analysis may benefit from an expanded taxonomy of all the operants shaped and maintained by the verbal community.

DOI 10.1007/s40732-017-0221-8

Published online: 10 March 2017

Linda J. Hayes

Benjamin S. Reynolds

Compliance with Ethical Standards

Funding This study has no external funding to disclose.

No Conflict of Interest Exists Benjamin Reynolds declares that he has no conflict of interest. Dr. Linda J. Parrott Hayes declares that she has no conflict of interest.

Human and Animal Rights No human or animal subjects were involved in this study.

Ethical Approval This article does not contain any studies with human participants or animals performed by any of the authors.


References

Benedict, C., & Schmidt, P. (2015). Acts of courage: Leaping into mindful music teaching. The Canadian Music Educator, 56(3), 16-20.

Cohen, L. (1984). Hallelujah. On Various Positions [CD]. Columbia Records.

Domjan, M. (2016). The Tertis/Pavlov Project. The University of Texas at Austin,

Greer, R. D., Dorow, L. G., Wachhaus, G., & White, E. R. (1973a). Adult approval and students' music selection behavior. Journal of Research in Music Education, 21(4), 345-354.

Greer, R. D., Dorow, L. G., & Hanzer, S. (1973b). Music discrimination training and the music selection behavior of nursery and primary level children. Bulletin of the Council for Research in Music Education, 35, 30-43.

Greer, R. D., Randall, A., & Timberlake, C. (1971). The discriminate use of music listening as a contingency for improvement in vocal pitch acuity and attending behavior. Bulletin of the Council for Research in Music Education, 26, 10-18.

Greer, R. D., & Speckman, J. (2009). The integration of speaker and listener responses: A theory of verbal development. The Psychological Record, 59, 449-488.

Hedger, S. C., Heald, S. L. M., & Nusbaum, H. C. (2013). Absolute pitch may not be so absolute. Psychological Science, 24(8), 1496-1502.

Heyes, C. M., & Galef, B. G., Jr. (Eds.). (1996). Social learning in animals: The roots of culture. San Diego: Academic Press.

Kenney, S. (2009). Brain-compatible music teaching. General Music Today, 25(1), 24-26.

Madsen, C. K., Greer, R. D., & Madsen, C. H., Jr. (Eds.). (1976). Research in music behavior: Modifying music behavior in the classroom. Columbia University: Teachers College Press.

Skinner, B. F. (1957). Verbal behavior. Cambridge: B. F. Skinner Foundation.

Snyder, J. (2004). Toward a behavioral account: A feedback protocol for the acquisition of absolute pitch. UNR Behavior Analysis Program Master's Thesis.

Suzuki Association of the Americas. (2016). Website. Subsection "About the Suzuki Method." Retrieved 12/5/16 from https://

Takeuchi, A. H., & Hulse, S. H. (1993). Absolute pitch. Psychological Bulletin, 113(2), 345-361.

Ward, W. D. (1963). Absolute pitch. Sound, 2, 14-21.

Benjamin S. Reynolds (1) * Linda J. Hayes (1)

(1) University of Nevada, Reno, NV, USA
COPYRIGHT 2017 Springer
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2017 Gale, Cengage Learning. All rights reserved.

Article Details
Author: Reynolds, Benjamin S.; Hayes, Linda J.
Publication: The Psychological Record
Article Type: Report
Date: Sep 1, 2017