Language and grammar: a behavioral analysis.


Speech-language pathologists' (SLPs') academic study of language is heavily influenced by linguistic and cognitive viewpoints. Most textbooks and writings familiar to SLPs explore the linguistic and structural view of language in detail and offer only a limited summary of the behavioral view, whose concepts and implications are not carried throughout the text. Most SLPs are well versed in the phonologic, morphologic, syntactic, and pragmatic structures of language but are not equally well versed in the functional units that are basic to Skinner's (1957) analysis. Nonetheless, SLPs' treatment methods are mostly behavioral (Hegde, 1998, 2008a). Inevitably, this has led to a conceptually inconsistent model of language and treatment of language disorders.

Chomsky's (1959) critical review of Skinner's (1957) book--Verbal Behavior--is better known than the book itself. Most students and clinicians seem to be unaware of the invalidity of Chomsky's criticism or of the competent responses given to his negative review (e.g., Anderson, 1991; MacCorquodale, 1969, 1970; McLeish & Martin, 1975; Palmer, 2006; Richelle, 1976). Rejoinders to his review have pointed out that Chomsky poorly understood Skinner's Verbal Behavior, behavioral methodology, and behaviorism. Chomsky's misunderstanding of Skinner's book and concepts was so severe that it "would prompt most examination graders to read no further" (Richelle, 1976, p. 209). Chomsky frequently attributed to Skinner the views of other psychologists, views that Skinner had unequivocally repudiated. In a questionable case of scholarship, Chomsky repeatedly misquoted Skinner (Adelman, 2007). More than four decades after he wrote the review, Chomsky was still a critic of Skinner, with the same distorted understanding of Skinner's work (Virues-Ortega, 2006).

A commonly held assumption among most linguists, and the SLPs who follow them, is that Skinner's Verbal Behavior has faded into history. The fact, however, is that research on verbal behavior and treatment of verbal behavior disorders based on Skinnerian analysis are flourishing. Among several others in the United States, the journals The Analysis of Verbal Behavior, The Behavior Analyst, Journal of Applied Behavior Analysis, and Behavior Modification, along with several international journals on behavior analysis, regularly publish articles on Skinnerian verbal behavior analysis and treatment. This journal, the Journal of Speech-Language Pathology and Applied Behavior Analysis, is devoted to bridging the gap between the two disciplines. As Schlinger (2008a) has ably demonstrated, Skinner's Verbal Behavior is alive and well. An interesting observation Schlinger makes is that although both Verbal Behavior and Chomsky's (1957) Syntactic Structures had their 50th anniversary in 2007, Skinner's book has been selling better than Chomsky's. The verbal behavior approach to treating children with autism is now recognized internationally as the most evidence-based approach. Treatment of almost all forms of communication disorders is essentially behavioral (Hegde, 1998, 2006, 2007; Hegde & Maul, 2006; Pena-Brooks & Hegde, 2007), whether some SLPs acknowledge it or not. In fact, if any tide has turned, it is the tide against Chomsky's generative linguistics. While Skinner's experimental and applied behavior analysis is thriving worldwide, Chomsky's generative grammar notion has disappeared from linguistics (Harris, 1993; Leigland, 2007). Chomsky's own multiple revisions and qualifications of his 1957 theory have moved away from a cognitive, generative, rule-based theory of language (Schoneberger, 2000).
Within just a few years after Chomsky's Syntactic Structures was published, there was the generative semantic "rebellion" that denied the supremacy of grammar in language. (Linguists often describe newer approaches as revolutions, wars, or rebellions.) Soon came the "pragmatic revolution," which asserted in the 1970s that language should be understood as actions performed in social contexts--mostly an armchair philosophical view that was still structural in its orientation. More than 30 years before the "pragmatic revolution," Skinner had advocated the social nature of verbal behavior with better conceptual and experimental bases than the speculative pragmatic approach has ever had (see Skinner, 1957, Preface, for a historical account of his analysis). SLPs have found that when they need to intervene (i.e., offer treatment), they need to turn toward Skinner's experimental and applied behavior analysis; linguistics of any era could offer little or no help.

Contrary to the typical portrayal of Skinner's analysis of language as "simplistic," it is sophisticated, complex, and comprehensive. His analysis of verbal behavior, as he preferred to call it, includes an innovative analysis of grammar, word order, and meaning (Hegde, 2008b) which is unfamiliar to most SLPs. There are other methodological behavioral approaches to language (Osgood, 1963; Mowrer, 1952; Staats, 1968) that are sometimes confused with Skinner's vastly different radical behavioral approach that offers a natural science view of language, with an ensuing applied technology that SLPs have readily accepted. At least three unique features of Skinner's analysis of verbal behavior are especially relevant to an applied science of speech-language pathology.

First, Skinner's analysis accepts the constraints of the methods of natural science. Dependent variables are analyzed in relation to their publicly observable, measurable, and experimentally manipulable independent variables. Skinner's analysis is functional in the sense that it seeks to identify variables that cause verbal behaviors. Explanations of events are kept at the level of observation and experimental analysis and therefore do not involve inferred mental, cognitive, or pseudobiological (innate) entities.

Second, Skinner's analysis treats language as a form of behavior, and not as a formal system that exists in the minds or brains of speakers, independent of their actions. Thirty years after the publication of his Verbal Behavior, Skinner (1987, p. 11) restated that his book "is not about language. A language is a verbal environment, which shapes and maintains verbal behavior." He went on to say that "Those who want to analyze language as the expression of ideas, the transmission of information, or the communication of meaning naturally employ different concepts" (1987, p. 11). He then urged scientists to judge which one--a scientific causal analysis or a mental structural analysis--works better. When a causal approach is preferred, analyses of the structural properties of mechanically generated sentences (e.g., they are eating apples, or colorless green ideas sleep furiously) are not productive because such sentences do not represent empirical data. Such productions will be of interest to scientists only when they are empirically recorded utterances of speakers, under given conditions of stimulation, meeting specific social consequences.

Third, Skinner's analysis does not include special explanatory laws. He wrote Verbal Behavior to show that "speech is within the domain of behaviors which can be accounted for by existing functional laws, based on the assumption that it is orderly, lawful, and determined, and that it has no unique emergent properties that require either a separate causal system, an augmented general system, or recourse to mental way-stations" (MacCorquodale, 1969, p. 832). Consistent with his analysis of behaviors in general, Skinner has analyzed verbal behaviors in terms of a contingency relationship between (1) current states of motivation, (2) currently controlling environmental conditions, (3) past history of reinforcement, and (4) the genetic constitution of the individual (Skinner, 1957). Operant analysis, therefore, is not restricted to "stimuli and responses" and does not ignore the genetic factors.

Verbal Behavior: Definition

Verbal behavior (VB) is a class of behavior that is "reinforced through the mediation of other persons" (Skinner, 1957, p. 2). Verbal behavior is social behavior because, unlike nonverbal behavior, it cannot be conditioned or maintained by nonsocial entities. Nonverbal behavior in this context does not refer to nonvocal verbal behavior (as in alternative forms of communication). It refers to behaviors that are, in traditional terms, noncommunicative (e.g., walking or watering a house plant). Unlike nonverbal behaviors, verbal behavior may be conditioned only by the actions of other people.

The essence of Skinner's definition is that only people are affected by VB in such a way that they become conditioned to reinforce it. In other words, both the VBs and their consequences (listener responses) are conditioned. Also, unlike nonverbal behaviors, VBs are devoid of direct and mechanical reinforcement contingencies (Skinner, 1957; MacCorquodale, 1969). As E. Vargas (1988) distinguished them, VBs are verbally governed (mediated), whereas nonverbal behaviors are (environmental) event-governed. Consider an example that contrasts a nonverbal response with a verbal response: A thirsty woman may walk up to the refrigerator and get a drink. The nonverbal response of walking will directly and mechanically get reinforced when she gets her drink--an environmental event. No other person need be present to reinforce it. But if her response is instead verbal (e.g., "May I have a glass of water?"), it needs social mediation to get reinforced. Someone (a mediator) has to reinforce it by complying with her request. The need for a mediator to select and strengthen VBs adds an element to the familiar three-term contingency involving stimuli, responses, and consequences that explains nonverbal behavior. VB, therefore, is explained on the basis of a four-term contingency that involves (1) stimuli, (2) verbal responses, (3) listener responses, and (4) the reinforcing effects of listener responses (J. Vargas, 2009). It should be noted, however, that in all other respects, VB is essentially like nonverbal behavior. For instance, verbal and nonverbal behaviors both have their respective discriminative stimuli, are similarly selected and strengthened by their consequences, and may be extinguished by withholding reinforcement (E. Vargas, 1988). Also to be noted is that the uniqueness of VB does not require special explanatory laws; Skinnerian laws of behavior are sufficient to account for it.

Verbal Behavior: Units of Analysis

An analysis of verbal behavior should first determine the units of analysis. Linguists analyze language with such structural units as phonemes, morphemes, words, and sentences that may be adequate for a formal analysis of language. Skinner asserted that linguistic structures tell us nothing about their causes--but the natural science account of any phenomenon is a causal analysis. Apparently, structuralists presume that independent variables can be sliced according to the structural properties of responses. That is, phonemes, words, sentences, and so forth necessarily have separate causal variables--a presumption without empirical support.

Skinner's analysis shows that the same cause may lead to the production of a word, a phrase, or a sentence depending on the current stimulus condition and past reinforcement history. For instance, one might just say, "yuck" or "I think it is disgusting"--variable structural units under similar stimulus conditions and with similar effects on listeners. Conversely, the same verbal response may be controlled by different independent variables in different situations. For instance, a boy might say "ball" because he saw a ball, or echoed someone else, or read the printed word ball--different causes for structurally the same response ("ball"). That structures (forms) and causes do not covary is unaccounted for in the linguistic analysis. A word is always a word, regardless of why it was produced. A sentence is different from a word, though it may have the same cause as a word on a given occasion. Skinner's analysis of verbal behaviors based on their independent variables avoids this problem inherent to structural analysis.

Technically, the response unit in the behavioral analysis is called a verbal operant which is ". . . a disposition (tendency, likelihood) to respond in a certain way to a certain state of affairs because of a past history of reinforcement" (Winokur, 1976, p. 21). A given verbal response is concrete, and is an exemplar of a class of responses. In contrast, a verbal operant is abstract because it means both a controlling relation and a class of verbal responses with similar causes and conditioning history. Skinner classified VBs on the basis of motivational variables, discriminative stimulus control, and other VBs (that cause additional VBs). The following sections of this paper summarize distinct verbal operants, beginning with mands.

Motivational Control: The Mand

A mand is a verbal operant whose cause is a motivational variable. States of deprivation or aversive stimulation cause mands to be emitted by a speaker. Skinner defined the mand as "a verbal operant in which the response is reinforced by a characteristic consequence and is therefore under the functional control of deprivation or aversive stimulation" (1957, pp. 35-36). Under a state of deprivation, positive reinforcers (consequences individuals work to obtain) will be effective. Under conditions of aversive stimulation, negative reinforcers (consequences that remove such stimulation) will be effective. In either case, a mand of any form, including speaking, writing, signing (e.g., American Sign Language), pointing, finger spelling, and sending Morse code, may be emitted (Michael, 1982).

Responses such as A glass of water, please or May I have a hamburger are controlled by states of deprivation and are reinforced positively. States of deprivation are motivational, and deprivation simply means that a person has not had access to something specified for some measured duration. Responses such as Quit that or Get out are controlled by their respective aversive stimuli and are reinforced negatively when the listener complies. In all cases, a mand specifies its own reinforcer; for instance, the mand Will you please be quiet specifies what will (negatively) reinforce that mand: cessation of chatter. When mands are produced, an appropriately conditioned listener will act in ways that are reinforcing to the speaker.

Produced mostly for the benefit of speakers, and propelled by states of motivation, particular forms of mands do not strictly covary with discriminative stimuli present in the environment. For instance, a speaker's mand, "May I have an apple pie?" is more likely in places where pies are available. Nonetheless, one might also say, "I want to eat a piece of pie" when none is in sight; it may function as a mand if another person who hears it proceeds to bake a pie. Occasionally, when deprivation is very strong, mands may be completely free from external stimulus control, as in the "isolated desert-dwelling hermit's cry, 'water'" (Winokur, 1976, p. 30). In general, requests, commands, prayers, advice, questions, warnings, permissions, offers, and the like are mands. Note that multiple linguistic categories are reduced to just one (mand). Whether an utterance is a mand or not cannot be determined by its structural properties. The utterance of the word "Fire!," for example, is a mand when addressed to a firing squad, a textual when read aloud from print, a tact when it is evoked by the sight of fire, and an echoic when a child in therapy imitates that modeled word. Similarly, the sentence I see fire may be a tact, a textual, or an echoic, each with its own cause.
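
The point that form alone does not determine the operant can be sketched as a small lookup keyed on the controlling variable rather than on the words. The following Python sketch is illustrative only: the `ControllingVariable` labels and the `classify_operant` function are hypothetical names introduced here, not terms from Skinner's text, and the mapping reduces the analysis to the four cases named above.

```python
from enum import Enum


class ControllingVariable(Enum):
    """Hypothetical labels for the variable that evokes a response."""
    MOTIVATION = "deprivation or aversive stimulation"
    NONVERBAL_STIMULUS = "an object or event in the environment"
    PRINTED_TEXT = "a printed or written stimulus"
    HEARD_SPEECH = "a modeled vocal stimulus"


# Simplified mapping (an assumption of this sketch) from cause to verbal operant.
OPERANT_BY_CAUSE = {
    ControllingVariable.MOTIVATION: "mand",
    ControllingVariable.NONVERBAL_STIMULUS: "tact",
    ControllingVariable.PRINTED_TEXT: "textual",
    ControllingVariable.HEARD_SPEECH: "echoic",
}


def classify_operant(utterance: str, cause: ControllingVariable) -> str:
    """Classify by cause; the form of the utterance is deliberately ignored."""
    return OPERANT_BY_CAUSE[cause]


if __name__ == "__main__":
    # The same form receives a different classification for each cause.
    for cause in ControllingVariable:
        print(f'"Fire!" evoked by {cause.value}: {classify_operant("Fire!", cause)}')
```

That the `utterance` argument plays no role in the result is the design choice that mirrors the text: a structural description of the response cannot settle which operant it is.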

Skinner (1957) also described a variety of generalized mands, which seem irrational but are nevertheless lawful. Such extended mands occur when people mand small babies, dolls, untrained animals, and machines (e.g., a driver's mand at a stop light, "Come on, green light!") that do not reinforce the speaker. They are maintained because of a past history of reinforcement for similar responses emitted under similar conditions.

Clinical Implications

SLPs should be especially interested in teaching mands to children and adults with language disorders. Traditionally, SLPs have shown greater interest in teaching tacts (see the next section)--the perennial naming of objects and colors--than in teaching mands. Mands, however, are an important class of verbal operants that clinicians should target in both early and later stages of language intervention with children as well as adults. Even individuals with aphasia, traumatic brain injury, or dementia would be better functional communicators if they could mand. In fact, what is promoted as functional communication in speech-language pathology is, for the most part, mands. For the purpose of clarification, it should be noted here that function in the speech-language pathology literature does not refer to causes, as it does in natural science and behavioral analysis. Instead, it vaguely refers to the "use of language." It is generally and correctly asserted that individuals with significant communication problems who learn to "express their basic needs," "ask for information," or "request clarification"--all mands--are better functional communicators than are those who name or describe objects. A woman whose husband is aphasic does not especially care if he can describe or name (tact) water; she will be content if he can mand it when thirsty.

Michael (1988) suggests that about half of what adults say in the course of a daily interaction with others may consist of mands. Some SLPs may assume that children who are taught the labels (tacts) for objects will mand the objects they want. Contrary to this assumption, clinical VB training, or what is beginning to be called the verbal behavior approach (Barbera, 2007; Miguel, 2009), has made it clear that children who learn tacts may not automatically mand; they need mand training as well (Hall & Sundberg, 1988; Michael, 1988). Children who cannot mand often resort to such undesirable behaviors as temper tantrums, whining, grabbing, and aggressive nonvocal acts because they cannot request what they want (Carr et al., 1994). Nonverbal or minimally verbal children are especially prone to using unacceptable problem behaviors in place of socially acceptable manding. Teaching mands first to such children may reduce many undesirable behaviors because the mands will give them access to the same reinforcers that their undesirable behaviors successfully sought (Carr et al., 1994; Reichle & Wacker, 1993). Other classes of VBs may then be more efficiently taught to children whose undesirable vocal or nonvocal manding behaviors have been replaced by desirable vocal mands.

In more recent research on teaching VB to children and adults who have not learned a verbal repertoire, the concepts of deprivation and aversive stimulation have been refined further to account for some varied conditions under which mands tend to be produced. Generally, and as noted, mands may be produced under states of deprivation or aversive stimulation. For instance, a person who has not had access to water for several hours is likely to request it when the conditions that support a mand exist. However, a person may also mand for a drink when he or she has just ingested salt--a condition that bar owners tend to exploit by offering salty pretzels to their patrons who, after eating them, order (mand) more drinks (J. Vargas, 2009). Eating pretzels creates a state of fluid deprivation, but is not, in itself, a state of deprivation. SLPs offering language treatment to infants and toddlers often schedule treatment sessions just before their young clients have had breakfast or lunch so that the food used as a reinforcer during the sessions is more likely to be effective than it would be if the children arrived at the clinic after a full meal. Similarly, a teacher who plans to increase question-asking behaviors (mands) in her students may increase the difficulty of an academic task. This aversive task difficulty may increase the probability that the students ask for help; the teacher may then use prompts, models, and other procedures to teach mands (requests for the teacher's help). Such necessary steps taken to increase the motivation for mands (or for nonverbal behaviors) are known as establishing operations (EOs), a term that Keller and Schoenfeld used in 1950 and that Michael (1988, 2000) later expanded and refined. EOs also are known as motivating operations (MOs) (J. Vargas, 2009). Under natural settings involving speakers with a good mand repertoire (such as the adult in the bar), EOs increase the probability that a mand will be produced.
Under clinical conditions involving speakers with limited mand repertoire, establishing operations make it somewhat easier to teach mands. In essence, EOs have two kinds of effects that clinicians can exploit. First, they alter the reinforcing effects of some object, event, or activity; this is the reinforcer-establishing effect. Second, EOs change the current frequency of behaviors that were previously reinforced by that object, event, or activity; this is the evocative effect. The two effects of EOs are independent and concurrent (Michael, 2000). In light of these refinements of the motivational variable, Michael defines the mand as "a type of verbal operant in which a particular response form is reinforced by a characteristic consequence and is therefore under the functional control of the establishing operation relevant to that consequence" (1988, p. 7; emphasis added).

EOs may be unconditioned (UEO) or conditioned (CEO). Biological propensities underlie UEOs whereas past learning underlies CEOs (Hall & Sundberg, 1988; Michael, 1988). Asking mothers to bring infants to early language intervention sessions just before breakfast, and then using breakfast food as a reinforcer for vocal responses, is an example of a UEO. The infant's sensitivity to food is biologically determined (hence unconditioned or unlearned), although specific food preferences are learned. On the other hand, an EO that increases the value of a toy as a reinforcer for a child under mand training is an example of a CEO. Placing a preferred toy on a high shelf may temporarily increase its value as a reinforcer and the probability that the child will mand it. A missing item necessary to complete a task might also create a CEO to teach mands, as Hall and Sundberg (1987) demonstrated. While teaching a student who is deaf to make soup, the authors omitted the needed hot water to create a CEO that helped teach the mand, "hot water." It may be noted that if the water manded is immediately consumed, any EO the clinician has manipulated is unconditioned; if the water manded is not consumed, but is used for some activity (such as washing hands or cooking a meal), then any EO in effect is conditioned.

The classic conditioning literature had recognized not only the effect of deprivation (currently, part of EOs), but also that of satiation. While deprivation increases the value of a reinforcer and the response rate associated with it, satiation decreases both. A person who has just eaten is unlikely to mand food. This effect of satiation, too, is crucial for the clinician who plans to teach mands for food and drink. As a child receives food following successive mands, the reinforcing value of food is likely to decline, and so is the response rate. The term abolishing operations has been used to refer to those aspects of EOs that reduce (a) the reinforcing value of some object or activity and (b) the response frequency associated with that object or activity (McGill, 1999; Michael, 2000).
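
The opposite directions of establishing and abolishing operations can be made concrete with a toy model. The Python sketch below is only a qualitative illustration: the `FoodReinforcer` class, the `deprive` and `satiate` functions, the multiplicative factors, and the arbitrary units are all assumptions introduced here and have no empirical basis. The sketch shows only that one operation moves both reinforcer value and mand rate in the same direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoodReinforcer:
    value: float      # reinforcing effectiveness of food (arbitrary units)
    mand_rate: float  # frequency of mands for food (arbitrary units)


def deprive(state: FoodReinforcer, factor: float = 1.5) -> FoodReinforcer:
    """Establishing operation: raises reinforcer value and mand rate together.

    The factor is a hypothetical number chosen only to show direction.
    """
    return FoodReinforcer(state.value * factor, state.mand_rate * factor)


def satiate(state: FoodReinforcer, factor: float = 0.5) -> FoodReinforcer:
    """Abolishing operation: lowers reinforcer value and mand rate together."""
    return FoodReinforcer(state.value * factor, state.mand_rate * factor)


if __name__ == "__main__":
    baseline = FoodReinforcer(value=1.0, mand_rate=1.0)
    print("before a meal:", deprive(baseline))
    print("after a meal: ", satiate(baseline))
```

In a session that uses food as the reinforcer, repeated calls to `satiate` would stand for successive reinforced mands, which is why the response rate is expected to fall as the session proceeds.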

There is an extensive behavioral literature on EOs (see McGill, 1999 and Smith & Iwata, 1997 for reviews) and on mand training (see Sautter & LeBlanc, 2006 for a review of treatment research). In most of the studies, EOs were manipulated to decrease undesirable behaviors (e.g., self-injurious behaviors). However, EOs are important in mand training because the clinician need not wait for opportunities to arise for the child to produce mands. Instead, the clinician can create conditions (EOs) that encourage manding more frequently and thereby make the mand training more efficient (e.g., Hall & Sundberg, 1988). It is also evident from VB treatment research that what SLPs call language initiation, an important skill targeted in language therapy with children, is, for the most part, manding. Teaching mands to children with impaired VB (language disorders) is an effective way of teaching verbal initiation (Taylor et al., 2005). Consistent with Skinner's suggestion (1957), VB treatment research also has shown that mand training facilitates the training of other verbal operants (Sautter & LeBlanc, 2006).

Discriminative Stimulus Control: The Tact

Discriminative stimuli are aspects of the environment that control certain verbal responses. A discriminative stimulus sets an occasion for a response that has characteristically received reinforcement in the past. Although some people mand a great deal, tacts, which are controlled by discriminative stimuli, make up a significant portion of most people's everyday speech. A tact is a verbal operant evoked by objects or events in the environment and reinforced by a verbal community in the presence of those objects and events. Discriminative stimuli that control tacts are "nothing less than the whole of the physical environment--the world of things and events which a speaker is said to 'talk about'" (Skinner, 1957, p. 81). Motivational variables, critical for mands, are unimportant for tacts, which may be described as "objective" or "disinterested." Tacts say less about speakers' internal states than they do about their physical world; mands do the opposite.

At the simplest level, naming could be a tact. At the next level of complexity, descriptive statements could be tacts. Normally, listeners reinforce tacts based on the relation or correspondence between the tact and its antecedent. For instance, a tact such as grass is reinforced if the controlling antecedent is indeed grass, but the tact grass is green is reinforced on the basis of a correspondence between the object grass and its conventional color. When the speaker is a child learning VB, however, tacts whose forms do not show strict correspondence with their adult forms, and hence lack the conventional correspondence with the antecedents as well, may still be reinforced. When the child says "da," for example, the mother may say, "Yes, that is a dog!" and thus reinforce the child's tact, even though that tact does not correspond either to its discriminative stimulus (in the adult sense) or to its adult topographic feature. Gradually, the mother demands greater correspondence, and an appropriate repertoire of tacts is established.

Tacts, though generally controlled by discriminative aspects of the physical environment, do not have a point-to-point correspondence with their antecedents. Echoics and textuals (see the subsequent sections) have such a correspondence. Tacts evoked by environmental stimuli soon become more complex due to the recombinative arrangements with other verbal operants; intraverbals, described later, help generate continuous speech in the absence of a parade of physical stimuli.

While mands are likely to be reinforced by unconditioned reinforcers, generalized conditioned reinforcers always reinforce tacts. In most situations, these reinforcers are verbal responses of listeners: right, correct, good, I agree, I think so, very interesting, and so forth. These and other reinforcers are often interchangeable, and the speakers will get reinforced as long as their verbal responses bear a conventional correspondence with their antecedents. Lack of correspondence can lead to conditioned punishers: No, that is not green; I don't agree, I see it differently, and so forth.

Like any other response, tacts, once conditioned, will generalize to similar stimulus situations. Various kinds of generalization vastly expand the tact repertoire. A generic or simple generalization of a tact is observed when a child or an adult produces an established tact ("ball" or "pen") to a new stimulus (e.g., a new ball or a new pen). More complex forms of generalized tacts are involved in what are considered metaphor (including simile) and metonymy.

Metaphorical generalizations (which create what are generally called metaphors), philosophically and linguistically thought to be a cognitive and creative achievement of a high order, also are a special kind of tact under more refined discriminative stimulus control. Skinner's (1957) example, Juliet is the sun (metaphor) or Juliet is like the sun (simile), shows that the variables that controlled Romeo, the speaker, are the sun and Juliet, which shared some common stimulus property that affected him. Creative as they may be, metaphors and similes arise out of discriminated and shared properties of the stimuli that control them, not out of some presumed cognitive processes or intellectual achievements.

Metonymical generalization of tacts accounts for verbal operants that seem to have no controlling stimuli (Skinner, 1957; Winokur, 1976). Metonymy is the act of naming something with another word that is associated with it. In behavior analysis, metonymical expressions seem to lack a relevant stimulus, as shown in the example that follows. These verbal operants pose a particularly difficult problem for linguists and cognitive theorists. How do speakers tact objects that are missing, which they do all the time? If the object is missing, and the response is "about" that object, what is the discriminative stimulus for that response? In linguistic-semantic analysis, responses of this kind are classified as nonexistence. In cognitive analyses, the speaker emitting such a response is said to "recognize the absence of an object that was once present." Unfortunately, what recognition is, and how the absence of something is recognized, pose additional explanatory challenges. A missing object cannot be a stimulus for a response, just as a missing cause does not produce an effect. For example, when a child, while looking at the toy shelf, says, "No truck," we cannot conclude that the missing truck controlled that response. Many other objects, not just trucks, were also missing, but the child did not tact them. The response "No truck" is actually controlled by the currently present stimuli (e.g., toys that are present, along with perhaps the empty space on the shelf) that coexisted with the missing truck. The toy truck has been a part of those stimuli in the presence of which the response truck has been reinforced in the past. Clusters of stimuli have common elements, and a response conditioned to one of them is also conditioned to all or some of the individual elements in clusters. The speaker who again confronts one or some of those elements is likely to emit the response in question. This is called metonymical extension (generalization) of responses.
Metonymical generalizations also account for more complex tacts than just naming a missing object (Skinner, 1957; Winokur, 1976). A journalist's report that the "White House asserts that the recession is over" conveys what the President or a spokesperson has in fact said. Specific speakers and the White House are commonly associated with each other; therefore, they share a controlling relation to the tacts.

Certain processes governing tacts lead to abstraction. A controlling stimulus is typically composed of multiple and discriminable (isolatable) properties such as shape, size, color, texture, configuration, use, function, and so forth. A verbal response under the control of an isolated discriminable property of a stimulus is an abstract response. Skinner wrote that "abstraction is a peculiarly verbal process because a nonverbal environment cannot provide the necessary restricted contingency" (1957, p. 109). In other words, a nonverbal environment cannot teach abstractions; only a verbal community can. For instance, in teaching the child an abstraction of red to redness as such, the verbal community (or a clinician) will have to reinforce the tact "red" made in relation to objects that are red, but vary in shape, size, texture, and to a point, hue. However, because these irrelevant properties (e.g., shape or size) also gain some control over the verbal tact "red," the teachers must reinforce differentially. The response red is reinforced always and consistently in the presence of redness, regardless of the other properties of red stimuli. In this kind of teaching, irrelevant stimulus properties do not covary with reinforcement whereas redness does, and thus redness comes to control the tact "red." As a result, a response is created that tacts an abstract property of a stimulus that varies in other properties.
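
The covariation argument in the preceding paragraph can be checked with a small worked example. The Python sketch below is a hypothetical illustration, not a model from the literature: the stimulus set, the `reinforced` teaching rule, and the `covariation` measure (the proportion of reinforced trials when a property is present minus the proportion when it is absent) are assumptions introduced here. Each combination of color, shape, and size is presented once, and the response "red" is reinforced only when the object is red.

```python
from itertools import product

# Hypothetical stimulus properties used only for this illustration.
COLORS = ["red", "blue"]
SHAPES = ["ball", "block"]
SIZES = ["small", "large"]


def reinforced(color: str) -> bool:
    """Assumed teaching rule: the tact "red" is reinforced only for red objects."""
    return color == "red"


# One trial per combination of properties: ((color, shape, size), reinforced?).
trials = [((c, s, z), reinforced(c)) for c, s, z in product(COLORS, SHAPES, SIZES)]


def covariation(trials, index: int, value: str) -> float:
    """P(reinforced | property present) minus P(reinforced | property absent)."""
    present = [r for props, r in trials if props[index] == value]
    absent = [r for props, r in trials if props[index] != value]
    return sum(present) / len(present) - sum(absent) / len(absent)


if __name__ == "__main__":
    print("red:  ", covariation(trials, 0, "red"))    # 1.0
    print("ball: ", covariation(trials, 1, "ball"))   # 0.0
    print("large:", covariation(trials, 2, "large"))  # 0.0
```

Under these assumptions only redness covaries with reinforcement, while shape and size do not, which is the restricted contingency that a verbal community arranges through differential reinforcement.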

Skinner's analysis of tacts is extensive and includes provocative discussions of how people come to tact private stimuli--stimuli that arise within the speaker's body--and, more importantly, of how the verbal community manages to arrange contingencies of reinforcement for them. In addition, discussions of problems of reference and meaning (Hegde, 2008b) and of a variety of literary behaviors are included.

Clinical Implications

Generally, SLPs do a good job of teaching tacts, although the clinicians have tended to conceptualize what they teach in linguistic terms (naming and describing objects and events). Although tact teaching is important, there may sometimes be an overreliance on teaching simple tacts at the expense of other verbal operants, especially mands, intraverbals, and autoclitic (often grammatic) relations. Simple tacts are typically the responses given to the mand, "What is this?" Except at this simplest level of object naming, tact training will include different types of verbal operants. Individual words expanded into topographically more complex combinations of verbal operants (phrases and sentences) include mands, intraverbals, and autoclitics (certain morphologic and syntactic aspects). As we shall see in a later section, most everyday speech is a combination of different verbal operants. For instance, after teaching a child to tact "ball" to certain round objects, the clinician may teach the child to produce The ball is red, which consists of two tacts (ball and red) with two autoclitics (the and is). Similarly, the child may learn to say, give me that red ball, which consists of a mand, two tacts (red and ball), and an autoclitic (that). Therefore, pure tact training should soon give rise to a higher level of training in which different verbal operants are combined into what are commonly called sentences and what Skinner (1957) considered larger segments of VB resulting from autoclitic activity.

After the mand, the tact is the second most frequently targeted verbal operant in VB treatment research (Sautter & LeBlanc, 2006). Possibly, if child language treatment research published in the journals of speech-language pathology and related disciplines is included, the tact may be the most frequently taught verbal operant. In most VB treatment research, tacts were taught in combination with other verbal operants. Several studies also have analyzed the functional independence of tacts from other verbal operants (mands, intraverbals, and echoics). The findings have generally supported Skinner's assertion that verbal operants have different causes and need separate training, although in a few studies generalization across functional units has been noted (Sautter & LeBlanc, 2006).

Verbal Behaviors Caused by Other Verbal Behaviors

Skinner wrote that "behavior generally stimulates the behaver" (1957, p. 138), and VB can stimulate other VBs. Intraverbals, echoics, and textuals are the three kinds of VBs whose controlling variables are VBs themselves.


VBs whose controlling variables are prior verbal responses are called intraverbals. A defining characteristic of intraverbals is that there is no point-to-point correspondence between intraverbal responses and their stimuli. Such a correspondence is more evident in tacts, and most evident in echoics (imitative responses that duplicate their own stimuli) and textuals (naming printed stimuli--reading).

Some intraverbals may be generated by another person's verbal responses. A speaker may say "four" when someone utters "two plus two is ..." However, the most important classes of intraverbals are those that are controlled by the speaker's own prior VBs. Speech, once initiated by some variables, is capable of evoking more speech in the same person. Much of everyday conversation is intraverbal, as are serious discussions. One speaker's production of "Why?" often evokes the production of an intraverbal "Because ..." in another speaker. Similarly, "Fine, thank you" may be an intraverbal response to "How are you?" (Skinner, 1957). The instructor who asks the class, "What is a discriminative stimulus?" hopes that the question will generate an intraverbal response of "A discriminative stimulus is . . ." The instructor also hopes that what follows is an accurate (reinforceable) and complete intraverbal. Most of lower or higher education is designed to generate intraverbal responses that the ordinary verbal community may not establish. As these examples show, an utterance is not only a response to some other variable, but it also is a stimulus to subsequent utterances. People often "go on speaking" not because of a "train of ideas" rushing inside their heads, but because of the stimulus function of their own verbal responses. Most likely, intraverbal control is a significant contributor to speech fluency (Hegde, 1982).

Intraverbals can be either chains or clusters (Winokur, 1976). Intraverbal chains have a fixed order upon which the delivery of reinforcement is contingent. In reciting a poem, one part controls the other in a sequential manner. The child acquires the alphabet as a chain in which one letter supplies the necessary stimulus for the next. When the recitation of the alphabet gets interrupted, the child usually goes back to recreate the stimuli for subsequent responses. Counting is chaining, as are formulas, syllogisms, and symbolic logic. History is taught and learned as intraverbals. Skinner cautions, however, that "any one link in a chain of responses is not under the exclusive control of the preceding link" (1957, p. 72), because repeating just the last emitted letter of an interrupted recitation of the alphabet may not reinstate the chain.

Intraverbal clusters are groups of verbal operants that can evoke each other with no specific order or grammatical connection. Moreover, clusters are bidirectional, while chains are unidirectional. Word association test responses are clusters. The verbal response "ring," for example, can serve as a discriminative stimulus for clusters, such as: (1) "gold," "diamond," "hand," "finger," "engagement;" (2) "noise," "clang," "bell," "door;" (3) "worm," and perhaps other clusters (Winokur, 1976). However, a member of any one cluster is usually not a member of any other cluster, although all the clusters in question may have the same discriminative stimulus.

Intraverbal clusters are of two types, thematic and formal (Winokur, 1976). In thematic clusters there is no acoustic similarity between the verbal response that serves as the discriminative stimulus and the response it evokes. There is such a similarity in formal clusters. In thematic clusters, verbal stimuli and responses evoke each other because they share common "meaning" in the sense that they have entered into similar contingencies of reinforcement. For example, "ring," "noise," "clang," and "bell" are all about the same thing. In formal clusters, stimuli and responses come together because they sound similar: the word "hat" may evoke "cat," "mat," "pat," and "chat." Thus, rhyming as a phonological skill may suggest not some kind of awareness or knowledge, but phonetically controlled formal clusters (intraverbals). Formal clustering is explained on the basis of response induction, a behavioral process in which new responses similar in form to those reinforced earlier are likely to occur.

Intraverbal relations have also been described in terms of divergent and convergent control. A cluster that consists of the verbal stimulus "chair" and several evoked intraverbals (e.g., "table," "sofa," "dining," "reclining") illustrates divergent intraverbal control; a single stimulus evokes multiple and varied intraverbal responses. A different cluster that consists of varied verbal stimuli (e.g., "four legs," "made of wood," "something to sit on") and a single evoked intraverbal "chair" illustrates convergent control (Axe, 2008). Because of their general complexity, intraverbals require not a simple discrimination, but a conditional discrimination, in which one verbal stimulus changes the evocative effect of another verbal stimulus, and in combination, they evoke an intraverbal response (Axe, 2008). See the next section for examples and teaching implications of conditional discrimination in intraverbal relations.

Clinical Implications

SLPs who describe language disorders in terms of lack of continuous speech, limited conversational skills, impaired sentence completion, lack of topic initiation and maintenance, limited production of synonyms and antonyms, and limited production of proverbs and common sayings are indeed describing impaired intraverbal relations. Impairment in intraverbal relations is a higher level VB disorder than deficiencies in producing mands and tacts. To establish the higher level intraverbal skills in the repertoires of children and adults with language disorders, SLPs first need to teach the more basic verbal operants, including echoics, mands, and tacts. For instance, to learn the intraverbal, "In the winter, the big white bears hibernate," emitted as an answer to the question, "What do the big white bears do in winter?," the client should have all the tacts of that intraverbal in his or her verbal repertoire; if not, the clinician will have to first establish them.

In the SLP literature, there are few or no studies on teaching explicitly described intraverbal relations to children and adults. Nonetheless, SLPs have taught intraverbals to their clients, though they have not conceptualized them as such. Clinicians often teach words to children with language disorders by (a) presenting a stimulus, (b) asking a question (e.g., "What is this?"), (c) immediately modeling the response (e.g., "Say, ball"), and (d) reinforcing the correct response from the children. Later, they fade the model, and just present the stimulus, ask the question, and reinforce the response. This sequence of procedures is a way of establishing intraverbal relations. Clinicians routinely exploit a premorbidly well established intraverbal relation, even if it is currently weakened, to evoke intraverbal responses in clients with brain injury. For instance, clinicians often prompt in the form of an incomplete sentence (e.g., "You eat with a ...") to evoke an intraverbal response "fork" from a client with aphasia. Phonetic cues (e.g., "The word starts with ap ...") or cues based on the use of an object (e.g., "You write with it") also are examples of strategies to train or retrain intraverbal relations. Asking children to name individual members (e.g., cats, rats, dogs) of a class of stimuli (animals) is another method of establishing intraverbals. Responses to such directions as "Name some animals" are intraverbals. Intraverbals are established as well when children learn to respond with synonyms when an antonym is supplied (and vice versa) or when they learn to give correct definitions when asked to define terms.

In VB treatment research, there are several controlled treatment studies on teaching intraverbal relations to clients who lack them. These studies help guide SLPs in developing treatment programs to increase explicitly defined intraverbal relations in children and adults who have a limited verbal repertoire. Cihon's (2007) review of studies on training the intraverbal repertoire and the specific studies cited in it are helpful to clinicians in designing their own intraverbal treatment programs for children and adults with VB (language) disorders. Cihon describes peer training, conversational speech training, discrete trials, and several other educational methods (e.g., direct instruction and precision teaching) as effective procedures to establish intraverbal relations.


Much has been written on the role of imitation in language learning. It may be so because echoics (imitated verbal responses) are perhaps the earliest of vocal operants. An echoic is a verbal response whose acoustic pattern resembles that of its own verbal stimulus. In effect, an echoic reproduces its own stimulus. Echoics are reinforced when the responses more or less accurately reproduce the acoustic characteristics of their stimuli and closely follow the stimuli. Repeating what someone said in the recent or remote past is not an echoic verbal operant. Echoics are verbal operants preceded by the same or closely resembling verbal stimuli (Skinner, 1957). Because they are reinforced verbal operants, no special faculty or instinct is presumed to be responsible for them.

Listening consists mostly of covert echoics. [Incidentally, listening, the other part of VB, is not covered in this paper; but see Schlinger, 2008b for the Skinnerian view that listening is behaving verbally.] At the least, a listener may covertly repeat the important parts of a speaker's responses. The recipient often covertly and overtly repeats complicated traffic directions. In everyday conversation, speakers tend to echo one another's words or peculiar phrases. Certain work orders or complicated instructions also may be immediately repeated (echoed). Skinner (1957) also describes self-reinforcing self-echoic behaviors that are often called palilalia if they are excessive, apparently uncontrolled, or pathological in other ways. Even the so-called verbal perseveration seen in individuals with neurological disorders may be described as fully or partly self-echoics. It is important to note, therefore, that echoics are not limited to simple imitations.

Echoics in a child's repertoire give a distinct advantage to the verbal community that teaches its language. The teaching begins with babbling, but both the infant and the caregivers have some ways to travel before they arrive at helpful echoics. Echoics as early operants are shaped from the baby's tendency to initially exhibit unconditioned and undiscriminated babbling (i.e., nonoperant vocalizations that are neither an echoic nor any other kind of verbal operant). Nonoperant babbling occurs when well fed (and relaxed) babies are lying on their backs, and the air movement through the vocal folds causes them to vibrate, resulting in random sounds or noises. Because it is random (not yet selected by reinforcement), it is likely to include sounds that are not specific to the infant's verbal environment (the family "language") (McLaughlin, 2006). That the infant's babbling may include some sounds alien to the infant's verbal environment is neither surprising nor remarkable. There is no justification to hypothesize that the nonoperant and unselected (unreinforced) babbling is due to some innate mechanism because such babbling is a physiological-aerodynamic phenomenon.

The reinforcing effects the caregivers themselves experience from the infant's nonoperant babbling establish an interlocked set of caregiver-infant reactions that help the emergence of other verbal operants. The interlocked chain of reactions includes the parent's echo-babble, the infant's self-echoic babble, the caregivers' reinforcement, the infant's operant babble, and the more refined reinforcement made contingent on sounds of the surrounding (family) environment that eventually help establish mands, tacts, and other verbal operants. It is interesting to note that the initial reinforcement occurs for the caregivers, who (in everyday language) take delight in the random sounds or syllables their babies produce. That their babies' vocal sounds and syllables are conditioned reinforcers to the caregivers and (soon) to the babies is central to the development of initial echoics and their transformation into other verbal operants. Babies' babbled sounds and syllables reinforce the caregivers because the caregivers will have had a long verbal learning history; speech (and its individual sounds) has been socially reinforcing to them; lack of speech is socially aversive. The reinforcing effects of their babies' nonoperant babbling cause two main changes in the caregivers' behaviors. First, they echo-babble their babies' babbles. That is, the caregivers repeat what the babies babble. Second, the caregivers produce similar sounds and syllables while they are reinforcing their babies with their caretaker routines: changing, bathing, drying, dressing, feeding, holding them closely, and playing with them. The vocal sounds of the caregivers, by association with such reinforcers, acquire conditioned reinforcement value for the babies. Next, the sounds the babies hear themselves produce are immediately and automatically reinforced, causing an increase in babbling. It is known that babies who are deaf begin to babble but soon stop--possibly because of lack of automatic reinforcement.
This reinforcement also may be the reason why babies self-echo; that is, they repeat their own babbled sounds in a stage of language development researchers call reduplicated babbling (e.g., da-da-da) (McLaughlin, 2006). A different type of conditioning also begins to take place. Parents, hearing the babies' babbled sounds, syllables, and self-echoics, begin to socially reinforce them (e.g., by smiling, tickling, picking the babies up). There is experimental evidence that such contingent social reinforcers increase babbling in babies. (See Hegde, 1998, for reviews of classic studies and Bloom, Russell, & Wassenberg, 1987; Goldstein, King, & West, 2003; Goldstein & Schwade, 2008; Gros-Louis, West, Goldstein, & King, 2006, for more recent studies in which caregiver reinforcement has systematically increased babbling in babies.)

One typical objection to this analysis is that parents do not plan to reinforce and that they cannot reinforce consistently (Owens, 2005). A point worth noting is that the caregiver reinforcement need not be planned (McLaughlin, 2006), nor need it be continuous, to increase nonechoic, echoic, or self-echoic babbling. There is evidence that caregivers attend to (reinforce) about 50% of infant vocalizations (Gros-Louis et al., 2006). It is well known that intermittent (less than 100%) reinforcement increases both the frequency and the strength of behaviors so reinforced (J. Vargas, 2009). Indeed, because of intermittent reinforcement, infant vocalizations resist extinction when reinforcers are not forthcoming (Goldstein, Bornstein, Schwade, Baldwin, & Brandstadter, 2007). Nonetheless, in their sources on language development, SLPs are likely to be told, without any review of studies, that "There is very little evidence that the infant's babbling is shaped gradually by selective reinforcement" (Owens, 2005, p. 77). Nor is there any evidence that the effects of reinforcement are always slow and gradual, even if these could be operationalized.

There is now stronger evidence that self-echoic babbling, caregiver echoics, and reinforced babbling serve as a basis for more complex verbal operants, because one theory that discounted that possibility has been discredited. This is Jakobson's (1968) theory of discontinuity between early vocalizations and later language development. Most researchers now agree that there is continuity, and thus there is support for conditioned echoics playing a significant role in VB learning (McLaughlin, 2006; Pena-Brooks & Hegde, 2007). Later on, babies' responses at the word level are often partial echoics (approximations), as in dada for daddy. Parents tend to reinforce them initially, but gradually the parents require progressively better approximations, resulting in complete echoics. While partial or complete echoics are being reinforced, other variables, such as persons, objects, and events, also are present. Eventually, these variables gain discriminative stimulus control over the response.

Michael (1982) suggests a new term, duplic, to extend Skinner's echoic verbal operant to include sign language, which does not have a vocal stimulus that typically evokes a vocal echoic (or other verbal operants). The new term captures the essence of a verbal operant whose stimulus need not be verbal; when one person--often a teacher--signs, the other person--often a learner--copies the sign (imitating, not "replying"). The term duplic includes not only sign imitations, but also copying texts. The term captures both the auditory and visual stimulus modes, and therefore, may be preferable to the term echoic.

Clinical Implications

Regardless of the theoretical differences on the importance of echoics in language learning, SLPs routinely establish echoics (typically called imitations) in most of their treatment sessions. Echoics may help establish verbal operants of any topographic feature: words, phrases, and sentences. The verbal stimuli SLPs give to evoke echoics in their clients are called modeling, although within a behavioral framework these models are also mands (e.g., "Say, ball"). In everyday speech, modeling is not the necessary stimulus for echoics. In the typical verbal environment, a speaker need not manipulate a stimulus to call the resulting response an echoic. A speaker who says, "Remember, meeting at 3 p.m. today," is not necessarily modeling for the listener to echo; but the listener may still overtly or covertly echo it: "meeting at 3 p.m. today." The initial speaker may still reinforce the listener's overt echoic ("Yes, that is the time!") while not having explicitly set the stage for it.

Some of the earliest treatment studies on child language and speech disorders, conducted in the 1960s and 1970s, reported that the modeling-imitation-reinforcement sequence was effective in establishing echoics that may be shaped either into topographically more complex responses or into other classes of verbal operants (see Hegde, 1998, for details). For instance, once such tacts as ball and big are established, the client may be taught such mands as I want big ball or such tacts plus autoclitics as that is a big ball. Verbal operants that are at zero baseline almost always need to be first established as echoics. Shaping should be added to echoics when echoics are not produced on baselines (Hegde, 1998). Echoics have been found to be a useful initial teaching target in treating adult communication disorders, including aphasia, apraxia of speech, and dysarthria. Whenever echoics are established as a starting point in treatment, clinicians fade modeling to bring the client's verbal responses under the control of more natural stimulus conditions (e.g., motivation in the case of the mand, and discriminative stimuli in the case of the tact).


Skinner defined the textual as a vocal response that "is under the control of a nonauditory verbal stimulus" (1957, p. 66). Most commonly, the printed text is the visual controlling (causal) variable of textual behavior, commonly called reading. Skinner defined a reader as a "speaker under the control of a text" (1957, p. 65). Other visual stimuli that control textuals include pictures, pictograms, phonetic symbols, hieroglyphs, Braille, and other visual forms that evoke textuals. In reading, printed stimuli are covertly or overtly named, but are not described, and for this reason, textuals are said to be functionally equivalent to the proper name of the stimulus (printed characters, words, other symbols). The stimulus control is precise, or there will be defective textuals (misreading).

Michael (1982) suggests the term codic to include the textual and to appropriately extend this type of verbal operant to taking dictation. In both the textual and dictation taking, the stimulus is verbal, although the verbal stimulus for the textual is print and the verbal stimulus for dictation taking is vocal. The response form, however, is different: it is reading (a vocal response) in the case of the textual and writing (a motoric response) in the case of dictation taking. Skinner's analysis of the textual and the more recent extensions (see Michael, 1982, 1985; E. Vargas, 1982, 1988) include such varied phenomena as self-textuals ("making notes to oneself"), transcription, writing, copying printed material, and so forth. Thus, the Skinnerian analysis is more complete than the fragmented linguistic analysis of phonologic, semantic, grammatic, morphologic, and literacy skills.

Clinical Implications

Treating reading and writing disorders is now within the scope of practice of SLPs. In recent times, attention has been drawn to the concept that reading and writing skills are language-based. It may have taken this long to recognize this simple fact because of the linguistic model SLPs have been following. Transformational generative linguistics paid little or no attention to reading and writing. Skinner, however, had considered all aspects of VB since the 1940s, and his analysis of textuals offers a good starting point for planning remediation programs. Unfortunately, there is not much research in speech-language pathology on teaching textuals to children or adults. In behavioral journals (e.g., Journal of Applied Behavior Analysis, The Analysis of Verbal Behavior), clinicians can find ways of teaching reading and writing skills that are better based on experimental methods than are the literacy intervention approaches often described in SLP journals and textbooks.

Space will not permit a more detailed examination of textuals as clinical treatment targets. It is, however, useful for SLPs to consider integrating textual training with speech-language training (Hegde & Maul, 2006). For example, pictures used to evoke (along with mands) VBs of specific topographic features (phonemes, words, phrases, sentences) may be accompanied by printed stimuli. When teaching the production of phoneme /b/ or the tact ball with the help of a stimulus picture, the clinician may have the word ball printed on the stimulus card. While evoking either an echoic ball or a tact ball, the clinician also can point to the printed word ball to evoke and reinforce a textual response. If clinicians follow the VB approach to teaching reading and writing, they will avoid such ineffective procedures as teaching phonological awareness or teaching encoding and decoding skills, and such other unproven cognitive-linguistic approaches.


The listener is an important part of the contingency governing VB, which typically occurs only in the presence of other individuals who reinforce it. The term audience refers to the effect that listeners have on verbal operants. Therefore, audience is not a class of verbal operants like the ones described so far. An audience is defined as "a discriminative stimulus in the presence of which verbal behavior is characteristically reinforced and in the presence of which, therefore, it is characteristically strong" (Skinner, 1957, p. 172). An audience may determine all that is said on a given occasion, although this happens infrequently. Audience is usually a supplementary variable; its effects are additive to the strength of other primary variables. As a supplementary variable, an audience may have two kinds of effects.

First, an audience may determine such production effects as audibility of utterances. On such occasions, the audience may not have any effect on what is being said, only on how it is said. A strong primary variable makes speech very probable, but normally it will be uttered aloud only in the presence of an audience. In certain other social situations, strong primary variables of speech may be absent, so that people confronting each other "do not have much to say." The presence of each other, however, might constitute a strong audience effect (strong discriminative stimuli for speech). People then talk about the weather or make "small talk." An opposite effect is seen when the audience present is too weak, and the primary variables are very strong. The person is "itching" to speak but confronts the wrong kind of audience--one that has not been associated with reinforcement in the past. The speaker may nevertheless blurt out something, and then may hastily add "please don't mind, I am just bubbling today."

Second, as a supplementary variable, an audience may partly determine what is being said, although some politicians may be accused of letting the audience determine all that they say. When an audience becomes a part of the controlling variables, its effect is usually to select a particular response. The presence of a single primary variable raises the probability of the emission of several verbal responses that are typically conditioned to it. The audience present may determine which one of these actually gets said. After having tested a child, a language clinician is likely to respond with different sets of words to describe the "problem" depending on whether the listener is a junior student in speech pathology, a graduate student, a colleague, or the mother of the child. People who are bilingual switch languages depending on their audience. Among themselves, members of professions emit their own jargon, and peer groups emit their slang. These are all instances of audience control of what is being said.

Clinical Implications

For the most part, SLPs initially establish verbal operants in the clinic. As they teach various verbal skills to their clients, they also serve as the initial audience for the clinically established VBs. Consequently, the clinicians become discriminative stimuli for the clinically established VBs. Unless the clinicians take additional steps, other people the clients encounter may not evoke the newly established VBs because their discriminative stimulus function is not yet established. This is often described as a problem of generalization; but it is indeed a problem of lack of required additional conditioning of VBs to audiences other than the clinician (Hegde, 1998).

To overcome that problem, clinicians usually bring other people, including family members, into the treatment sessions. Fellow clinicians also may take part in treatment sessions. The clinician may move treatment sessions to more natural settings where additional people serve as audience for the newly established verbal skills. In essence, audience generalization is a matter of transferring the stimulus control from clinician to other persons with whom the client regularly interacts.

Multiple Causation of Verbal Operants

Motivation (causing mands), environmental events (causing tacts), another speaker's speech (causing echoics), printed or other visual stimuli (causing textuals), one's own speech (causing intraverbals), and the audience (producing an effect on those verbal operants) exemplify the six kinds of stimulus control of the first five kinds of verbal operants. These controlling variables do not exert their influence in isolation, however. Typically, several variables concurrently control the same verbal operant, and different sets of variables control different operants. Consequently, (1) a single variable evokes (causes) multiple responses, (2) multiple variables evoke a single response, and (3) different classes of verbal operants constitute single utterances. Only in echoics (duplics) and codics (textuals and dictation taking) do we see single variables controlling their respective single responses.

That a single variable controls more than one response is the basis of much of the speech emitted in relatively constant environments. People do not need a parade of stimulus events to talk for extended periods of time. One variable is sufficient to generate much speech, which serves as stimuli for more speech (intraverbal control). Therefore, there is hardly any dearth of stimuli. In fact, the problem might be to explain how certain responses, though probable, do not get emitted and how the responses that were emitted got selected. Skinner (1957) rejected the mentalistic idea that speakers choose their words, because it creates a more difficult problem: explanation of an elusive choosing process and an inner speaker who chooses. Instead, Skinner proposed that the number and strength of variables present in a given situation determine the selection of verbal operants. The effects of two or more controlling variables are additive. For a specific response, therefore, the greater the number of currently active variables, the higher the probability of its occurrence. However, a few stronger variables with a longer history of reinforcement may exert greater influence on response selection than several weaker variables with defective or recently instituted contingencies. Additional supplementary strengthening in the form of specific audience effects is also a source of response selection. On the basis of multiple causation, Skinner described different kinds of supplementary strengthening and explained various kinds of neologisms, intrusions, slips, and distortions in speech.

Different classes of verbal operants, such as tacts, mands, and intraverbals, are often parts of a single utterance. The utterance, Fool! You think I am a fool? contains an echoic and a tact. Attention, kids! contains a mand and an audience effect. The mand, May I have a large cup of strong coffee also contains multiple tacts (large, cup, strong, coffee) and autoclitics (a, of).

Clinical Implications

Except at the very basic level, VB training involves multiple verbal operants. Pure mands or pure tacts, though rare, may be taught with limited response topography (words, phrases). Teaching children to say cookie, sock, or book as either a mand or a tact may be initially necessary, but excessive training at this level will make it difficult to shift training to more complex verbal operants. More importantly, the simpler and purer verbal operants will not be as effective as the more complex verbal operants in modifying the listener responses.

Response complexity may be increased only by including different verbal operants into single utterances. The traditional sentence typically is a combination of multiple verbal operants, including autoclitics, described in the next section. Conversational skills training--a high level of verbal training--will include, in different combinations, mands, tacts, partial or full echoics, self-echoics, autoclitics, and intraverbals, as noted previously.

Autoclitics: Grammar and More

In Skinner's (1957) analysis, mands, tacts, intraverbals, echoics, textuals, and audience effects are primary verbal behaviors, which do not include grammar and word order. This position contrasts with the Chomskyan linguistic theory in which grammar is primary, and has the status of an independent variable in that an innate grammar is essential for language development. However, Skinner did not ignore word order and grammar. None of the abundant linguistic criticisms have come to grips with the subtleties of Skinner's analysis of grammar and word order. According to Skinner (1957), grammar and word order are secondary to the primary verbal operant classes described previously. Grammar is secondary to first having something to say. Without a repertoire of primary verbal operants, a speaker has no use for grammar (including word order). Therefore, word order and grammatical features are dependent variables. They describe certain effects, not causes. Skinner described the secondary processes of grammar and word order under the term autoclitic. The term, however, includes several verbal phenomena not included under the linguistic analysis of grammar and word order.

Skinner wrote that "part of the behavior of an organism becomes in turn one of the variables controlling another part" (1957, p. 313), and that "parts of language deal with other parts of language" (1957, p. 341). In other words, some verbal operants discriminatively tact other verbal operants. Thus, when we "tact our own verbal behaviors, including its functional relationships" (Skinner, 1957, p. 314), we have autoclitics. As they talk, people tell their audience how, what, and why they talk; they fine-tune the listener reactions by making specific comments about their speech. Such specifications of what, why, and how speech is being emitted are autoclitics, which include grammatical features. In essence, autoclitics include speech about speech, the controlling variables of speech, or specifications of certain aspects of stimuli that prompted speech. It is in this sense that autoclitics are secondary VBs controlled by other, primary, VBs. Primary VBs are the controlling part; autoclitics are the parts controlled. Both are VBs, but autoclitics are not controlled by those that control primary VBs (e.g., motivation and physical stimuli).

It has been noted that the term autoclitics includes not only grammatical words and word order, but many other kinds of verbal responses that are traditionally not included under grammar. Responses such as I said, I see, in other words, certainly, perhaps, are all autoclitics too. Any part of an utterance which discriminatively tacts any aspect of the controlling variables and their relation to primary operants is an autoclitic.

Varieties of Autoclitics

Autoclitics are of different kinds. A large group of them are descriptive. "I said ...," is an autoclitic which describes the speaker's prior verbal responses, while I was about to say describes an imminent response. Some of the descriptive autoclitics tact the controlling variables of the verbal operants they accompany. A statement such as the President is in Copenhagen may be made for any one of several reasons. But if the speaker, reading a newspaper, adds I see to that statement, the listener knows why the speaker said what he or she did. In other words, the controlling variable of the statement (the printed story in the newspaper) is tacted for the benefit of the listener. The same statement, when preceded by Tom said (that the President is in Copenhagen) identifies a different variable controlling the primary VB.

A number of autoclitics specify the strength of the VBs they are a part of. Autoclitics like I guess, I imagine, indicate that what follows in each case is not very strongly determined, whereas I am certain that, I know for sure that, indicate that subsequent responses have strong controlling variables. Although the primary VBs (the responses that follow or accompany those autoclitics) are the same, the effects on the listener are not. Thus, different autoclitics modify listener reactions in specific ways. Autoclitics like I hate to say tact the emotional state of the speaker, while those like I agree, and I should say describe the relation between the response and other VB of the speaker or listener, or the situation in which the behavior occurs.

Some autoclitics qualify responses they accompany, and include negation and assertion. The linguistic analysis of negation implies that the absence of previously experienced objects is the source of negation. But as Skinner says, this is "clearly impossible in a causal description" (1957, p. 322), because a thing that does not exist cannot cause anything. If the response no rain is controlled by the absence of rain, "why do we not emit a tremendous flood of responses under the control of the absence of thousands of other things?" (Skinner, 1957, p. 322). In the example no truck discussed earlier, no is an autoclitic controlled by the verbal operant truck, which is in turn controlled not by its absence, but by the presence of stimuli, which, on previous occasions, accompanied the object truck. The response she is not Swedish contains the autoclitic not, which is controlled by she and Swedish, both made likely by primary variables. In this case, not implies that Swedish is not a part of the tact for she. Put differently, not implies that the controlling variables for the two tacts, she and Swedish do not covary and have not been reinforced as a verbal operant.

The most common assertive autoclitic in English is is. It tells the listener something about the causal variables of responses and the relationship among them. The verbal response that chair is big contains two tacts and the autoclitic is. The autoclitic specifies that the same discriminative stimulus controls both that chair and big. In other situations, is is also controlled by the temporal characteristics of the stimulus. The English auxiliaries is and was assert relationships between responses, and responses and their antecedents, but different temporal aspects govern their emission. Technically, assertive autoclitics imply that the operants involved are either tacts or intraverbals. Mands, echoics, and textuals are not typically asserted.

Quantifying autoclitics tact the numerical properties of the controlling stimuli that prompted a primary verbal response. The articles the and a specify the relationship between responses and their controlling variables in two ways. They tact the numerical properties of the controlling variables, their specificity, or both. The definite article the literally points at the controlling variable of the response it accompanies. The articles also make it possible to obtain a more precise response from the listener. Other autoclitics like all, some, few, mostly, one, many and so forth are similar: they specify the numerical properties of discriminative stimuli and their relation to verbal operants.

A number of fragmentary tacts, which are autoclitics, appear in the form of grammatical tags, inflections, or bound morphemes. Tags like -s, -ed, and -ing (e.g., spills, spilled, and spilling) tact the temporal relationship between the controlling variables and the speaking. Possessive inflection -s, on the other hand, tacts the controlling relationship between two tacts in the same utterance. It also informs the listener that the discriminative stimuli for the two tacts (boy's hat) are likely to covary. The two conjunctions and and or imply that more than one verbal operant is being controlled by the same discriminative stimulus. In the case of and, however, the two operants are compatible in relation to their single stimulus (small and beautiful). In the case of or the two operants are not compatible (genius or crock). Most, but not all, prepositions tact the spatial relationships between the controlling variables of verbal operants. When a speaker says I heard the sound of music, the autoclitic of discriminatively tacts the variable responsible for the sound. Sound could be of anything, and consequently, as a verbal response, sound could be controlled by any one of several variables. The autoclitic of discriminatively tacts the specific controlling variable of a current verbal response.

No attempt will be made in this paper to catalog all grammatical words and tags in terms of their autoclitic function. The point is that grammar can be accounted for in terms of a causal analysis as against the structural analysis. One final question, however, needs to be considered: Why do autoclitics occur at all? Why do speakers tact the controlling variables of their own VB? Clearly, the speakers need not tact the controlling variables for their own sake, because they already are in touch with them, or else they would not speak. It is often the listener who has no access to the speakers' controlling variables. Therefore, listeners ask questions like when, where, how many, how sure are you, why do you say that, and so forth. Parts of answers to such mands (questions) are autoclitics. In sum, autoclitics occur for the benefit of the listener, who needs to know, (1) why an utterance is made, (2) how strong is that utterance (operant strength), and (3) what temporal, spatial, physical, quantitative, and other properties of stimuli govern that utterance. The precision of listener reaction depends on this kind of information. An imprecise listener reaction is a defective contingency for the speaker. In the final analysis, therefore, the speaker is a beneficiary too.

In Skinner's analysis, ordering (syntax) seen in verbal responses is also partly due to the autoclitic process. But there are other causal variables that help order VB. Some ordered VB may not involve an ordering process at all. For example, the child may initially learn phrases like hi there, how are you, fine, thank you and so on as single functional entities. For adults, too, utterances such as have a nice day may be single functional units, not involving an ordering process.

In some situations, word order may correspond to the sequence of relevant stimuli that generate verbal responses. For instance, TV sports commentators' speech is temporally ordered according to the order in which the events (stimuli) unfold in front of them. Order may also be due to the order in which verbal stimuli generate more verbal responses, as in the responses given to free association test items and in all intraverbals (e.g., reciting the alphabet or a number series). This suggests that intraverbal control is one of the multiple sources of order (syntax). Order in echoic responses is due to the order in which the antecedent stimuli are generated for the speaker. Yet another source of order is the relative strength of responses, because the strongest is more likely to be emitted first, and the weakest last, and thus an order would emerge (Skinner, 1957). When a speaker says "Hand me that book, the red one," the order of the verbal operants may be partly determined by their relative strength. Possibly, the main mand part of it ("Hand me [that] book") was stronger than the tact part of it; a weaker tact ("[the] red one") was added because the mand was not effective in generating a quick and correct response from the listener.

Clinical Implications

Educational teaching of elements of autoclitics to children is as old as education itself; elementary schools were once called grammar schools. Initial treatment efficacy studies on teaching grammatical morphemes to children with language disorders were published in the late 1960s and early 1970s. Such studies, however, were conducted by applied behavior analysts (e.g., Guess, Sailor, Rutherford, & Baer, 1968; Schumaker & Sherman, 1970). Subsequently, studies on teaching grammatical morphemes as well as syntactic structures began to appear in SLP journals (see Bricker, 1993 and Hegde, 1998 for historical reviews of treatment research on teaching grammatical elements to children with language disorders). Unfortunately, whether the research was conducted by applied behavior analysts or SLPs, the authors did not describe the treatment targets in terms of verbal operants; many investigators have used, and continue to use, the linguistic terms to describe the skills taught to children with language disorders. The methods of teaching and the research designs in which the treatment efficacy was evaluated were all behavioral, however. This state of affairs continues, even in some explicitly behavioral journals (e.g., Journal of Applied Behavior Analysis). Articles published in The Analysis of Verbal Behavior are exceptional in using both the verbal behavior concepts and behavioral treatment.

In a later section of this paper, I will describe selected clinical research on autoclitics (especially the grammatical morphemes) that has shed light on verbal operant response classes. Teaching various sentence forms as SLPs conceptualize it will always involve autoclitic responses. Conversely, teaching specific grammatical morphemes often (though not always) involves sentences. For instance, to teach the verbal auxiliary is, the clinician needs sentences (combined or ordered verbal operants of different classes): The boy is running, The girl is smiling, and so forth (Hegde, 1980).

The response class research, briefly reviewed in a later section, suggests that it is more efficient to target Skinner's autoclitics and other verbal operants rather than the linguistic categories. As teaching various grammatical elements is a significant part of teaching VBs to children with language disorders, it is hoped that SLPs will begin to adopt Skinner's functional units rather than the linguistic structural units in assessment (see Esch's article in this issue) and treatment.

Receptive Language?

Linguistically and popularly, an appropriate response to a verbal stimulus is often described as receptive language or comprehension. In the behavioral analysis, the terms receptive language and comprehension are even more questionable than the term language itself. The implication that a listener passively "receives" and "understands" (or "processes the signal," as in the popular computer analogy) spoken language or read material altogether misses the point of verbal behavior, which constitutes actions of speakers and listeners. Typically, appropriate behavior to verbal stimuli is the basis for assuming a mentalistic notion of comprehension, which remains unexplained and unobserved. What is observed is either a reinforceable (appropriate) verbal or a nonverbal response to verbal stimuli. In the case of clients who have an obviously limited verbal repertoire, clinicians test comprehension by sampling a correct (reinforceable) nonverbal response to a verbal stimulus. Typically, the client is manded (asked) to point to the named stimulus embedded in a stimulus set. If there is a conventional correspondence between the mand, the pointing, and the physical stimulus, the clinician reinforces the pointing response. Thus, in a behavioral analysis, comprehension is an unnecessary term that skirts the valid stimulus-response-reinforcement contingency that is adequate to handle the relevant observations.

In an extension of Skinner's analysis, Michael (1985) suggests that VBs of pointing at, touching, and responding in other nonverbal ways to verbal stimuli may be called stimulus-selection-based VB. Although teaching stimulus selection (comprehension or receptive language) as a precursor to VB teaching to most children with speech or language disorders is unnecessary, such teaching is a part of augmentative and alternative intervention for adults and children with a limited verbal repertoire. Michael calls the more typical forms of verbal (as well as vocal) behavior topography-based because, for instance, the words dog and cat have different response topographies, whereas stimulus-selection-based VB (e.g., pointing) remains constant regardless of the object pointed to.

Language Treatment Research Supports the Behavioral Analysis

Traditionally, SLPs have relied on structural analysis of language, in which the independent variables are either ignored or assumed to take the form of innate devices, mental schemes, mental images, cognitive concepts, maps, and other presumed entities with little or no empirical validity. As applied scientists, however, SLPs need to intervene in a behavioral process that for some reason has been impaired. They have to produce changes in the language behaviors of their clients. Changes in behaviors can be produced only when their causal variables are under the clinician's control. In this sense, SLPs are more like empirical scientists than structural nativists.

Because the variables figured in the structural analysis are mentalistic and nonmanipulable, SLPs have turned to behavioral intervention for communication and swallowing disorders (Hegde, 1998). Behavioral intervention is also a causal analysis of behaviors, however. Its success as an applied technology depends on its emphasis on a causal analysis of normal or impaired VBs. Whenever some new behavior is clinically modified, the applied technology throws some light on the basic analysis of that behavior.

Treatment Targets Should be Functionally Organized

Clinical research data in treating language disorders in children, published both in the behavioral and speech-language pathology literature, have produced some impressive results that support Skinner's functional units (verbal operants), as against the structural (linguistic) categories. For instance, the VB approach to language intervention has shown that, as Skinner asserted, mands and tacts are functionally independent verbal operants, in the sense that teaching tacts will not automatically result in the production of mands or vice versa (Hall & Sundberg, 1987). Many SLPs know, for instance, that a child who is taught to say "Ball" (a tact) when shown a ball will not automatically produce "I want ball" (a mand). The child who has learned to mand a ball will not necessarily tact it; both need separate training. Interestingly, functional (causal) independence of verbal operants has been demonstrated with chimpanzees (Savage-Rumbaugh, 1984) and pigeons (Michael, Whitley, & Hesse, 1983; Sundberg, 1985).

In language treatment research conducted by both behavior analysts and SLPs, functional response classes that contradict the validity of structural categories have emerged when (a) what linguists consider separate grammatical structures did not need separate training and (b) what is considered a single grammatical structure broke down into different response classes that needed separate training. For instance, treatment research has demonstrated that such single linguistic categories as (a) the regular plural (Guess et al., 1968), (b) the irregular plural (Hegde & McConn, 1981; Hegde, Noll, & Pecora, 1979), (c) the regular past tense (Schumaker & Sherman, 1970), and (d) the irregular past tense (Hegde & McConn, 1981; Hegde, Noll, & Pecora, 1979) are each a collection of multiple and functionally independent response classes (see Hegde & Maul, 2006, for a summary of research). Independent response classes need separate training as there is no or socially accepted generalization across functional response classes. Treatment research also has demonstrated that such multiple linguistic categories as (a) the subject-noun and object-noun phrases (McReynolds & Engmann, 1974) and (b) the English auxiliary and copula (Hegde, 1980) are not functionally independent; teaching one of the two will instate the other, regardless of which one is taught. These two studies used the ABAB experimental design, which is more powerful than the typical teach-one-and-probe-the-other method; one of the two skills was taught, then reversed, and finally reinstated to show that the untaught skill is produced, reversed, and reinstated without training. Essentially, under a causal-experimental analysis, some linguistic distinctions collapse into single functional units and other single structural units break into multiple functional units.
Although more research on operant verbal response classes is needed, clinical research designed to remediate deficient language skills supports Skinner's (1957) figurative comment that when "language" is dropped to the floor, it may break into verbal operants, not linguistic structures.

Summary and Conclusions

Skinner's is primarily a cause-effect analysis of VB. His main concern was to describe the controlling variables of verbal operants. Accordingly, he described six kinds of stimulus control and five kinds of verbal operants: motivation (causing mands), environmental events (causing tacts), another speaker's speech (causing echoics), printed or other visual stimuli (causing textuals), one's own speech (causing intraverbals) and the audience effect (not a class of verbal operants, but producing an effect on other verbal operants). Skinner described a secondary process called autoclitics that included grammar and other related phenomena. He proposed that VBs are multiply determined and most everyday utterances are a combination of multiple verbal operants.

SLPs receive better training in the linguistic view of language than the behavioral view. The textbooks and other sources they read routinely distort the Skinnerian analysis and repeat Chomsky's misunderstood and misplaced criticism of Skinner's Verbal Behavior, even as Chomsky's theories have faded and Skinner's natural science approach and behavioral treatment have been gaining worldwide respect. The critics of the behavioral view have an ethical responsibility to first understand Skinner's analysis before making critical comments. SLPs have successfully used the behavioral intervention procedures because linguistics does not describe independent variables that are manipulated in what we call treatment (Hegde, 1998). If the SLPs also adopt a functional (cause-effect) analysis of VBs, they would then be internally more consistent with their concepts and treatment methods. Treatment research in child language disorders has generally supported Skinner's view that VB is not organized structurally, but functionally.

I conclude with a personal epilogue. Many years ago, a distinguished clinical scientist and a friend of mine, Leija McReynolds, then professor at the University of Kansas, was visiting me for a few days. After she gave a lecture at California State University-Fresno, and before she left for Stanford University, where she was spending her sabbatical year in the Linguistics Department, she asked me a sharp question in the campus parking lot: "Giri, do you think the influence of linguistics on us SLPs has been detrimental?" I said "Yes." She smiled gently as she got into my car, and I drove her to the airport. Professor McReynolds (along with Engmann) showed for the first time in SLP that when subjected to an experimental analysis using the single-subject ABAB research design, the grammatically distinct subject noun and object noun phrases collapse into a single functional unit.


Author Contact Information

M.N. Hegde, Ph.D.

California State University

Postal: 1948 Ashcroft Avenue, Clovis, CA 93611

