An explicit technology of generalization.

The publication of the now classic article on generalization, "An Implicit Technology of Generalization" (Stokes & Baer, 1977), spurred interest in generalization as an active process rather than a passive process consisting primarily of a failure to discriminate between training and nontraining settings. Following their description of nine areas in which the extant behavioral research addressed generalization issues, a new interest in generalization of behavior change was born. More than a decade later, their description of categories of techniques that purportedly could be used to produce generalization was refined in "An Operant Pursuit of Generalization" (Stokes & Osnes, 1989). Stokes and Osnes described 12 generalization-promoting strategies that were classified within three broader areas. Their description assisted the field in continuing to focus interest on the fundamental need for the results of behavioral interventions to generalize effectively and to be durable and for behavioral research to actively address generalization. Now, more than a decade following the publication of "An Operant Pursuit of Generalization" and a quarter century after "An Implicit Technology of Generalization" was published, the time has arrived to address the status of generalization-promotion by behavior analysts, in both their conceptual and empirical investigations.


The publication of "An Implicit Technology of Generalization" (Stokes & Baer, 1977) resulted in a groundswell of interest in generalization as an active process that is important for behavior analysts to pursue directly to validate the effectiveness of behavioral programming. This classic article embedded in behavior analysis the realization that our work is functional not only when it produces immediate effects in the immediate environment that is targeted for change, but more importantly, when the effects are more widespread. Baer, Wolf, and Risley (1968) included generality of behavior change as one of the seven dimensions of applied behavior analysis, and concluded that, "in general, generalization should be programmed, rather than expected or lamented" (p. 97). Their description of generality is consistent with the description provided by Stokes and Baer: "A therapeutic behavioral change, to be effective, often (not always) must occur over time, persons, and settings, and the effects of the change sometimes should spread to a variety of related behaviors" (p. 350). While acknowledging that their conceptualization of generalization was not necessarily consistent with the traditional understanding and descriptions of the phenomenon, they proceeded to provide a description of generalization as "... the occurrence of relevant behavior under different, non-training conditions (i.e., across subjects, settings, people, behaviors, and/or time) without the scheduling of the same events in those conditions as had been scheduled in the training conditions" (Stokes & Baer, p. 350).
This description appeared to resonate positively within the behavior analytic community, as evidenced by the embracing of the nine categories of generalization outlined in the article: train and hope; sequential modification; introduce to natural maintaining contingencies; train sufficient exemplars; train loosely; use indiscriminable contingencies; program common stimuli; mediate generalization; train "to generalize". Importantly, not only did the article provide a rubric by which behavior analysts could organize their efforts to achieve broad and durable behavior change, it also provided the first exhaustive review of the behavioral literature with regard to the process of generalization.

Although it was a critically acclaimed seminal effort to organize behavior analysis around a conceptualization of generalization, the interest that was piqued following the publication of the article focused primarily on researchers beginning to note whether or not the effects of their work occurred in generalized circumstances. Absent from the new recording of the presence or absence of generalization effects was an accounting of the functional variables that were responsible when generalization was noted and the variables that were responsible when no generalization occurred. It is this recording that is critical in the advancement of the science of behavior. A functional approach is linked with scientific endeavors, and the analytic pursuit of the principles of effective generalization has been deemed an important activity for scientists in behavior analysis (e.g., Stokes, 1992).

In response to these problems, Stokes and Osnes (1989) provided "An Operant Pursuit of Generalization." Noting a need for researchers to "describe the dimensions of their analyses and the scope of their generalization assessment", they posed two critical questions: "Did the behavior occur in generalized circumstances, and what are the functional variables which account for that generalization?" (p. 339). Despite the recommendation of Baer et al. (1968, p. 97) that "in general, generalization should be programmed, rather than expected or lamented", Stokes and Baer (1977) noted that almost half of the applied literature on generalization focused on the "Train and Hope" category. Twelve years later and 21 years following Baer et al., Stokes and Osnes (1989) continued to express the need for behavior analysts to account for the functional variables responsible for generalization when it has been observed. Their refinement of the generalization-promoting categories centered on the basic principles of behavior, in contrast to the emphasis of Stokes and Baer on procedural aspects of treatment deserving careful attention. They proposed three categories of generalization promotion. The first category, exploit current functional contingencies, reflects the function of natural selection by the consequences of behavior. Train diversely, the second category, reflects the contribution of diversity in the exemplars of learning. The third category, incorporate functional mediators, "addresses the relationship between salient conditions of learning and the stimulus control exerted over behavior by environments related to original learning" (Stokes, 1995, p. 429).

Each of the three categories was discussed in terms of four subcategories:

A. Exploit Current Functional Contingencies:

1. Contact natural consequences.

2. Recruit natural consequences.

3. Modify maladaptive consequences.

4. Reinforce occurrences of generalization.

B. Train Diversely:

5. Use sufficient stimulus exemplars.

6. Use sufficient response exemplars.

7. Make antecedents less discriminable.

8. Make consequences less discriminable.

C. Incorporate Functional Mediators:

9. Incorporate common salient physical stimuli.

10. Incorporate common salient social stimuli.

11. Incorporate self-mediated physical stimuli.

12. Incorporate self-mediated verbal and covert stimuli.

It has been 25 years since Stokes and Baer articulated the need for generalization programming in great detail. It has been over 10 years since Stokes and Osnes refined the prior articulation and provided a template for addressing generalization within the work of both practitioners and researchers in behavior analysis. At this time, it is pertinent to address the state of the advancement of generalization programming in behavior analysis application and research today. Have we progressed past the Train and Hope stage of development as a field and advanced the science of human behavior by developing methods that empirically demonstrate a generalization-promoting function?


In an attempt to determine the state of generalization programming today as reflected in behavior analysis journals, a sample of journals was examined. The following journals were reviewed for the years 1990-2002: The Journal of Applied Behavior Analysis, Behavior Modification, the Journal of Positive Behavior Interventions, and The Behavior Analyst Today. This sample was selected because two of the journals are long-standing journals in the field (Journal of Applied Behavior Analysis and Behavior Modification). The Journal of Positive Behavior Interventions is a relatively new journal, first published in 1999. As such, it was selected because it is possible that its acceptance practices of research for publication might require more stringent examination of generalization variables than would journals that had been in existence prior to 1977 when "An Implicit Technology of Generalization" was published. The Behavior Analyst Today was selected for the same reason, and also because it is available in electronic format and is therefore capable of reaching a broad audience at minimal cost. Importantly, it emphasizes functionalism as well.

From these journals, articles were determined to have generalization foci if they contained any of the following features: the word "generalization" or "maintenance" in the title or in the descriptor words, if those were required by the journal; a statement in the abstract that generalization and/or maintenance was a goal of the research (or article, in the case of review or discussion articles); the presence of a condition to assess maintenance or a follow-up condition; the inclusion of generalization probes; or the use of a reversal design that allowed for assessment of durability of effects posttreatment. In total, 93 articles were identified as meeting these requirements. Four were review articles, one was a discussion article, and the remaining 88 articles were research articles. The articles were scrutinized for the following features: explicit attention to the generalization-promoting strategies of Stokes and Baer (1977) and/or the generalization-promoting principles of Stokes and Osnes (1989); research methods that were designed to control for generalization-promoting variables; the inclusion of explicit generalization probes; and the length of follow-up or maintenance conditions.

Review and Discussion Articles

Interestingly, all review articles focused on some type of social behavior. While not providing an extensive review of generalization per se, Singh, Deitz, Epstein, and Singh (1991) provided an analysis of intervention studies of the social behavior of students who were classified as seriously emotionally disturbed. They reviewed 28 studies from 10 journals, the majority of which were published after 1980 (N=25). They reported specifically on the studies that programmed for generalization and maintenance, and found that skill generalization, and generalization across settings and untrained persons were programmed for in 14 articles. However, no description of the type of programming was provided. Additionally, they reported separately about studies that assessed follow-up of intervention effects, although it was unclear what the difference between maintenance (assessed in five studies) and follow-up (assessed in 10 studies) was. The reported follow-up times of these studies were predominantly less than six weeks, with a range of two days to one year. Of the five studies that assessed maintenance, the maintenance condition was less than six weeks in duration. Additionally, they reported on "changes in collateral behaviors that occurred as a result of the programmed contingencies" (p. 83) and found that only two studies reported such effects. The authors stated that the lack of assessment of changes in nontargeted behaviors was a "serious omission" due to the primary aim of the studies to enhance social skills of seriously emotionally disturbed students. Singh et al. did not utilize the generalization-promoting categories or principles of either Stokes and Baer (1977) or Stokes and Osnes (1989), but did cite Stokes and Osnes (1986) as having supported the need for generalization, maintenance and follow-up in social skills training programs.

Chandler, Lubeck, and Fowler (1992) provided an extensive review of generalization and maintenance of preschool children's social skills. They reviewed 51 studies from 22 journals in behavior analysis and education that spanned the years 1976 to 1990. They analyzed the articles according to four categories: generalization dimension, generalization assessment design, behavior-change strategies, and generalization-promotion strategies. Additionally, they addressed the studies that produced the most successful (N=14 studies) and least successful (N=8 studies) generalization. They described the studies within the generalization-promoting strategies of Stokes and Baer (1977), and stated a continued need for researchers to explore the conditions controlling appropriate generalization to obtain information concerning functional variables that account for generalization, as suggested by Stokes and Osnes (1989). They found that four generalization-promoting strategies were combined most frequently: addressing functional target behaviors (exploiting current functional contingencies), specifying a fluency criterion (incorporating functional mediators), using indiscriminable contingencies (training diversely), and using mediation techniques (incorporating functional mediators). They concluded by stating a need to focus on questions of generalization in the next decade of preschool social skills research, a decade that is now at its end. At the end of that decade, Chandler and Dahlquist (2002) used predominantly the generalization-promoting strategies of Stokes and Baer to present a chapter on "Prevention Strategies and Strategies to Promote Generalization and Maintenance of Behavior" in their book on functional assessment in school settings. This represents a deliberate and laudable effort to guide practitioners toward the active programming of generalization.

Landrum and Lloyd (1992) reviewed social behavior research with students with emotional or behavioral disorders and examined specifically the extent to which generalization across time, settings, responses, and individuals was addressed explicitly in the studies. Reviewing journals in psychology and special education, they identified 12 studies that met their criteria for inclusion. They used the generalization-promoting strategies of Stokes and Baer (1977) to guide their analyses of the 12 articles, and discussed their results in terms of the reformulation of the categories suggested by Stokes and Osnes (1989). They found that the studies were relatively evenly divided across four strategies: four studies used train and hope; three studies each used teaching relevant behaviors and sequential modification (exploit current functional contingencies); and two studies used train sufficient exemplars (train diversely). Maintenance was assessed in seven of the 12 articles, and transfer across responses and across individuals was assessed in only two and five studies, respectively. Furthermore, they reported that these assessments appeared only incidentally or anecdotally. Of final interest here, only one of the 12 studies assessed all four forms of generalization, while six studies assessed two forms of generalization. As a result of their review, the authors recommend that generalization become a dependent variable in more research, as has been suggested since Baer et al. (1968).

Fox and McEvoy (1993) reviewed the assessment and enhancement of generalization and social validity of social-skills interventions with children and adolescents. They state the conclusion early in their article that it is necessary not only to assess but to enhance the generality of interventions for children and adolescents with deficits in social interaction. Issues surrounding the frequently interchangeable use of the terms "generalization" and "generality" were cited as problematic. The topographical definition of generalization used by Stokes and Baer (1977) causes concern due to the implication that the occurrence of generality of social behavior change may be sufficient instead of requiring an empirical demonstration that generalization occurred. They cite an additional problem with the interchangeable use of the terms "follow-up" and "maintenance" (an observation made by the authors of this article, as well). The confusion caused by the interchangeable use of these terms (among other terms in use, including durability and resistance to extinction) results in an inability to determine what the necessary conditions are that result in generalization. Accordingly, the authors recommend that "only through an intensive analysis of generalization and other environmental changes" may questions about the promotion of more general, durable behavior change be answered (p. 343). They proceed to discuss the selected articles that were reviewed along selected dimensions suggested by Stokes and Baer, while noting that other typologies exist (including Stokes & Osnes, 1989). The Stokes and Baer typology was chosen because "it is well-known, frequently referenced, and reasonably efficient in organizing specific generality programming procedures and their results" (p. 346). Their results were both encouraging and discouraging.
While they noted an increase in social skills training research that included generality procedures, an increase in the diversity of tactics used, and some behavior change across settings, responses, people, or time, failures to replicate effects across studies were apparent. Additionally, they reported that few studies used experimental designs that could determine empirically the relationship between the resultant generality and any particular programming procedure.

Tillman (2000) discussed generalization programming in the context of behavioral consultation and used selected generalization-promoting tactics from both Stokes and Baer (1977) and Stokes and Osnes (1989) to frame the discussion. While reporting early optimism that generalization of problem-solving and intervention skills resulted from consultation, he noted that only a handful of studies actively examined generalization. Unfortunately, none of these few studies showed that generalization resulted from school-based consultation. The discussion went on to suggest explanations for this dismal finding from the conceptualizations offered by Stokes and Baer (1977) and Stokes and Osnes (1989). Therefore, while concluding that no generalization appears to exist for school-based consultation activities as evidenced by the few studies that provided such investigations, there is a suggestion that the generalization frameworks proffered by Stokes and Baer (1977) and Stokes and Osnes (1989) can provide assistance in the creation of a consultation generalization program. He discusses this possibility in detail in the remainder of his article.

Research Articles

Eighty-eight research articles were identified that met the criteria for review. Several articles addressed both maintenance and generalization, and/or used both maintenance and follow-up terminology. To summarize, 38 articles used the word "generalization" or "maintenance" in the title or in the descriptor words and/or contained a statement in the abstract that generalization and/or maintenance was a goal of the research; 11 articles included a condition to assess maintenance; 29 articles did not discuss maintenance but included follow-up assessment of post-treatment effects; 16 articles specifically addressed generalization and included generalization probes in their design; and 13 articles used reversal designs that allowed for assessment of durability of effects post-treatment but did not discuss maintenance per se.

Articles that Used "Generalization" or "Maintenance" in Titles, Descriptors, or Abstracts

Forty-three percent of the articles (N=38) used the terms "generalization" or "maintenance" explicitly in their titles, descriptors, and/or abstracts. Of these, 30 addressed generalization, and eight addressed maintenance. Approximately 47% of the generalization research (N=14 articles) addressed communication or verbal behavior (e.g., Drasgow, Halle, & Ostrosky, 1998; Hughes, Harmer, Killian, & Niarhos, 1995; Krantz & McClannahan, 1998; Serna, Schumaker, Sherman, & Sheldon, 1991; Stewart, Van Houten, & Van Houten, 1992). Fifty-three percent (N=16 articles) addressed nonverbal behavior (e.g., self-injurious behavior [Lalli, Mace, Livezey, & Kates, 1998]; appropriate play by preschoolers [Ward & Stare, 1990]; self-assessment and recruitment of teacher praise by preschoolers [Connell, Carta, & Baer, 1993]).

The bulk of the generalization research provided some overt generalization programming in its procedures. Craft, Alber, and Heward (1998) manipulated the reinforcement schedule by training initially using continuous reinforcement and then fading to intermittent reinforcement in the latter half of their generalization programming condition (exploit current functional contingencies). By introducing generalization programming in multiple baseline fashion, they were able to conclude that the generalization programming condition was responsible for improvements in students' use of methods to recruit teacher praise. Following cessation of all programming, use of the recruiting strategies maintained for five sessions for all four participants who were developmentally disabled. Halle and Holt (1991) controlled for generalization by using a multielement probe design to systematically manipulate the introduction of various stimuli into the training setting with four young adults with moderate mental retardation (train diversely). Their results clearly show that paired-stimulus probing vs. single-stimulus probing resulted in the exhibition of the target behavior, saying "please."

Several studies involved peers in training with individuals who exhibited low levels of social responses, therefore incorporating functional mediators in the design of their studies. For example, Pierce and Schreibman (1997) used this approach to increase the social behaviors of two children with autism. They introduced the peers in multiple baseline fashion thereby demonstrating that the presence of the peer was responsible for increases in the appropriate responding by the target children. Following training, the target children exhibited increased social behaviors in nontraining settings with novel peers. The authors propose that the use of pivotal response training (PRT) constituted the use of "loose training", and may have been responsible for the improvements. Thiemann and Goldstein (2001) also utilized peers in a study to investigate the effects of written text and pictorial cuing with video feedback on the social behaviors of five students with autism. Their use of a multiple baseline design demonstrated that the treatment package was responsible for improvements in four behaviors for each participant. Unfortunately, it was not possible to distinguish the role of the peers from that of the other training variables (i.e., pictorial cuing, video feedback) because all variables were introduced as a package.

Several studies that focused on improving various nonverbal behaviors were designed to control for generalization-promoting variables. Shore, Iwata, Lerman, and Shirley (1994) used diverse training and systematically varied three stimulus parameters (therapist, setting, and demands) to result in varying levels of generalization on novel probes with three participants who exhibited self-injurious behaviors. Unfortunately, the idiosyncratic nature of the generalized responding was troublesome, and precluded drawing firm conclusions about the effectiveness of the use of the systematic varying of the stimulus parameters. However, the investigation provides an example of a study that was designed to control for generalization-promoting variables. Connell et al. (1993) also reported variable levels of generalization in their well-designed study to program generalization of students' transition skills in classroom settings. The use of a multiple baseline design to explore the effects of self-assessment and self-assessment plus recruitment of teacher praise (exploiting current functional contingencies and incorporating functional mediators) allowed for clear examination of generalized effects from the training setting to the classroom.

Ducharme and Holborn (1997) included generalization-promoting procedures in the design of their study that examined social skills of young children with hearing impairments. Following an intervention condition that included multiple training components, they implemented a second intervention condition that overlaid additional teachers, peers, and materials (sufficient stimulus exemplars) and fading of teacher praise (contacting natural consequences) in a dissimilar room. By using a multiple baseline design to introduce the three conditions (ABC), they were able to conclude that the generalization-promoting strategies resulted in large increases in social interactions in the generalization setting. However, a limitation of the study is the presence of training in one setting while generalization assessment is occurring in the novel setting. Unfortunately, this resulted in an inability to determine "pure generalization" (generalization with no training procedures in effect in any setting) to the novel setting. Neef, Lensbower, Hockersmith, DePalma, and Gray (1990) provided a clear investigation of the generalization-promoting functions of multiple training exemplars in their study that taught appropriate use of appliances (washers and dryers) to four adults with mental retardation. By using a counterbalanced design that included two types of instruction and probes with untrained appliances, they were able to clearly determine that more generalization errors were present when a broad range of training exemplars was used and not when simulated versus natural training stimuli were used.

Other studies that focused on nonverbal behaviors were not designed to control for generalization-promoting variables but included generalization programming in their procedures, showing that researchers are cognizant of the need to address generalization actively. For example, Donnelly and Olczak (1990) investigated the effect of differential reinforcement of incompatible behaviors (DRI) (exploiting current functional contingencies) to reduce cigarette pica in two adults with intellectual disabilities. A reversal design was used to show experimental control, and results show clearly that pica behavior decreased when the DRI schedule was in effect and increased when no DRI schedule was present. They included a generalization condition in which other staff members used the DRI schedule with the participants, and reduced levels of cigarette pica maintained while the DRI was in effect. Koegel and Koegel (1990) faded the trainer away from the four students with autism after training them to criterion on self-management procedures (exploiting current functional contingencies and incorporating functional mediators). The participants' stereotypic behaviors maintained at reduced levels when the trainer was faded after they had been trained to use the self-management procedures.

Articles that Addressed Maintenance or Follow-up

Research that addressed maintenance is classified into four categories: research that explicitly investigated variables that resulted in maintenance of intervention effects; research that assessed presence or absence of maintenance post-intervention; research that included follow-up conditions to assess durability of intervention effects; and intervention research that did not address maintenance but utilized reversal designs that allowed for examination of durability of intervention effects. As Fox and McEvoy (1993) pointed out, the use of both "maintenance" and "follow-up" is distracting because it is not possible to discern the difference between the two conditions. Regardless of which term is used, it appears that the authors use both terms to mean that intervention effects are present after the intervention is withdrawn. Therefore, both categories will be grouped together for the purposes of this discussion.

Only eight articles (9%) explicitly used the term "maintenance" in their titles. Of these, five addressed nonverbal behavior (e.g., sorting by children with autism [Dozier et al., 2001]; performance on a reading task [Daly, Martens, Kilmer, & Massie, 1996]), and the remaining three addressed functional communication training (FCT; Durand & Carr, 1992; Shirley, Iwata, Kahng, Mazaleski, & Lerman, 1997; Derby et al., 1997). Additionally, four articles included maintenance assessments in their designs, but did not describe these in their titles or abstracts (increasing employment productivity by adults with mental retardation [Christian & Poling, 1997]; decreasing sleep disorders among young children [Durand & Mindell, 1990]; using spousal feedback with parents of children with autism [Harris, Peterson, Filliben, Glassberg, & Favell, 1998]; increasing teacher use of interventions [Witt, Noell, LaFleur, & Mortenson, 1997]).

The bulk of the research that addressed maintenance and follow-up provided assessments of intervention effects after intervention withdrawal instead of designing the investigations to enhance maintenance. Only four investigations actively programmed for maintenance, and most used the strategy of exploiting current functional contingencies. Altus, Welsh, and Miller (1991) provided an excellent example of an investigation designed for maintenance. By transferring responsibility for provision of positive feedback to members of a student housing cooperative from the researchers to members of the cooperative (exploit current functional contingencies), they demonstrated long-term maintenance of completion of tasks by cooperative members. The investigation began in 1985 and was active through 1986, with follow-up in 1987 and again in 1991. All follow-up assessments showed that task completion remained high, with some decrease noted in the 1991 data. Dozier et al. (2001) utilized fixed-time schedules of reinforcement to maintain the performance of two young children with autism on manipulative tasks (exploit current functional contingencies). Variable-ratio and three fixed-time schedules were introduced using multielement and reversal designs. Results suggested that previously acquired responses were maintained using thin, dense, and yoked FT schedules, although there was variability across participants, so results should be interpreted with caution. Similarly, Lerman, Iwata, and Shore (1996) demonstrated maintenance of reduced levels of SIB during extinction conditions when intermittent reinforcement was available prior to extinction with adults with mental retardation. Finally, the participants in the investigation of Bennett and Cavanaugh (1998) used self-correction procedures on multiplication tasks to assist in the maintenance of improved responding (incorporate functional mediators).
Their findings indicated that immediate self-correction was more effective than delayed or no self-correction procedures in producing appropriate performance and in maintaining performance following instruction.

Encouragingly, 60% of the articles (N=53) that addressed generalization and maintenance contained follow-up conditions. This suggests that behavior analysts have begun to address seriously the need to assess durability of treatment effects. The length of these conditions was highly variable, ranging from one session at the shortest to one year at the longest. A notable exception is the study of Altus et al. (1991), described previously. This range was noted among the research that addressed verbal behavior issues. Among the research that was implemented in school settings, follow-up was conducted from two sessions to six months. A wide variety of research was conducted in varying settings with various target behaviors, including community, home, hospital, playground, and laboratory settings. The follow-up conditions ranged from 10 days [teaching playground safety skills to elementary school children (Heck, Collins, & Peterson, 2001)] to 10 months [teaching independent living skills to children and young men with visual impairments (Taras, Matson, & Felps, 1993)] in these studies.

The final group of articles reviewed used reversal designs to demonstrate experimental control of intervention procedures. Inherent in the use of reversal designs for this purpose is a problem: while the reversal design shows experimental control from a scientific standpoint, from a practitioner's standpoint it is deleterious for intervention effects to reverse (Miltenberger, 2001). For the purposes of the present discussion, the use of the reversal design allowed examination of the durability of intervention effects after the intervention's withdrawal. Fifteen percent of the articles (N=13) used reversal designs to demonstrate experimental control in their intervention research, and all were effective in doing so. Therefore, 100% of the intervention research that utilized reversal designs showed experimental control and failed to show durability of intervention effects when intervention was withdrawn. This may suggest that behavior analytic researchers who investigate interventions are caught in a dilemma: if they use the reversal design to demonstrate experimental control and are successful, the research is successful from a scientific standpoint. However, from an applied perspective, the reversal of intervention effects following the withdrawal of the intervention is a disappointment. The intervention areas targeted in these studies included maladaptive behaviors of youth with attention deficit hyperactivity disorder (Reitman, Hupp, O'Callaghan, Gulley, & Northup, 2001), eye poking (Smith, Russo, & Le, 1999), inappropriate verbal behavior of heroin addicts (Petry et al., 1998), wandering by persons with dementia (Heard & Watson, 1999), automobile safety belt use when leaving supermarkets (Engerman, Austin, & Bailey, 1997), sleep problems of a toddler (Ashbaugh & Peck, 1998), rapid eating by a young woman with developmental disabilities (Wright & Vollmer, 2002), and food selectivity (Dixon, Benedict, & Larson, 2001). 
Inspection of these studies reveals that all investigations manipulated highly discriminable interventions, i.e., presence/absence of a token economy, presence/absence of stickers, presence/absence of prompts, access or lack of access to leisure activities, presence/absence of DRL or DRA procedures. In other words, it was readily discriminable to the studies' participants when interventions were active and when they were not. While demonstrating the effectiveness of the interventions, these studies may have inadvertently demonstrated that the withdrawal of highly discriminable interventions results in a loss of intervention effects. Consistent with the generalization-promoting strategy of Stokes and Osnes (1989), it is plausible that further investigations that manipulate the discriminability of interventions of these types should attempt to demonstrate a generalization-promotion function in addition to demonstrating the effectiveness of the interventions in the immediate time frame.


This paper embarked on an effort to provide at least a partial answer to the question posed earlier: Have we progressed past the Train and Hope stage of development as a field and advanced the science of human behavior by developing methods that empirically demonstrate a generalization-promoting function? The answer appears to be mixed. On the encouraging side, researchers who are investigating interventions are more often than not including assessments of maintenance in their investigations. Unfortunately, on the discouraging side, researchers are continuing to investigate highly discriminable interventions that fail to demonstrate durability after their withdrawal while demonstrating excellent experimental control and satisfying the scientific process. If researchers carried their work a step further and included an additional condition to decrease the discriminability of the intervention in an effort to promote maintenance, both the practitioner and the scientific audiences could be satisfied. The current status of generalization research, whether designed to control for generalization-enhancing variables or to establish the durability of the procedures, suggests that generalization continues to be an elusive entity. When obtained, it appears to require much effort. For researchers to demonstrate a functional relationship between procedures and generalization, much effort is required in the design and implementation of the research. For practitioners to design interventions that result in generalization, more effort is required than to simply demonstrate the immediate effectiveness of the procedures. Such required effort may discourage both researchers and practitioners from delving deeply into the somewhat gray area of generalization-promotion. However, it is precisely this increased effort that is necessary in order for behavior analysis to show the generality of the outcomes of its labors. 
If such generality fails to be demonstrated, it may be necessary for behavior analysis to "throw in the towel" and acknowledge that our procedures are very effective at producing behavior change but need to be utilized ad infinitum because long-lasting and widespread behavior change is a highly obscure commodity.

Conversely, on the encouraging side, there are at least a dozen examples of research presented here that were designed solely to demonstrate the functional relationship between training variables and generalization. It is important to remember, also, that the literature reviewed here is from only a few journals. It seems safe to assume that a broader literature review would yield even more reason for optimism. Each investigation that controls for generalization variables can and should be considered a model for other investigators to use. A systematic method for accessing and utilizing the extant data base on generalization-promotion may be helpful in increasing the frequency of research in the area. If you will, imagine behavior analysts being able to access the currently imaginary Journal of Generalization-Promotion, which would serve as a central receiving point for research and interventions that focus on this critical area.

Another optimistic result of the present review was the extent to which the authors of the research drew proficiently on Stokes and Baer (1977) and Stokes and Osnes (1989) in their discussions. It appears obvious that the categories provided by Stokes and Baer have demonstrated maintenance, and that, in and of itself, constitutes one level of effective intervention. Acknowledging the need for generalization promotion is now a well-entrenched part of behavior analysis that has resulted in a growing data base across diverse areas of the field. Investigators are describing their efforts in terms of the generalization-promoting categories, and the categories appear to be driving generalization research. In short, it appears that the categories are becoming increasingly functional, an outcome that hopefully would please Baer et al. (1968). In that respect, it could be concluded that they are becoming more explicit than implicit, with Train and Hope more a historical artifact than a present-day albatross.

However, lest we become too confident that we are making adequate strides in the area of generalization promotion, let us remember that the conceptualization continues to be stronger than the empirical base that supports it. To continue to advance our efforts in this critical area, each behavior analyst should assume responsibility to "raise the bar" and plan no empirical investigations and interventions without generalization promotion as part of the research and intervention plan. Accomplishing the most generalized effects in the least intrusive manner while subjecting the endeavor to a rigorous scientific process may best ensure that our efforts remain true to the field's tenets of empiricism and parsimony. In this manner, an explicit technology of generalization may have the best opportunity to continue along its current healthy, albeit slow, course of development.


Pamela G. Osnes & Tara Lieblein

University of South Florida
