EIBT research after Lovaas (1987): a tale of two studies.
Abstract
Since the publication of Lovaas's (1987) seminal paper, serious questions have surfaced regarding design features that compromise the validity of treatment efficacy data resulting from studies of early intensive behavioral treatment (EIBT) for children with autism. Lovaas and his colleagues have acknowledged the legitimacy of some of these questions, and guidelines have emerged to improve the validity of future efficacy studies: (1) use random assignment of participants; (2) use uniform assessment protocols; (3) document enough methodological detail to support replication. Two recent studies are examined in reference to their compliance with these guidelines (Howard, Sparkman, Cohen, Green, & Stanislaw, 2005; Sallows & Graupner, 2005). Findings indicate that different levels of compliance result in different degrees of threats to internal validity.
Keywords: autism, autism spectrum disorders, early intensive behavioral treatment, Lovaas, random assignment, uniform assessment, replication.
Since the publication of Lovaas's (1987) seminal study, a growing body of research has been conducted to document the treatment efficacy of early intensive behavioral treatment (EIBT) for children with autism spectrum disorders (e.g., Birnbrauer & Leach, 1993; Harris, Handleman, Gordon, Kristoff, & Fuentes, 1991; McEachin, Smith, & Lovaas, 1993; Sheinkopf & Siegel, 1998). However, serious questions have been raised about the validity of much of this research. Lovaas's (1987) study in particular, which was designed to examine the efficacy of treatment offered at the UCLA Young Autism Project, has been subjected to considerable criticism (e.g., Gresham & MacMillan, 1997; Schopler, Short, & Mesibov, 1989). While much of this criticism was challenged by Lovaas and his colleagues, some of it was acknowledged by them to be valid. Moreover, these acknowledgements served as the basis for emphasizing three research guidelines (among others) that could improve the validity of follow-up treatment efficacy studies. Unfortunately, the application of these guidelines has not always been consistent.
The current paper summarizes those criticisms of the Lovaas (1987) study that Lovaas and his colleagues have acknowledged as valid, and it summarizes the research guidelines that evolved from these criticisms. Further, two recently published EIBT studies are reviewed and evaluated relative to their compliance with these guidelines. The first study was conducted by Howard, Sparkman, Cohen, Green, and Stanislaw (2005) and the second was conducted by Sallows and Graupner (2005).
Criticisms of the Lovaas (1987) Research
Criticisms of the Lovaas (1987) research have been addressed by Tristram Smith, a colleague of Lovaas's and Research Director of the Multi-Site Young Autism Project. Specifically, Smith (in Smith, Groen, & Wynn, 2000) noted two criticisms with which Lovaas and his associates reportedly concur:
First, assignment to groups was based on whether or not therapists were available to provide intensive treatment rather than on a more arbitrary procedure, such as the use of a random numbers table. Thus, assignment could have been biased [italics added]. Second, because children were referred to outside examiners, they received a variety of different tests rather than a uniform assessment protocol. Hence, assessment results may have been unreliable [italics added]. (p. 270)
Although Lovaas and his colleagues concurred with these criticisms, their concurrence was qualified. For example, they have reportedly been dubious about the importance of the second criticism. Nevertheless, to address these (and other) criticisms relative to future research, they emphasized "the need for replication to confirm the results" (Smith et al., 2000, p. 270). Also inherent in their responses are three guidelines to be addressed by follow-up treatment efficacy studies: (1) random assignment of participants to treatment conditions; (2) use of uniform assessment protocols across all participants; and (3) documentation of sufficient methodological detail to allow for independent replication. Unfortunately, published follow-up studies show mixed results with respect to these recommendations. In the next section, we examine two prominent, recently published EIBT studies to illustrate this point.
The Treatment Efficacy Study of Howard, Sparkman, Cohen, Green, and Stanislaw (2005)
Howard et al. (2005) studied 61 children diagnosed with either autistic disorder or pervasive developmental disorder-not otherwise specified (PDD-NOS). These participants were referred by nonprofit agencies ("regional centers") whose primary function is to meet the case management needs of people with developmental disabilities. To be eligible for the study, participants had to satisfy the following criteria: (a) receive a diagnosis of autism or PDD-NOS before the 4th birthday; (b) be exposed to English as the primary language at home; (c) be available to begin treatment before the age of four years; and (d) have received no more than 100 hours of treatment prior to participating in the study.
The children in Howard et al.'s (2005) study participated in one of three different, multicomponent treatment conditions. Each is summarized below:
Group #1: Intensive Behavior Analytic Intervention (IBT). In the IBT group, participants under 3 years of age received 1:1 intervention for 25 to 30 hours per week, while those over 3 years of age received 1:1 intervention for 35 to 40 hours per week. The IBT participants received treatment across multiple settings including school and home. Using discrete trial training, incidental teaching, as well as "other behavior analytic procedures" (Howard et al., p. 7), 50 to 100 trials per hour were presented. Further, parents were provided with training in fundamental behavior analytic strategies, maintenance and generalization data collection techniques, and methods for implementing their children's treatment programs "outside of regularly scheduled intervention hours" (p. 7). Parents were also required to attend meetings with agency staffers one to two times per month.
Group #2: Autism Educational Programming (AP). The AP participants received 25 to 30 hours per week of 1:1 or 1:2 interventions delivered in public school classrooms designated to serve students with autism. These participants received a range of interventions including activities derived from the TEACCH model, discrete trial training, Picture Exchange Communication System (PECS) training, and sensory integration therapy.
Group #3: Generic Educational Programming (GP). The GP participants received 15 hours per week of 1:6 interventions in special education preschool classrooms designated to serve either early intervention or communicatively handicapped students. Developmentally appropriate instructional activities were employed, emphasizing "exposure to language, play activities, and a variety of sensory experiences" (Howard et al., p. 8). In addition, a certified speech-language pathologist provided most of the participants with language therapy once or twice a week.
Standardized assessments targeting cognitive, nonverbal, receptive and expressive language, and adaptive skills were administered to the participants during intake and at follow-up (after approximately 14 months of treatment). At intake all three groups had "similar" (Howard et al., p. 11) mean scores on all but one measure. The only difference achieving statistical significance was in the domain of nonverbal skills. Moreover, for all three groups, the mean standard scores across most skill domains were considerably below 100. At follow-up, the differences in the mean scores of the participants in the AP and GP groups were not statistically significant. On the other hand, "the IBT group had higher mean scores in all domains than the other two groups combined; and those differences were statistically significant" (Howard et al., p. 11).
At follow-up, the IBT group's mean standard scores for the cognitive, nonverbal, communication, and motor skills domains were within normal range; the only domain in which the AP and GP groups scored in the normal range was motor skills. Thirteen IBT group participants exhibited gains in their IQ scores "from one standard deviation or more below average (i.e., IQ of 85 or lower) at intake to within one standard deviation of average or above (i.e., IQ of 86 or higher) at follow-up" (Howard et al., p. 11). Three other IBT participants whose intake IQ scores were in the normal range (i.e., 84, 89, and 97) exhibited follow-up gains (i.e., from 84 to 122, 89 to 114, and 97 to 102). At intake none of the AP participants exhibited IQ scores within the normal range; at follow-up, two exhibited IQs within the normal range. The IQ scores of 3 participants in the GP group changed from one (or more) standard deviations below average (at intake) to within normal range (at follow-up). Finally, two GP participants who exhibited intake IQ scores within the normal range displayed a decrease in their IQ scores at follow-up.
Compliance with Guideline #1: Random Assignment
The Howard et al. study did not follow the first guideline. Specifically, a quasi-experimental pretest-posttest, nonequivalent groups design was employed. Participants were assigned to the groups by their respective individual education plan (IEP) or individual family service plan (IFSP) teams where "parental preferences weighed heavily" (p. 6). More specifically, each child's team considered "a range of educational options" which included (but was not limited to) placement in one of the three groups.
The Howard et al. study is appropriately characterized as a nonequivalent group design because the participants were not randomly assigned to the three conditions. McGuigan (1997) has defined random assignment as "a procedure that assures that each member of a population or universe has an equal probability of being selected" (p. 89). According to Durso and Mellgren (1989), random assignment is the "most important" method of controlling extraneous variables and a "prerequisite for a true experiment" (p. 106). Similarly, Graziano and Raulin (2004) have described random assignment as "the most basic and single most important control procedure" (p. 207). Failure to randomly assign is a "basic weakness" (Kerlinger, 1973, p. 321) of nonequivalent group designs. Similarly, Tristram Smith (T. Smith, personal communication, July 25, 2005) has reported that one of the Howard et al. study's "limitations" is its use of nonrandom assignment. Thus, by failing to use random assignment, the design employed by Howard et al. is not considered a true experiment, but rather quasi-experimental (Cozby, 2001).
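To make the contrast with team-based placement concrete, the following is a minimal sketch of what random assignment looks like procedurally. It is an illustration only and is not the procedure of any study discussed here. The function name `randomly_assign`, the participant IDs, and the seed are hypothetical; only the sample size (61) and the condition labels (IBT, AP, GP) are taken from the text.

```python
import random

def randomly_assign(participant_ids, conditions, seed=None):
    """Shuffle participants, then deal them to conditions in rotation.

    The shuffle gives every participant the same probability as every
    other participant of landing in a given condition, and the rotation
    keeps group sizes within one of each other.
    """
    rng = random.Random(seed)          # seed is optional; fixed here for repeatability
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, pid in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(pid)
    return groups

# Hypothetical participant IDs 1..61; the seed value is arbitrary.
groups = randomly_assign(range(1, 62), ["IBT", "AP", "GP"], seed=2005)
sizes = {name: len(members) for name, members in groups.items()}
# Group sizes are IBT 21, AP 20, GP 20 (61 split as evenly as possible).
```

The point of the sketch is that nothing about a child or family (preferences, motivation, professional judgment) enters the assignment step, which is what licenses the presumption of equivalence discussed in this section.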
Howard et al. acknowledge that their use of nonrandom assignment constitutes a limitation of the study. However, they also assert that the three groups were "very similar" on "key" pretreatment, dependent measures, and that this is the "main purpose of random assignment" (p. 15). But is this its main purpose? According to Kerlinger (1973), random assignment is used to provide the rationale for assuming that groups are equal "in all characteristics" (p. 127; emphasis added), not just equal (or, worse yet, merely "similar") with respect to "key" (Howard et al., p. 15) or "pertinent" (Kerlinger, p. 321) dependent variables. When "randomization is not used ... it is not possible [italics added] to assume that the groups are equal" (Kerlinger, p. 322). For McGuigan (1997), "the great value of randomization is that it randomly distributes extraneous effects, whatever they may be, over the experimental and control conditions" (p. 90). When "we do not randomly assign participants to groups, ... we can expect confounds" (McGuigan, 1997, p. 90). Thus, lacking the presumption of equivalence, "we must consider the likelihood that alternative hypotheses may account for the results" (McBurney, 1998, p. 249). Two such alternative hypotheses shall now be considered.
Alternative hypothesis #1. In accounting for the results reported by Howard et al., alternative hypothesis #1 centers on how differentially motivated to help their children the participants' parents were across the three conditions. Remember that, according to Howard et al., the parental preferences regarding educational placement (i.e., regarding assignment to a particular treatment condition) "weighed heavily" (p. 6). Further, recall that in the IBT treatment condition "parents received training in basic behavior analytic strategies, assisted in the collection of maintenance and generalization data, implemented programs with their children outside of regularly scheduled intervention hours, and met with the agency staff 1-2 times a month" (p. 7). In the other two treatment conditions, no such comparable demands on the parents were identified. Thus, one plausible alternative hypothesis is that those parents who were willing to actively participate in their children's treatment were more motivated to help them change, and thus more likely to choose the IBT condition, while those less motivated were more likely to opt for the AP and GP conditions.
Graziano and Raulin (2004) offered a similar alternative hypothesis in their discussion of a hypothetical quasi-experiment involving "... hyperactive children" (pp. 224-225). In this hypothetical quasi-experiment, the researcher formed the groups by asking the parents whether or not they were willing to effectuate for their children a 4-week diet which excluded the food additives:
The children of those parents who were willing to expend the effort were put in the experimental group, and those who were not were put in the control group. The serious confounding in this procedure is that the experimental and the control groups are different in terms of parents' willingness to try the dietary restrictions. In other words, the dietary restriction treatment is confounded with parents' willingness to cooperate. We might assume that parents who are willing to do everything necessary to change their child's diet in hope of decreasing their child's hyperactivity may be more motivated to help their child to change. Any posttreatment differences between the groups on measures of hyperactivity might be due to either factor: dietary restriction or parental willingness to cooperate. (p. 225)
Similarly, in Howard et al.'s study, type of autism treatment (IBT, AP, GP) is confounded with parents' willingness to actively participate in treatment. Note that it is not the differing roles played by parents across the treatment conditions that is considered a confound; rather, it is differences in parental motivation that constitute the confound. Post-treatment differences between the IBT group and the two other conditions might be due to the IBT parents being more motivated than the parents in the other conditions to assist their children in changing. There are a number of plausible explanations why this increased motivation may serve as a confound. Perhaps the IBT parents, being more motivated, implemented the treatment program at times, and in locales, that exceeded what was required of them. Or perhaps the IBT parents were more motivated because their children had been more responsive to their earlier attempts to teach them (i.e., prior to their entry into the study), thus giving the parents some hope of success when eventually exposed to a highly structured training regimen.
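The logic of this confound can be illustrated with a small simulation. The sketch below is purely hypothetical: the sample size, the "motivation" score, its weight, and the noise level are invented for illustration and are not estimates from Howard et al. or any other study. In the simulated data the treatment has no effect at all, yet self-selection on motivation alone produces a sizeable gap between groups, while random assignment does not.

```python
import random
import statistics

def simulate_gap(n_children=1000, motivation_weight=5.0, seed=0):
    """Return (self_selected_gap, randomized_gap) in mean outcome gain.

    The simulated treatment has NO effect: each child's gain depends only
    on a hypothetical parental-motivation score plus random noise.
    """
    rng = random.Random(seed)
    motivation = [rng.gauss(0, 1) for _ in range(n_children)]
    gain = [motivation_weight * m + rng.gauss(0, 5) for m in motivation]

    # Self-selection: more motivated parents choose the demanding condition.
    chose_ibt = [m > 0 for m in motivation]
    # Random assignment: a coin flip that ignores motivation entirely.
    coin_ibt = [rng.random() < 0.5 for _ in range(n_children)]

    def gap(in_ibt):
        treated = [g for g, flag in zip(gain, in_ibt) if flag]
        others = [g for g, flag in zip(gain, in_ibt) if not flag]
        return statistics.mean(treated) - statistics.mean(others)

    return gap(chose_ibt), gap(coin_ibt)

self_selected_gap, randomized_gap = simulate_gap()
```

Under these assumptions the self-selected gap is large and the randomized gap is close to zero, which is the pattern the alternative hypothesis describes: a between-group difference that reflects who chose the condition, not what the condition did.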
Alternative hypothesis #2. This hypothesis concerns bias associated with the influence of the nonparental members of the IEP/IFSP teams, including the special education and case management professionals. As fiduciaries within a federally mandated special education process, these team members had both a legal and ethical responsibility to advocate assigning to the IBT group only those children for whom such a placement would be "appropriate" based on the child's IEP/IFSP. An "appropriate" placement is one which permits the child to "benefit educationally" from the instruction (Bateman & Linden, 1992/1998, pp. 143-144). So, although the parents' choice of treatment conditions may have, according to Howard et al., "weighed heavily," the other members of the team doubtless used their expertise to influence the eventual decision.
Indeed, during part of the time in which the Howard et al. study was conducted, representatives of Therapeutic Pathways (the service provider under whose auspices the Howard et al. study was conducted) were empowered to play a determinative role in the IEP/IFSP process, regardless of parental preferences. Specifically, the Howard et al. study was conducted "from 1996 through 2003" (Howard et al., p. 6). In their 1999 manual "In-home Programs for Young Children with Autism," Therapeutic Pathways required the parents to grant Therapeutic Pathways the ultimate decision-making power regarding the range and content of the treatment program, as well as eventual school placement¹. A reasonable assumption is that Therapeutic Pathways staffers used this power to place in the IBT condition those participants who, in their professional judgment, were more likely to benefit from the program, and to refer the other participants to the other conditions. If this is the case, then experimenter-based, biased assignment to groups provides an obvious alternative explanation of the results.
Consider also another example of the decisive role played by Therapeutic Pathways. As a "nonpublic" (i.e., private) agency, the service provider was free to refuse to treat any child who, in the judgment of its representatives, would not benefit from its program. Howard et al. do not tell us whether the service provider exercised this option or, if it did, what criteria were used. Obviously, if it refused to treat some children who it judged would not benefit from the program, then this, too, suggests bias.
Compliance with Guideline #2: Use of Uniform Assessment Protocol
Although not identified by Smith (T. Smith, personal communication, July 25, 2005) as a limitation of the study, a close reading of the published paper (Howard et al., pp. 8-10) indicates that the researchers failed to use a uniform assessment protocol during intake and follow-up, thus risking unreliable measurement. Consider these assessment issues in detail across the following domains: cognitive skills, nonverbal skills, receptive and expressive language, and adaptive skills.
Assessment of Cognitive skills. The participants' cognitive skills were assessed using a number of different instruments at intake. Specifically, 42 participants were assessed using the Bayley Scales of Infant Development-Revised; 10 participants were assessed using the Wechsler Primary Preschool Scales of Intelligence-Revised; 3 were assessed using the Developmental Profile-II; and 2 were assessed using the Stanford-Binet Intelligence Scale. Three additional instruments were used to assess each of three children, respectively: Differential Abilities Scale, Developmental Assessment of Young Children, and Psychoeducational Profile Revised.
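One rough way to see how far the intake cognitive assessments depart from a uniform protocol is to tally the instruments listed above. The sketch below uses only the counts given in the preceding paragraph; the "modal share" index (the fraction of assessed participants who received the most common instrument) and the function name `modal_share` are assumptions introduced here for illustration, not a measure used by Howard et al.

```python
# Intake cognitive instruments and participant counts, as listed above.
intake_cognitive = {
    "Bayley Scales of Infant Development-Revised": 42,
    "Wechsler Primary Preschool Scales of Intelligence-Revised": 10,
    "Developmental Profile-II": 3,
    "Stanford-Binet Intelligence Scale": 2,
    "Differential Abilities Scale": 1,
    "Developmental Assessment of Young Children": 1,
    "Psychoeducational Profile Revised": 1,
}

def modal_share(counts):
    """Fraction of assessed participants who took the most common instrument."""
    total = sum(counts.values())
    return max(counts.values()) / total

n_instruments = len(intake_cognitive)        # distinct instruments used
n_assessed = sum(intake_cognitive.values())  # participants accounted for
share = modal_share(intake_cognitive)        # 42 of the listed participants
```

Under a fully uniform protocol there would be one instrument and a modal share of 1.0. Here seven instruments appear, and the counts as listed account for 60 participants, so roughly 70% of them received the most common instrument.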
At follow-up, "the test used ... varied with the chronological ages of the child" (p. 9). A majority of the participants (i.e., 47 children) participated in cognitive assessments using an instrument that was not used at intake (i.e., the Wechsler Primary Preschool Scales of Intelligence-Revised). However, three instruments used during intake assessment of cognitive skills were also used at follow-up, albeit for a small number of participants. Specifically, 4 participants were assessed at follow-up using the Bayley Scales of Infant Development; 3 were assessed using the Stanford-Binet Intelligence Scale; and 2 were assessed using the Differential Abilities Scale.
Assessment of Nonverbal skills. At intake, the nonverbal skills of 48 participants were assessed using the Merrill-Palmer Scale of Mental Tests, and one participant was assessed using the Stanford-Binet Performance Test. At follow-up, 54 participants were assessed with the same instrument used during intake (i.e., the Merrill-Palmer Scale of Mental Tests). One participant was assessed at follow-up using the Leiter International Performance Scale-Revised, which was not used at intake.
Assessment of Receptive and Expressive Language. The Reynell Developmental Language Scales were used to assess the receptive and expressive language skills of 46 participants at intake. Other instruments used at intake included the Rossetti Infant-Toddler Language Scale (for 5 participants); the Receptive-Expressive Emergent Language Scales-Revised (for 3 participants), and the Preschool Language Scale (for 3 participants). One child was assessed with three instruments at intake, including the Toddler Developmental Assessment, the Peabody Picture Vocabulary Test-3rd Edition, the Expressive Vocabulary Test, and the Developmental Profile-II language scale. In the case of one client, there was no assessment of receptive and expressive language skills.
At follow-up, 47 participants were assessed using the same instrument that was used during intake (the Reynell Developmental Language Scales). Other instruments used at follow-up (only some of which had also been used during intake) included the Sequenced Inventory of Communication Development-Revised Edition (for 3 participants); Peabody Picture Vocabulary Test-3rd edition along with the Expressive Vocabulary Test (for 2 participants); Preschool Language Scales-3 (for 2 participants); and the Expressive One-Word Picture Vocabulary Test along with the Receptive One-Word Picture Vocabulary Test (for 1 participant). In the case of 6 participants, the receptive and expressive language skills were not measured at all during follow-up.
Assessment of Adaptive Skills. At intake, 54 participants were assessed using the Vineland Adaptive Behavior Scales. Other instruments used were: the personal adjustment or self-help scales of the Denver Developmental Screening Test II (3 participants), Developmental Profile-II (1 participant), and the Rockford Infant Development Evaluation Scales (1 participant). Two participants were not assessed at all during intake. At follow-up, 56 participants were assessed using the Vineland Adaptive Behavior Scales, and 6 participants received no assessment.
Compliance with Guideline #3: Replicability
According to Tristram Smith (T. Smith, personal communication, July 25, 2005), another weakness of the Howard et al. study is that it provides "limited information about the interventions." This weakness makes the study next to impossible to replicate. Similarly, Smith's third, remaining criticism (i.e., that there were "unclear procedures for rating the presence or absence of symptoms of autism") also implies that it would be difficult to attempt replication without greater clarity in this area (not to mention that this weakness raises the issue of possible biased sampling and biased assignment to groups).
Other Methodological Problems
According to Howard et al., follow-up assessments were conducted by examiners who were not blind to treatment condition assignments of the participants. This raises the obvious issue of examiner bias in favor of IBT, which is clearly a threat to internal validity. Howard et al. argue, however, that given the substantial number of different examiners reportedly used, it is "just as likely" (p. 15) that some assessors were biased against IBT as for IBT. Unfortunately, no evidence is offered to support this assertion.
Indeed, although the examiners are identified as "independent" of both the investigators and the treatment programs, it remains unclear how these follow-up examiners were funded and thus, how independent they actually were. As Howard et al. reported, one of the agencies funding the research was Valley Mountain Regional Center (VMRC). During the time frame in which the Howard et al. study was conducted, VMRC also contracted with independent vendors to do assessments (N. McGonigle, personal communication, June 14, 2006). Were any of these VMRC-funded vendors used to conduct any of the follow-up assessments in the Howard et al. study? If so, then at least some of these follow-up assessments were funded by the same agency that provided funding for the research (i.e., VMRC). This raises the issue of conflict of interest, thereby strengthening concerns about potential (presumably unintentional) bias on the part of the examiners.
An additional potential conflict of interest problem concerns Howard et al.'s third author, an individual who served as the Clinical Director of VMRC during the study's time frame. As the supervisor of some VMRC staffers involved in making placement decisions, what role, if any, did he play in influencing their decisions? There is clearly a conflict between (1) his role as a fiduciary with a responsibility to see to it that only those who are likely to benefit from the IBT treatment package are so assigned and (2) his role as researcher with a responsibility to avoid biased assignment to groups. How was this conflict in roles resolved? Howard et al. do not say.
The Treatment Efficacy Study of Sallows and Graupner (2005)
Sallows and Graupner (2005) studied 24 children with a diagnosis of autism. At intake these children met six criteria: (1) they ranged in age from 24 to 42 months; (2) they had a Mental Development Index "ratio estimate" (i.e., MA divided by CA) of 35 or more; (3) they were "neurologically within normal limits" (note, however, that children with abnormal EEGs or controlled seizures were accepted as determined by a pediatric neurologist, and no child was excluded based on this criterion); (4) a developmental diagnosis was established by "independent child psychiatrists" (p. 420); and (5) the diagnosis met the DSM-IV and Autism Diagnostic Interview-Revised criteria for autism. (A "trained examiner" administered both instruments.) In addition, (6) "there were no parental criteria for involvement beyond agreeing to the conditions in the informed consent document" (p. 420).
Each participant was assigned to one of two groups. The nature of each group is summarized below:
Group #1: Clinic-Directed. This group received treatment "replicating the parameters of the UCLA intensive behavioral treatment" (Sallows & Graupner, p. 420). Specifically, the group received "the treatment procedure and curriculum ... initially described by Lovaas (Lovaas et al., 1981) except that no aversives were used." Additional procedures, buttressed by subsequent research (e.g., Koegel & Koegel, 1995), were also employed (Sallows & Graupner, p. 422). During the first two years, participants received an average of 38 hours per week of direct treatment. Thereafter, as the children began school, the weekly direct treatment hours were gradually decreased. This group "received 6 to 10 hours per week of in-home supervision from a senior therapist and weekly consultation by the senior author or clinic supervisor" (p. 421). Further, "parents were instructed to attend weekly team meetings and were encouraged to extend the impact of treatment by practicing the newly learned material with their child throughout the day" (p. 420).
Group #2: Parent-Directed. This group received essentially the same treatment as Group #1, except that it was less intense. Specifically, "parents in the parent-directed group chose the number of weekly treatment hours provided by therapists" (p. 421). Thus, during the first two years, participants averaged 31.5 hours per week of direct treatment "with the exception that one family chose to have 14 hours both years" (p. 421). As with Group #1, direct treatment hours were then slowly decreased as the child entered school. Further, this group "received 6 hours per month of in-home supervision from a senior therapist (typically a 3-hour session every other week) and consultation every 2 months by the senior author or clinic supervisor" (p. 421). As with Group #1, parents were told to attend weekly team meetings and urged to practice their newly acquired skills throughout the day with their children.
Sallows and Graupner (2005, p. 417) reported that the "outcome after 4 years of treatment, including cognitive, language, adaptive, social, and academic measures, was similar for both groups." For example, on average, the full scale IQ for all participants showed a 25 point increase. Specifically, the authors noted that
Parent-directed children, who received 6 hours per month of supervision ... did about as well as clinic-directed children, although they received much less supervision. This was unexpected, and it may have been due in part to parent-directed parents taking on the senior therapist role, filling cancelled shifts themselves, actively targeting generalization, and pursuing teachers and neighbors to find peers for daily play dates with their children. Although many parent-directed parents initially made decisions regarding treatment that resulted in their children progressing slowly ..., many parents then sought input from treatment supervisors and rapidly learned to avoid making the same mistake twice, becoming quite skillful after a few months. (p. 433)
Compliance with Guideline #1: Random Assignment
Sallows and Graupner's study is a product of the Wisconsin Young Autism Project. As participants in Lovaas's Multi-Site Young Autism Project, these researchers "worked in collaboration with and observed the guidelines set by the National Institutes of Mental Health" (p. 419). Thus, in adherence to NIMH-approved research protocol, preschoolers diagnosed with autism were matched "on pretreatment IQ (Bayley MA divided by CA)" and then "randomly assigned by a UCLA statistician" to the clinic-directed group or the parent-directed group. In short, matched random assignment was used, thus satisfying the first guideline. Indeed, it is noteworthy that, while parents clearly had the option to drop out of the study if unhappy with their child's group assignment, "none dropped out upon learning of their group assignment, minimizing bias in selection of participants and group composition" (p. 420).
By randomly assigning participants to groups, this study avoided many of the problems associated with nonrandom assignment (see previous discussion). Further, in employing matched random assignment, the study achieved additional benefits. Random assignment is employed to make it more likely that "the difference between subjects that might affect the outcome of the experiment will be even, or averaged out" (Durso & Mellgren, 1989, p. 159). However, by chance, subjects with characteristics likely to strengthen post-treatment performance may still be disproportionately assigned to one condition over the other. Matching is recommended as a means of addressing this threat to internal validity under certain conditions. Specifically, whenever possible, the researcher should use matching if "there is a subject characteristic that is highly correlated with the dependent variable" (Durso & Mellgren, 1989, p. 162). Citing a number of studies (e.g., Bibby, Eikeseth, Martin, Mudford, & Reeves, 2002; Lovaas, 1987), Sallows and Graupner (2005) identified IQ as one of the "most commonly noted predictors" (p. 419) of post-treatment outcome. So, participants in the Sallows and Graupner study were first matched on pretreatment IQ and then randomly assigned to either the clinic-directed or parent-directed group, thereby bolstering the benefits achieved when only simple random assignment is used.
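The logic of matched random assignment can be illustrated with a short sketch. Sallows and Graupner do not describe the exact procedure followed by the UCLA statistician, so the code below is only one common way of implementing the idea (matched pairs), not a reconstruction of their method. The function name, the "iq" field, and the scores are hypothetical and serve only as illustration.

```python
import random


def matched_random_assignment(participants, key, seed=None):
    """Pair participants on a matching variable, then randomize within pairs.

    participants: list of dicts, each containing the matching variable `key`
    (for example, a pretreatment IQ estimate).
    Returns (group_a, group_b). If the number of participants is odd, the
    unpaired participant is assigned to one of the groups at random.
    """
    rng = random.Random(seed)
    # Rank participants so that neighbors have similar scores.
    ranked = sorted(participants, key=lambda p: p[key])
    group_a, group_b = [], []
    # Walk through the ranking two at a time; each adjacent pair is a match.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # chance alone decides who goes to which group
        group_a.append(pair[0])
        group_b.append(pair[1])
    if len(ranked) % 2 == 1:
        rng.choice([group_a, group_b]).append(ranked[-1])
    return group_a, group_b


if __name__ == "__main__":
    # Hypothetical pretreatment scores for 24 children (not data from the study).
    children = [{"id": i, "iq": 35 + 2 * i} for i in range(24)]
    clinic, parent = matched_random_assignment(children, "iq")
    mean = lambda group: sum(p["iq"] for p in group) / len(group)
    print("clinic-directed mean IQ:", mean(clinic))
    print("parent-directed mean IQ:", mean(parent))
```

Because each pair contributes one member to each group, the two groups cannot differ much on the matching variable, whereas simple random assignment with a small sample leaves that balance to chance.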
Compliance with Guideline #2: Uniform Assessment
During intake, pretreatment measures were taken of all participants, using five different instruments: (a) the Bayley Scales of Infant Development, Second Edition; (b) the Merrill-Palmer Scale of Mental Tests; (c) the Reynell Developmental Language Scales; (d) the Vineland Adaptive Behavior Scales; and (e) the Early Learning Measure (an experimental assessment tool developed by Smith, Buch, & Gamby, 2000). In addition, direct observation, reports of other professionals, and parent interviews were used to determine the developmental history, "supplemental treatments" history, and presence/absence of functional speech. In sum, a uniform assessment protocol appears to have been followed at intake, thus adhering to the second guideline. However, at follow-up, which occurred annually over a four-year period, this guideline was not strictly followed. With the exception of the Early Learning Measure (which was only given a second time, after several months of treatment), all of the pretreatment instruments were administered at follow-up to at least some of the participants. Instruments administered only at follow-up (i.e., not at intake) included the Wechsler Preschool and Primary Scale of Intelligence-Revised; the Wechsler Intelligence Scale for Children--WISC-III; the Leiter R; the Clinical Evaluation of Language Fundamentals, Third Edition; the Woodcock Johnson III Tests of Achievement; the Personality Inventory for Children; and the Child Behavior Checklist. Explaining the use of these other instruments, Sallows and Graupner reported that "as children grew older or became too advanced for the norms of pretreatment tests, we used other age-appropriate tests" (p. 421).
Compliance with Guideline #3: Replicability
The guideline calling for sufficient information to allow for replication was largely satisfied. The researchers not only employed a widely available, detailed set of instructional strategies (e.g., Lovaas et al., 1981), they also provided additional detail within the body of their paper. For example, they reported that no aversives were used, and that in addition to Lovaas et al. (1981), other treatment procedures inspired by more recent research (e.g., Koegel & Koegel, 1995) were also used. Additional treatment strategies used included conducting only two to three training trials at a time, and using continuous, immediate, and powerful reinforcement. "Between these brief (initially 30 seconds long) learning periods, staff members played with the children to keep the process more like play than work, generalize learned material into more natural settings, and continue to build social responsiveness" (see Sallows & Graupner, 2005, pp. 422-423 for a discussion of treatment strategies used).
Almost two decades have elapsed since the publication of Lovaas's (1987) original treatment efficacy study. During the first decade, many of the criticisms of that study appeared in print, along with subsequent responses to these criticisms by Lovaas and his colleagues. These responses acknowledged the legitimacy of some of the criticisms, thereby suggesting some minimal, necessary guidelines to be followed by follow-up research. In this paper, two recent treatment efficacy studies were described and evaluated in reference to the three guidelines. Results indicated that when these guidelines are followed, the results are better equipped to withstand criticisms; and when they are not followed, the results are more vulnerable to these critiques.
Utilizing a pretest-posttest nonequivalent groups design, the Howard, Sparkman, Cohen, Green, and Stanislaw (2005) study failed to demonstrate the superiority of early intensive behavioral treatment over that provided by special day classes in public schools. This failure was further exacerbated by Howard et al.'s use of a non-uniform assessment protocol (suggesting unreliability of measurement), as well as their failure to provide anything approaching adequate information about the details of each treatment condition, thus making replication impossible. In comparison, the Sallows and Graupner (2005) study utilized matched random assignment, an assessment protocol more closely approximating uniformity, and sufficient detail to allow for replication, thus advancing our knowledge of the efficacy of parent-directed early intensive behavioral treatment (EIBT) as a less intrusive, less costly alternative to clinic-directed EIBT.
Author's Note: Disclaimer: The views expressed here are not necessarily shared by my employer, the Stanislaus County Office of Education (Modesto, CA), nor by my fellow employees.
Bateman, B. D., & Linden, M. A. (1992/1998). Better IEPs: How to develop legally correct and educationally useful programs (3rd ed.). Longmont, CO: Sopris West.
Bibby, P., Eikeseth, S., Martin, N. T., Mudford, O. C., & Reeves, D. (2002). Progress and outcome for children with autism receiving parent-managed intensive interventions. Research in Developmental Disabilities, 23, 81-104.
Birnbrauer, J. S., & Leach, D. J. (1993). The Murdoch Early Intervention Program after two years. Behaviour Change, 10, 63-74.
Cozby, P. C. (2001). Methods in behavioral research (7th ed.). Mountain View, CA: Mayfield.
Durso, F. T., & Mellgren, R. L. (1989). Thinking about research: Methods and tactics of the behavioral scientist. St. Paul, MN: West Publishing.
Graziano, A. M., & Raulin, M. L. (2004). Research methods: A process of inquiry (5th ed.). Boston, MA: Pearson.
Gresham, F. M., & MacMillan, D. L. (1997). Autism recovery? An analysis and critique of the empirical evidence on the Early Intervention Project. Behavioral Disorders, 22, 185-201.
Harris, S., Handleman, J., Gordon, R., Kristoff, B., & Fuentes, F. (1991). Changes in cognitive and language functioning of preschool children with autism. Journal of Autism and Developmental Disorders, 21, 281-290.
Howard, J. S., Sparkman, C. R., Cohen, H. G., Green, G., & Stanislaw, H. (2005). A comparison of intensive behavior analytic and eclectic treatments for young children with autism. Research in Developmental Disabilities, 26, 359-383.
Kerlinger, F. N. (1973). Foundations of behavioral research (2nd ed.). NY: Holt, Rinehart and Winston.
Koegel, R. L., & Koegel, L. K. (1995). Teaching children with autism: Strategies for initiating positive interactions and improving learning opportunities. Baltimore: Brookes.
Lovaas, O. I. (1987). Behavioral treatment and normal educational and intellectual functioning in young autistic children. Journal of Consulting and Clinical Psychology, 55, 3-9.
Lovaas, O. I., Ackerman, A. B., Alexander, D., Firestone, P., Perkins, J., & Young, D. (1981). Teaching developmentally disabled children: The me book. Austin, TX: Pro-Ed.
Lovaas, O. I., Smith, T., & McEachin, J. J. (1989). Clarifying comments on the young autism study: Reply to Schopler, Short, and Mesibov. Journal of Consulting and Clinical Psychology, 57, 165-167.
McBurney, D. H. (1998). Research methods (4th ed.). Pacific Grove, CA: Brooks/Cole.
McEachin, J. J., Smith, T., & Lovaas, O. I. (1993). Long-term outcome of children with autism who received early intensive behavioral treatment. American Journal on Mental Retardation, 43, 589-595.
McGuigan, F. J. (1997). Experimental psychology: Methods of research (7th ed.). Upper Saddle River.
Sallows, G. O., & Graupner, T. D. (2005). Intensive behavioral treatment for children with autism: Four-year outcome and predictors. American Journal on Mental Retardation, 110, 417-438.
Smith, T., Groen, A. D., & Wynn, J. W. (2000). Randomized trial of intensive early intervention for children with pervasive developmental disorder. American Journal on Mental Retardation, 105, 269-285.
Smith, T., & Lovaas, O. I. (1997). The UCLA Young Autism Project: A reply to Gresham and MacMillan. Behavioral Disorders, 22, 202-218.
Smith, T., McEachin, J. J., & Lovaas, O. I. (1993). Comments on replication and evaluation of outcome. American Journal on Mental Retardation, 97, 385-391.
Therapeutic Pathways (June 1999, version 4). In-home programs for young children with autism manual.
(1.) While I am not a lawyer, surrendering broad, decision making power to one member of an IEP/IFSP team seems inconsistent with federal law. Specifically, I doubt that parents can voluntarily surrender their rights, or the rights of other IEP/IFSP members, given that all are considered equal partners in the IEP/IFSP process (see Bateman, B. D. & Linden, M. A., 1992/1998).
Author Contact Information:
Ted Schoneberger, P.O. Box 157, Turlock, CA 95381, Phone: (209) 556-5655, E-mail: TSberger@aol.com