Trends in the assessment of infants and toddlers with disabilities.

The Education of the Handicapped Act Amendments of 1986 (Public Law 99-457) added a formula grant program to assist states in establishing comprehensive community services to infants and toddlers with disabilities and their families (Federal Register, 1989). This law requires a timely, comprehensive, multidisciplinary evaluation, including assessment activities related to the child and the child's family. A few states were already providing educational and ancillary services to infants with disabilities, beginning at birth, but the assistance provided through this law greatly expands services offered in all states. Professionals and parents in each state are planning and piloting programs in an effort to identify the best practices for adoption and implementation in their state. This article reviews the assessment and evaluation of infants and toddlers with disabilities, or at risk for developing disabilities, as defined in P.L. 99-457; notes some of the recent contributions to supplement our knowledge of assessment of infants and toddlers; reviews current practices; and suggests future trends.


P.L. 99-457

P.L. 99-457 defines and identifies the component parts of evaluation and assessment as follows.

Evaluation refers to the procedures used by appropriate qualified personnel to determine a child's initial and continuing eligibility for services as a child with a developmental delay. Each state is required to define the term developmental delay regarding the levels of functioning or other criteria and the procedures the state will use to determine the existence of a delay in each of the following areas: cognitive development; physical development, including vision and hearing; language and speech development; psychosocial development; and self-help skills.

Assessment refers to ongoing procedures used throughout the period of a child's eligibility to identify: (a) the child's unique needs; (b) the family's strengths and needs related to development of the child; and (c) the nature and extent of early intervention services that are needed by the child and the child's family to meet the needs in the evaluation process described previously.

The law further specifies the following characteristics of the evaluation and assessment of the child. It must:

(1) Be conducted by personnel trained to utilize appropriate methods and procedures;

(2) Be based on informed clinical opinion; and

(3) Include the following:

(i) A review of pertinent records related to the child's current health status and medical history.

(ii) An evaluation of the child's level of functioning in each of the following developmental areas:

(a) Cognitive development.

(b) Physical development, including vision and hearing.

(c) Language and speech development.

(d) Psychosocial development.

(e) Self-help skills. (Federal Register, 1989, p. 26320)

The assessment of the child must determine his or her unique needs in terms of these domains, including the identification of services appropriate to meet these needs. Another critical element of the law is the 45-day timeline during which the evaluation and initial assessment of each child must be completed. In addition, any family assessment must be completed within the same timeline and according to the criteria specified in the law. Because family assessment is a new component and deserves in-depth attention, it will not be addressed here.

P.L. 99-457 is very clear in its mandate of the multidisciplinary team for conducting the evaluation and assessment process. Further, the law lists and defines some of these participants. Participants will differ based on the needs of a child and family. What is not as well defined is how assessment personnel will work together as a team. It is expected that states will address this in their plans. In general, team members will be expected to represent their disciplines and report their findings to other members of the team, including the family representatives. On some multidisciplinary teams, the members work with greater awareness of what other members do, preferring to work from an interdisciplinary or transdisciplinary perspective (see Fewell, 1983; Rossetti, 1990 for reviews of these models).

The law is quite flexible concerning where assessment can occur. This can be viewed positively; the law allows teams to test children in environments selected with the child and the family in mind. For example, one child might be assessed in a hospital; a second child, at home; and others, in a therapist's office or a day-care center. Certain requirements must be met in the administration of some tests to declare results valid; these requirements may affect the setting of particular assessments.

The law (Federal Register, 1989) is far more specific regarding the nondiscriminatory assessment procedures that must be followed:

* All tests and other procedures must be administered in the child's or the parent's native language or their preferred mode of communication.

* Test procedures and materials must not be culturally or racially discriminatory.

* More than one procedure must be used for determining a child's eligibility.

* Assessments must be completed by qualified personnel.

In the final regulations, a note was added that the evaluation-assessment process be broad enough to obtain information required in the Individualized Family Service Plan (IFSP) concerning (a) family strengths and needs related to the development of the child, and (b) the child's functioning level in each of the five developmental areas noted in Section 677(d)(1) of the law.


In the past 2 years, books on infant and toddler assessment and related themes have proliferated (Bailey & Wolery, 1989; Gibbs & Teti, 1990; Rossetti, 1990; Widerstrom, Mowder, & Sandall, 1991). These books are helpful in two important ways. First, they provide comprehensive reviews of tests and procedural considerations for the assessment of infants and toddlers. This material is particularly useful for multidisciplinary team members who were trained to assess school-aged students and have limited experience with infants, toddlers, or children with disabilities. Second, these texts include information on new trends and experimental procedures, many of which have not been widely publicized and are not accessible through the usual channel of test publishing houses.

Some important cautions appear in a book edited by Zelazo and Barr (1989), Challenges to Developmental Paradigms: Implications for Theory, Assessment and Treatment. Peter Wolff's chapter, "The Concept of Development: How Does It Constrain Assessment and Therapy?" is particularly timely, given the increased referrals for assessment and services resulting from the inclusion of "at-risk" young children in P.L. 99-457. Wolff has addressed two serious concerns. First, he expressed concern over early intervention practitioners' unquestioned acceptance of popular practices that may be diametrically opposed to theories they espouse regarding child development and assessment. He suggested that practitioners would do well to involve themselves in a theoretically informed analysis of their activities to ensure that practices are congruent with intentions. Wolff's second concern was practitioners' quickness to refer and serve at-risk or what he called "marginally" handicapped children. He cited evidence from biological and behavioral developmental sciences suggesting that it may be impossible to predict subtle variation in normal and deviate behavioral development. This provides further reasons for intervenors to thoroughly document what is done and analyze their findings from a theoretical perspective. Wolff concludes "if we reject such a remedy, it may be preferable to dedicate our overstrained resources to the assessment of, and intervention in, developmental handicaps that qualify as elementary situations or 'intermediate' developmental handicaps, and to treat marginal handicaps only after they have been documented. In the final analysis, however, the decision rests with an informed and critical public and a self-critical profession as to which developmental variations constitute real handicaps requiring early intervention; and which handicaps are the products of our therapeutic zeal and theoretically uninformed imagination" (p. 26). 
Wolff's cautions, despite the promises, pleas, and platitudes of parents and professionals who believe infant intervention is the most effective means for preventing later developmental problems, deserve consideration as practitioners move forward with P.L. 99-457 and seek ways to optimally use limited resources to improve outcomes for children with disabilities and their families.


Before selecting a test or practice in the assessment of the domains that are required in the law, one should review some critical factors in test selection. First, examiners should know the reasons for a child's assessment. A test selected to determine a child's eligibility for services may not be appropriate to determine strengths or needs; and the test may be inadequate for developing desired outcomes, objectives, or instructional strategies on IFSPs. Second, examiners must select tests that are appropriate for the child, taking into account his or her disabilities. Finally, practitioners should ensure that the tests are administered by qualified personnel in a manner that is consistent with the test requirements if results are to be used for decision making and measuring performance over time.

In briefly highlighting some of the current tests and practices in the assessment of infants and toddlers, notations will be limited to a few standardized tests, criterion-referenced measures, and some observation procedures. Important aspects of family assessment (identifying needs, priorities, and resources) and assessment of interactions of infants and toddlers with family members, will not be addressed in this article because of space limitations. Assessments in these areas are very appropriate for consideration in IFSPs. This article will be limited almost entirely to issues of child development and learning. For a more comprehensive review of the literature, see the texts listed previously.

Cognitive Assessment

The most widely used nationally standardized test of cognitive ability for infants and toddlers is the Mental Scale of the Bayley Scales of Infant Development (BSID) (Bayley, 1969). This test was constructed using traditional psychometric techniques in which items are selected for inclusion in the list based only on their ability to reflect an increase in percentage of children responding correctly to items as their age increases. Thus, there is no logical sequencing of items, nor does the test reflect an underlying theory of intelligence (Garwood, 1982). Researchers have reported Mental Development Indexes exceeding the norms when the BSID has been given to at-risk infants (Campbell et al., 1986). These data have serious implications if program specialists are reporting BSID scores to parents or are using them in program evaluation.

Cognitively oriented programs will find the Ordinal Scales of Psychological Development (Uzgiris & Hunt, 1975) to be an assessment that is more consistent with their program orientation. In ordinal scales, items are logically sequenced and related to previous and subsequent items within a particular scale. There is considerable evidence that ordinal scales are appropriate for children with disabilities (Dunst, 1980; Kahn, 1976).

Both of the preceding tests are limited to children whose skills are typical of children in the sensorimotor period of development, or just a few months beyond in the case of the BSID. Two respected tests begin at the 2.5-year level (Kaufman Assessment Battery for Children [Kaufman & Kaufman, 1983]; McCarthy Scales of Children's Abilities [McCarthy, 1972]), but these tests should not be used for children whose scores are likely to fall at the lower end of the age range. For this reason, some examiners have relied on two older tests, the Infant Intelligence Scale (Cattell, 1960) and the Merrill-Palmer Test of Mental Abilities (Stutsman, 1948) with appropriate age ranges (e.g., the Merrill-Palmer range is 18 months to 6 years). Another strength of the Merrill-Palmer is the dependency on highly attractive manipulatives and low language demands.

Motor Assessment

In addition to the psychomotor scale of the BSID, the Peabody Developmental Motor Scales (Folio & Fewell, 1983) are standardized scales of gross and fine motor skills covering the birth-to-7-year range. These scales have been widely used for more than a decade. They are nationally standardized, and they have an accompanying curriculum, which is rare for such tests.

A more recent test, the Movement Assessment of Infants (Chandler, Andrews, & Swanson, 1981), is a neurodevelopmental tool that examines four components of movement during the first year of life: tone, primitive reflexes, automatic reactions, and volitional movement. This criterion-referenced measure would meet needs for assessment for programming, but would be insufficient for diagnosis and evaluation of movement disabilities.

Language and Speech Assessment

It is appropriate to view communication, language, and speech as separate skills when children are in the early months and years of life. For this reason, tests have been developed that focus on these skills individually and jointly. Among the scales that have some standardization data are the Sequenced Inventory of Communicative Development, Revised (Hedrick, Prather, & Tobin, 1984), the Receptive Expressive Emergent Language Scale (Bzoch & League, 1978), and the Preschool Language Scale (Zimmerman, Steiner, & Pond, 1979).

In recent years, practitioners have favored more naturalistic forms of language assessment, particularly the use of language sampling and analysis (see Miller, 1981). A recently published, semistructured communication assessment is Assessing Prelinguistic and Early Linguistic Behaviors in Developmentally Young Children (Olswang, Stoel-Gammon, Coggins, & Carpenter, 1987).

Psychosocial Assessment

Psychosocial assessment includes social, emotional, behavioral state, adaptive, and, sometimes, play skills. Because this domain is less well defined than other domains, there is more variability in assessment content and process. In newborns and the very young, examiners focus on behavioral style and temperament. Among the currently used standardized measures are the Infant Temperament Questionnaire (Carey & McDevitt, 1978) and the Toddler Temperament Scale (Fullard, McDevitt, & Carey, 1978). An observational scale, the Carolina Record of Individual Behavior (Simeonsson, Huntington, Short, & Ware, 1982), permits the assessment of various states.

Examiners more interested in social-interactional skills will find direct observation/recording techniques most useful, such as the Social Interaction Scan procedure (Odom et al., 1988). One of the few standardized measures, the Vineland Adaptive Behavior Scales (Sparrow, Balla, & Cicchetti, 1984), also covers a variety of skills.

Self-Help Skills

Self-help skills are likely to be divided into domains such as toileting, grooming, and dressing. These skills are generally observed in the appropriate context and are an important part of the curriculum for young children with disabilities. Items that assess self-help skills are not as specific to age scores (by month), but are almost entirely criterion-related. For IFSPs, examiners will find some age and standard scores on tests that are standardized, such as the Battelle Developmental Inventory (Newborg, Stock, Wnek, Guidubaldi, & Svinicki, 1984) and the Vineland, noted previously. In addition, self-help strands or subtests are likely to be found in the tests covered in the next section.


The one-test, multidomain assessment has become a common practice in programs for young children with disabilities. This type of test has been popular for 10 reasons:

1. An agency can purchase one test and have subtests in all domains.

2. The tests are inexpensive, compared with the price of five single-domain tests.

3. The tests can be administered by staff with limited testing experience.

4. The test items are either sequenced or grouped by domain and age; thus, they are easily translated into instructional objectives.

5. Curriculum packages frequently accompany the test.

6. The tests can be administered over and over again to the same child.

7. With some of the tests, it is possible to compare progress in one domain to that in another.

8. Some of the tests have items clustered as a screening test.

9. The tests often span several ages, providing continuity in assessment and evaluation over a period of years.

10. Because the tests are most often scored in months, it is easy to show positive gains, even when a child may be losing ground.
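
The last advantage listed above can be made concrete with a small worked example. The sketch below is illustrative only: the child, the scores, and the ratio quotient (age equivalent divided by chronological age, times 100) are assumptions introduced here, not data or a scoring method taken from any of the tests reviewed.

```python
def ratio_quotient(age_equivalent_months: float,
                   chronological_age_months: float) -> float:
    """Return the age equivalent as a percentage of chronological age.

    This ratio is an assumed, simplified index used only for illustration.
    """
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100.0 * age_equivalent_months / chronological_age_months


# Hypothetical child tested twice, 12 months apart (all values invented).
first = {"chronological": 12, "age_equivalent": 9}
second = {"chronological": 24, "age_equivalent": 15}

# Raw gain in age-equivalent months between the two testings.
gain_in_months = second["age_equivalent"] - first["age_equivalent"]

# Standing relative to chronological age at each testing.
q1 = ratio_quotient(first["age_equivalent"], first["chronological"])
q2 = ratio_quotient(second["age_equivalent"], second["chronological"])

print(f"Gain in age-equivalent months: {gain_in_months}")
print(f"Ratio quotient at first testing: {q1:.1f}")
print(f"Ratio quotient at second testing: {q2:.1f}")
```

Under these assumed numbers, the child gains 6 age-equivalent months over a 12-month interval, yet the ratio falls from 75.0 to 62.5. A report of scores in months alone would show a positive gain while the child's standing relative to chronological age declines.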

What, then, are the drawbacks? There are as many disadvantages as there are advantages:

1. These tests were developed to meet program needs; thus, authors were curriculum oriented rather than test oriented, an orientation that has implications for the validity of the basic concept of the test, as well as for the content of the test.

2. Many of the items in these tests were actually taken from a pool of developmental tests such as the Gesell Developmental Schedules (Gesell & Amatruda, 1947).

3. Age equivalents were also taken from developmental schedules and not from normative samples.

4. In some tests, age ranges are not equivalent; that is, the number of items and the effort needed to progress from a score of 18 months to a score of 24 months are simply not comparable to the demands of a 6-month move in other parts of the scale.

5. Similarly, the requirements in one domain at one age level may be far less or more demanding than the requirements at the same age level in another domain.

6. Because only a few of these tests are standardized, it is difficult to compare changes made when measured on one test to changes made on other tests.

7. Tests often offer several scoring options, decreasing the potential validity of the test results.

8. With items related to age levels, evaluators have no standard scores; thus, they resort to less valid ways to measure change, such as extrapolated scores and efficiency indexes.

9. Few of these tests were actually tried with normally developing children before they were used with children with disabilities.

10. Because the tests are often curriculum referenced, teachers tend to teach to the test and thus severely limit the facilitation of many appropriate, developmentally relevant skills.

Considering both strengths and weaknesses, what choice does one make? No one test should be the only source of information in the multidisciplinary evaluation of a child. Given that IFSP assessment data can be gathered either at the time of an evaluation or after a child begins receiving services, it is likely that some programs will want to use a one-test option for reporting IFSP data. However, two caveats are needed. First, the test selected should be consistent with the intervention program's philosophy of how children develop and how staff should facilitate learning. Second, examiners should be aware of the many weaknesses inherent in these tests.

Examples of assessment systems based on the one-test, multidomain model are the Battelle Developmental Inventory (Newborg et al., 1984), the Early Learning Accomplishment Profile (Glover, Preminger, & Sanford, 1978), the Early Intervention Development Profile (Schafer & Moersch, 1981), and the Griffiths' Mental Developmental Scales (Griffiths, 1954).


New trends in the assessment of infants and toddlers emerge from two sources: (a) Program staff discover new methods or alternative ways to elicit information from and about children, and (b) researchers provide new findings about child development, often suggesting parallels between intelligence and other behavior. Some trends emerging from applied settings are the use of the following:

* Play (Fewell, 1986; Fewell & Kaminski, 1988; Fewell & Vadasy, 1983; Linder, 1990).

* Ecological inventories (MacDonald, 1989).

* Arena assessment (Linder, 1990; Wolery & Dyk, 1984).

* Judgment-based assessment (Neisworth & Fewell, 1990).

* Adaptive assessment (Fewell & Sandall, 1986; Johnson-Martin, Jens, & Attermeier, 1986; Schafer & Moersch, 1981).

* Assessments of the child in interactions with family members and peers.

Researchers have shown us the relevance of visually presented novelty preferences (Fagan & Shepherd, 1987; Fagan & Singer, 1983); mastery motivation (Redding & Morgan, 1988); information processing (Zelazo, 1989); and infant emotional, sound, and gestural responses (Meltzoff & Kuhl, 1989; Hsu, Nwokah, Dobrowolska, & Fogel, 1990).

Researchers have suggested new, very exciting directions for assessment; however, far more must transpire before their work can be used effectively in early intervention programs. Any reader who has ever seen a mobile with black-and-white bull's-eye or checkerboard patterns over a baby's crib is aware that research can influence practice. Researchers' studies are usually limited to a small sample of normally developing children within a very limited age range; the studies require technical laboratory equipment and conditions, as well as training. Moreover, the findings can be narrow and specific, not likely generalizable to the functional needs of early intervention programs serving infants and toddlers with disabilities. Much study and experimentation is necessary before these techniques are practical for the assessment of children with disabilities.

More useful are the trends that have emerged from programs. By clustering these trends, it is possible to describe these contributions in more detail. Play, arena, and ecological testing share common traits and goals. All of these new directions have naturalistic formats in which the child is an active partner in the testing. For example, in the Play Assessment Scale (Fewell, 1986), the child selects a toy and interacts with it in his or her preferred way. It is the examiner's task to score interactions, making a quick, clinical interpretation of the interaction and determining where the action or its equivalent appears in the sequenced scale of play behavior. Arena assessment, also referred to as transdisciplinary play-based assessment (Linder, 1990), is an observational assessment in which staff from several disciplines focus on their particular domains within the context of play. These methods are functional and much easier to use than traditional standardized techniques in the assessment of children with disabilities. Demands are not made on the child in terms of what, how, when, or for how long he or she demonstrates skills. These naturalistic observational assessment procedures are often used before they have been carefully researched; thus, data from these measures must never be used for making decisions about diagnosis, services, or placement. These naturalistic formats can be quite valuable, however, in supplementing test information gathered through traditional methods.

Adaptive assessment refers to an array of techniques ranging from modifications in the administration of items (Schafer & Moersch, 1981; Fewell & Langley, 1984; Johnson-Martin et al., 1986) to the use of new technologies, such as computers (Schlater, Fewell, & Sandall, 1987), assistive devices, and interactive laser discs (Oregon State System of Higher Education, 1988). Item modifications have been a necessity to avoid blatantly penalizing a child who cannot see, hear, or make voluntary movements. P.L. 94-142 and P.L. 99-457 require that testing be appropriate and not penalize a child because of his or her disability. Though it is extremely important to ensure that this does not occur, it frequently forces an examiner to forgo item validity. Adaptations are seldom researched, and they usually occur spontaneously as the examiner spots the problem. A few test authors, such as those noted, have provided instructions on how to administer items to children with disabilities. None, however, has enough data to report norms; thus, the validity of the scores on these tests is diminished.

Certainly a trend in the 1990s will be increased use of computer and interactive laser disc technology in assessment. Test catalogues are beginning to include software sections that offer computer test-taking options. The Seattle Inventory of Early Learning Software (SIELS) (Schlater et al., 1987) is an example of an individualized, computer-generated test and curriculum package. SIELS was designed to assist large agencies in serving children and families in isolated areas so that parents can test and instruct their child in their home under the tutelage of a master agency staff member.

Without some form of testing, it is unlikely that systems can provide infants and toddlers and their families with intervention services. It is critical that the testing planned is appropriate to answer the questions posed. Does a child have special needs? If so, what are they? What and how should this child be taught? What included in the intervention was effective? Different audiences ask these questions. Different kinds of tests answer these questions. Tests that supply the answers an administrator must have fail to answer the questions teachers, staff, and parents raise. Research and experience are directing test authors to pursue new directions that will lead to more appropriate, functional, and accurate testing. There has been a decided shift from product-oriented testing to process-oriented testing that will enable examiners to be more sensitive to individual differences. Until new, valid, and reliable methodologies are developed, examiners should continue to select the most appropriate of the traditional measures. At the same time, practitioners should explore the newer directions that appear to be more effective in conveying accurate samples of the behaviors of children with disabilities and help us determine our instructional goals and teaching strategies.


Title Annotation:Special Issue: Trends and Issues in Early Intervention; analysis in the tests used to assess handicapped children
Author:Fewell, Rebecca R.
Publication:Exceptional Children
Date:Oct 1, 1991

