What do alternate assessments of alternate academic achievement standards measure? A multitrait-multimethod analysis.
The technical soundness of AA-AASs, however, remains an area of concern. Basic questions about the constructs measured and their relationship to other measures of achievement remain largely unsubstantiated by rigorous research and validation studies. The paucity of published studies or documentary evidence for validity in states' AA-AAS technical manuals supports this assertion. The National Study of Alternate Assessments (NSAA) report (SRI International, 2009) provides a comprehensive descriptive summary of key attributes of AA-AASs and resulting accountability data for each state. The NSAA indicates that directors of AA-AASs in 41% of the states and one territory reported conducting a formal study to document that test and item scores are related to internal or external variables as intended. The NSAA also reported that in 59% of the states, a formal study had documented the construct relevance of their test. The information, however, is not widely available.
According to the U.S. Department of Education's nonregulatory document Alternate Academic Achievement Standards for Students With the Most Significant Cognitive Disabilities, "An alternate assessment must be aligned with the state's content standards, must yield results separately in both reading/language arts and mathematics, and must be designed and implemented in a manner that supports use of the results as an indicator of AYP [adequate yearly progress]" (U.S. Department of Education, 2005, p. 15).
The AA-AASs are an important component of each state's assessment system and must meet the federal regulations outlined in Title I of the Elementary and Secondary Education Act (1965). The AA-AASs must also meet standards of high technical quality--reliability, validity, accessibility, objectivity, and consistency--expected of other educational tests (i.e., Standards for Educational and Psychological Testing, American Educational Research Association [AERA], American Psychological Association [APA], and National Council on Measurement in Education [NCME], 1999).
In addition, AA-AASs must have the following:
* An explicit structure.
* Guidelines for determining which students may participate.
* Clearly defined scoring criteria and procedures.
* A report format that communicates student performances in terms of academic achievement standards.
If the AA-AASs meet the required standards for technical quality and use, then educators can report the results of AA-AASs for up to 1% of the total student population for AYP purposes. In January 2009, the U.S. Department of Education published the Standards and Assessment Peer Review Guidance: Information and Examples for Meeting Requirements of the No Child Left Behind Act of 2001. This document extends the Standards for Educational and Psychological Testing (AERA et al., 1999) and provides even more specific guidance concerning validity evidence for AA-AASs. For example, the Technical Quality subsection [4.1] of the Peer Review Guidance document specifically asks the following:
(b) Has the State ascertained that the assessments, including alternate assessments, are measuring the knowledge and skills described in its academic content standards and not knowledge, skills, or other characteristics that are not specified in the academic content standards or grade level expectations?
(e) Has the State ascertained that test and item scores are related to outside variables as intended (e.g., scores are correlated strongly with relevant measures of academic achievement and are weakly correlated, if at all, with irrelevant characteristics, such as demographics)? (p. 35)
Towles-Reeves, Kleinert, and Muhomba (2009) have further documented this lack of evidence. In a recent review of research on AA-AASs, these authors identified 23 empirical studies completed since 2003. Specifically, Towles-Reeves et al. lament that "there is considerably less research that has examined the extent to which actual student scores were associated with empirically verified instructional or other outcome variables" (p. 245). These authors called for "future research to investigate the relationship between AA-AASs (regardless of approach: portfolios, performance assessments, and checklists) and another accepted measure of student learning" (p. 246). They concluded, "there is no evidence to support the correlation of alternate assessments with other accepted measures of student learning" (p. 246). This claim is a serious one and should cause all users of these assessments to be cautious when interpreting AA-AAS scores.
PREVIOUS RESEARCH ON AA-AASs
Some published evidence for the validity of the constructs that AA-AASs have measured does exist, but Towles-Reeves et al. (2009) did not review it. A validation study of the Idaho Alternate Assessment (IAA; Idaho Department of Education, 1999) scores focused on evidence about the underlying construct being measured (Elliott, Compton, & Roach, 2007). That study examined the relationships between ratings on the IAA for students with significant disabilities, corresponding scores on the general assessment, and ratings on two norm-referenced teacher rating scales: the Academic Competence Evaluation Scales (ACES; DiPerna & Elliott, 2000) and the Vineland Adaptive Behavior Scales (VABS; Sparrow, Balla, & Cicchetti, 1985). The study investigated IAA performance for a representative group of students with disabilities (N = 116) who, according to their individualized education program (IEP) teams, were eligible (SWD-Es) and participated in the state's alternate assessment, as well as for another group of students who had disabilities (N = 54) but were not officially eligible (SWD-NEs) for the alternate assessment. The study assessed both groups of students with the IAA and compared the students' results with other indirect assessments of performance, all of which were teacher-completed measures. The researchers included SWD-NEs as a control group for two reasons: (a) so that they could explore the relationship between the state's regular assessment (Idaho Standards Achievement Tests, ISAT; Idaho Department of Education, 2008) with accommodations and the IAA with a sample that could complete the ISAT; and (b) because the performance of SWD-NEs better matched that of SWD-Es than the performance of a group of students without disabilities would have. This analysis is critical because AA-AASs, like general assessments, are expected to focus on academic content.
We examined this seminal alternate assessment study in detail because it provided the basis for the design of the present study.
The evidence of interest in Elliott et al. (2007) concerned relationships between the constructs measured by the IAA and two other types of variables: (a) the ISAT, and (b) established rating scale measures of academic competence (ACES) and adaptive behavior (VABS). Correlations calculated for SWD-NEs between the IAA and the ISAT were in the medium (reading, language arts) or large (mathematics) ranges within content areas, but correlations also tended to be in these ranges when calculated across content areas (e.g., r between AA-AAS mathematics and ISAT reading = .67). When calculated for the entire sample, IAA reading, language arts, and mathematics scales all shared more variance with measures of adaptive behavior and academic enablers than with measures of academic skills. The correlations for SWD-NEs tended to be about twice as large as the same IAA to ACES Academic Skills relations for the SWD-Es.
Elliott et al. (2007) concluded that the evidence to support the validity of the IAA was mixed, yet on balance promising. The relationship between the reading, language arts, and mathematics achievement level ratings on the IAA and the concurrent scores on the ACES Academic Skills scales for the eligible students varied across grade clusters but in general were medium at best. When the researchers examined correlations for the same score relationships for the not-eligible students, the magnitude of the correlations increased noticeably. Collectively, these findings furnished evidence that the IAA scales measure skills indicative of the academic content characterized in the state's content standards. The medium to large correlations between the IAA and ISAT for the not-eligible students further reinforced that point. The evidence for both groups of students supports the validity of the IAA scores. Although the correlations among academic skills on the IAA and other measures indicated a meaningful amount of shared variance (i.e., 20% to 40%), in some cases, particularly at the elementary grade levels, there was more shared variance with the academic enabling and adaptive behavior constructs.
PURPOSE OF THE PRESENT RESEARCH: A MULTISTATE REPLICATION STUDY
The IAA validity study that Elliott et al. (2007) conducted served as the model for the present multistate investigation. The participating states in the current study all used a comprehensive rating-scale approach to AA-AASs. Each of the alternate assessments had been aligned with the state's grade-level extended standards and thus designed to focus on academic skills. To understand each of these states' alternate assessments, the current study incorporated a multitrait-multimethod (MTMM) design to determine the relationship among the AA-AAS, the state's general achievement test, and two established teacher-based rating scales.
Evidence based on relationships with other variables is one of five main types of validity evidence that the Standards for Educational and Psychological Testing (AERA et al., 1999) addresses. This evidence includes the degree to which scores from an instrument converge with indicators of similar constructs (convergent validity) and diverge from indicators of dissimilar constructs (divergent validity), as well as the degree to which the scores share no relationship with indicators of unrelated constructs (discriminant validity).
Campbell and Fiske (1959) suggested an approach by which researchers could use scores from multiple methods that were indicative of multiple traits as evidence for the validity of a new measure. The multitrait-multimethod (MTMM) approach allows for an integrative multivariate framework within which researchers can systematically gather information about convergent and discriminant validity in a single study. The MTMM approach is also useful for providing evidence about the construct being measured. In the current study, multiple traits that we considered included academic performance, academic skills, academic enablers, and adaptive behavior. The multiple methods considered included individually administered achievement tests and teacher-completed rating scales.
The method of choice for characterizing evidence based on relationships with other variables, as well as for completing an MTMM matrix, is often the Pearson correlation. The magnitude of a correlation can be considered an effect size, and the square of the correlation is the amount of variance shared between the two sets of scores. The social sciences generally use Cohen's (1992) guidelines for classifying effect sizes, with a medium effect size (r = .30 or r = -.30) intended as an effect visible to the naked eye. Cohen's suggestions for small (r = .10 or r = -.10) and large (r = .50 or r = -.50) effect sizes were to be noticeably and equally smaller and larger, respectively. Researchers typically consider the values that Cohen suggested as the inner boundaries of each range, such that small positive correlations are between .10 and .30. Hopkins (2001) extended Cohen's framework to include nonexistent (r = .00), very large (r = .70 or r = -.70), and nearly perfect correlations (r = .90 or r = -.90). The range for nearly perfect correlations is consistent with indicators of acceptable reliability for scores from a single measure and is higher than researchers would expect for measures of different traits (e.g., a nationally normed rating scale and an AA-AAS based on state content standards) or for measures using different methods (e.g., an AA-AAS and a test used for the general assessment).
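Cohen's and Hopkins's cut points, together with the correlation-to-shared-variance conversion, can be expressed as a small helper (the function names below are ours, for illustration only):

```python
# Hopkins's (2001) extension of Cohen's (1992) bands, with each value
# treated as the inner boundary of its range, as described in the text.
BANDS = [
    (0.90, "nearly perfect"),
    (0.70, "very large"),
    (0.50, "large"),
    (0.30, "medium"),
    (0.10, "small"),
    (0.00, "nonexistent"),
]

def classify(r: float) -> str:
    """Classify a Pearson correlation by its magnitude |r|."""
    magnitude = abs(r)
    for cutoff, label in BANDS:
        if magnitude >= cutoff:
            return label
    return "nonexistent"

def shared_variance(r: float) -> float:
    """r squared: the proportion of variance shared by the two score sets."""
    return r ** 2

# The cross-content example from the text: r = .67 between
# AA-AAS mathematics and ISAT reading.
print(classify(0.67))                    # large
print(round(shared_variance(0.67), 2))   # 0.45
```

Note that a correlation in the large range (r = .67) corresponds to roughly 45% shared variance, which is why the text interprets correlations of .45 to .63 as 20% to 40% shared variance.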
Legislation and research related to AA-AASs have inspired two research questions about the constructs that these assessments measure. We used an MTMM replication design when we examined evidence to address the following questions across multiple states:
1. Do the AA-AAS subscale scores measure distinct content areas that correlate with the same content areas on each state's general assessment when researchers use both on a common sample?
2. Which constructs (academic skills, academic enablers, adaptive skills) do AA-AASs measure when used with students with disabilities who are eligible (SWD-Es) and students with disabilities who are not eligible (SWD-NEs) across six states?
The sample included 402 elementary school students (in third grade through fifth grade) and 317 middle school students (in sixth grade through eighth grade) from six states, balanced across two groups: SWD-Es (n = 361) and SWD-NEs (n = 358). The sample included a higher number of male students (n = 444) than female students (n = 274). Participants were primarily European American (n = 433), African American (n = 83), and Latino American (n = 115) students. These demographics were consistent across grade bands and across eligibility groups. The sample consisted of students from the following six states: Indiana (n = 284), Arizona (n = 131), Nevada (n = 104), Idaho (n = 81), Mississippi (n = 49), and Hawaii (n = 46). Table 1 depicts these demographics disaggregated by grade band and eligibility group.
Participants represented 13 different disability categories. Students identified with a specific learning disability comprised the largest proportion of SWD-NEs. Students identified with mental retardation comprised the largest proportion of SWD-Es. Other highly represented categories for SWD-Es were autism and multiple disabilities. There were no major differences in disability category representation across grade bands. Table 2 contains disability category data for the current sample disaggregated by grade band and eligibility status.
Measures used in the current study included AA-AASs from all six states; general achievement tests from the two states (Indiana and Idaho) that had the largest sample sizes across groups; and established measures of academic skills, academic enablers, and adaptive behavior. Alignment studies of all the AA-AASs with their states' extended content standards had been completed to ensure that the assessments focused on academic skills.
Alternate Assessments of Alternate Academic Achievement Standards (AA-AASs). The Arizona Instrument to Measure Standards-Alternate (AIMS-A; Arizona Department of Education & Elliott, 2006) is an assessment based on rating scale, performance task, and multiple-choice components. Each student is assessed on all three content standards in reading (on one of two forms) and all five content standards in mathematics. Each student's special education teacher administers the AIMS-A to him or her. This instrument incorporates teacher judgment in decisions concerning administration and interpreting and reporting scores. Performance evidence is based on state-required standardized scales, tasks, and items. A team with representatives from the state department of education, researchers, parents, and other stakeholders developed the AIMS-A. The AIMS-A technical manual reports internal consistency as an estimate for reliability. The technical manual also reports some evidence for validity based on relationships with other variables, internal structure, and consequences of the assessment.
The Hawaii State Alternate Assessment (HSAA; Hawaii Department of Education, 2007) is a rating scale with a portfolio of evidence submitted for independent scoring. Each student is assessed on all three content standards in language arts and all five content standards in mathematics. Each student's special education teacher and a certified educator who is not the student's teacher administer the HSAA to him or her. This instrument incorporates teacher judgment into selecting materials, making decisions concerning administration, and interpreting and reporting scores. The ratings are based on a diverse set of evidence, including student work samples, collected through both daily and on-demand approaches. A team with representatives from the state department of education and from an assessment company, researchers, parents, and other stakeholders developed the HSAA. Internal consistency and interrater scoring consistency were calculated as estimates of reliability. A comprehensive HSAA technical manual reports evidence for validity based on relationships with demographic variables, internal structure, and consequences of the assessment.
The Idaho Alternate Assessments (IAA; Idaho Department of Education, 2005) are evidence-based rating scales delivered online, and the U.S. Department of Education approved them in 2003. This instrument assesses each student on all five content standards in reading/language arts and all seven content standards in mathematics. Each student's special education teacher administers the IAA to her or him. This instrument incorporates teacher judgment into selecting materials, making decisions concerning administration, and interpreting and reporting scores. The ratings are based on a diverse set of evidence, including student work samples, collected through both daily and on-demand approaches. A team with representatives from the state department of education, researchers, parents, and other stakeholders developed the IAA. The use of coefficient alphas at the scale level and interrater agreement at the item level established the reliability of the IAA rating scales. The coefficient alphas for the scales ranged from .84 to .94. The mean interrater agreement approached 85%. In Elliott et al. (2007), the mean interrater agreement across each scale was 93%. Roach (2003) determined that the item content for the IAA aligned well with Idaho's general education content standards. The IAA technical manual also provides evidence for validity based on relationships with other variables, internal structure, and consequences of the assessment.
The Indiana Standards Tool for Alternate Reporting (ISTAR; Indiana Department of Education, 2009a) is a rating scale designed as a measure of academic and functional progress from birth to employment, and the U.S. Department of Education approved it in 2006. This assessment evaluates each student on all seven content standards in language arts and all seven content standards in mathematics. The items on the ISTAR are derivatives from the Indiana State Academic Standards and the extensions or foundations to the standards in the areas of language arts and mathematics. The range of items at each level depends on the content of the state standards. The student's special education teacher and a certified educator who is not the student's teacher administer the ISTAR. This instrument incorporates teacher judgment into selecting content and materials, making decisions concerning administration, and interpreting and reporting scores. The ratings are based on a diverse set of evidence, including student work samples and teacher recollection of student performance. A team with representatives from the state department of education, researchers, parents, and other stakeholders developed the ISTAR. The use of multiple raters established the reliability of the ISTAR. The ISTAR technical manual reports interrater reliability by using intraclass correlations between raters' scores, which ranged from .85 to .99 with a mean of .95, and by using the Kappa statistic and percent agreement. With regard to validity evidence based on relationships with other variables, five statistical procedures were conducted to examine the variability of groups assessed with the ISTAR instrument, with the consistent finding of discrimination between groups of students at differing grade bands and between students with and without disabilities. The ISTAR technical manual also provides evidence for validity based on internal structure and consequences of the assessment.
The Mississippi Alternate Assessment of Extended Curriculum Frameworks (MAAECF; Mississippi Department of Education, 2009) consists of evidence-based rating scales with portfolios submitted for independent scoring. Each student is assessed on all four content standards in language arts and on content standards in mathematics that vary by grade. Each student's special education teacher administers the MAAECF, and both the student's teacher and a school- or district-based educator score it. This instrument incorporates teacher judgment into selecting materials, making decisions concerning administration, and interpreting and reporting scores. The ratings are based on a diverse set of evidence, including student work samples, and are collected through daily approaches, guided by state instructions regarding the type and amount of evidence. A team with representatives from the state department of education, researchers, and other stakeholders developed the MAAECF. Internal consistency and interrater scoring consistency were calculated as estimates of reliability. The MAAECF technical manual provides evidence for validity based on relationships with demographic variables, alignment to content standards, internal structure, and consequences of the assessment.
The Nevada Alternate Scales of Academic Achievement (NASAA; Nevada Department of Education, 2009) are performance-based assessments in which each student demonstrates two different benchmark skills for each of three language arts standards and three mathematics standards. Thus, there are six total performance measures for each student for language arts and six for mathematics. The student's teacher administers the NASAA after instruction, and it is videotaped. This instrument incorporates teacher judgment into selection of content and materials, decisions concerning administration, and interpreting and reporting scores. A second, independent team of raters scores the videotaped performances, as well. A diverse set of evidence, including student work samples collected through both daily and on-demand approaches and based on state-provided instructions, forms the basis of the ratings. A team with representatives from the state department of education, an assessment company, researchers, parents, and other stakeholders developed the NASAA. Interrater scoring consistency was calculated as an estimate of reliability. The NASAA technical manual also furnishes evidence for validity based on a content analysis, relationships with demographic variables, and consequences of the assessment.
Grade-Level General Assessments for Idaho and Indiana. The Idaho Standards Achievement Tests (ISAT; Idaho Department of Education, 2008) are multiple-choice assessments of student achievement in reading, language arts, and mathematics. The reading assessment consists of two components: reading process and comprehension/interpretation. The writing assessment also includes two components: writing process and writing components. The mathematics assessment consists of five components: numbers and operations; concepts and principles of measurement; concepts and language of algebra and functions; principles of geometry; and data analysis, probability, and statistics. State standards across these areas determined item content. Internal consistency was calculated as an estimate of reliability. Coefficient alphas ranged from .80 to .91 for the grade bands included in the current study and were slightly higher in reading and mathematics than in language arts. Evidence for validity based on content and based on internal structure was also collected.
The Indiana Statewide Testing for Educational Progress-Plus (ISTEP+; Indiana Department of Education, 2009b) is an assessment that includes multiple-choice, short answer, and essay questions to measure student achievement in language arts and mathematics. Assessment in both language arts and mathematics includes sections on both basic and applied skills. State standards across these areas determined item content. Internal consistency was calculated as an estimate of reliability. Coefficient alphas ranged from .90 to .94 for the grade bands included in the current study. Evidence for validity based on content and based on relationships with other variables was also collected.
Established Rating Scale Measures of Related Constructs. The Academic Competence Evaluation Scales (ACES; DiPerna & Elliott, 2000) measure students' skills, attitudes, and behaviors that contribute to academic competence. The teacher version of the ACES is an 81-item questionnaire with two separate scales (Academic Skills and Academic Enablers). The Academic Skills scale includes three subscales (reading/language arts, mathematics, and critical thinking), and the Academic Enablers scale includes four subscales (interpersonal skills, motivation, study skills, and engagement). Teachers rate items on the basis of the level of the students' academic skills compared with grade-band expectations from 1 (far below) to 5 (far above). Teachers rate the existence/frequency of academically enabling skills from 1 (never) to 5 (almost always). This instrument categorizes sum scores at the scale and subscale levels into developing (weaknesses), competent, and advanced (strengths) ranges. Coefficient alpha for ACES has a mean of .99 on the Academic Skills and Academic Enablers scales across grade bands. The test--retest reliability of ACES over a 2- to 3-week interval ranges from .88 to .97. The reported standard error of measurement for the Academic Skills scale ranges from 2.5 to 3.1; and for the Academic Enablers scale, it ranges from 3.6 to 4.7. The developers also examined validity evidence based on test content, internal structure, relationships with other variables, and the consequences of testing. Two scales, Academic Skills and Academic Enablers, were derived from factor analysis. In relation to other measures of standardized achievement (e.g., Iowa Tests of Basic Skills; University of Iowa College of Education, 2009) and behavior (e.g., Social Skills Rating System; Gresham & Elliott, 1990), the ACES also demonstrated solid evidence for convergent, discriminant, and criterion-related validity.
The Vineland Adaptive Behavior Scales, 2nd Edition (VABS-II; Sparrow, Cicchetti, & Balla, 2006) were designed to assess individuals with and without disabilities from birth to adulthood in four domains: communication, daily living, socialization, and motor skills. The classroom edition form used in this study has 244 items. This instrument categorizes sum scores for composites and domains into low (at least two standard deviations below the mean), moderately low (one to two standard deviations below the mean), adequate (within a standard deviation of the mean), moderately high (one to two standard deviations above the mean), and high (at least two standard deviations above the mean) adaptive levels. The VABS-II is a widely used instrument, which was standardized on 3,000 individuals ranging in age from birth to 19 years and representing a diverse demographic population. The reliability of the VABS-II is adequate for the four domains but is poor for some of the subscales within each domain. Median split-half reliability coefficients across ages range from .83 for motor skills to .90 for daily living skills. Interrater reliability for the domains is lower and ranges from .62 to .78. The standard errors of measurement for the various scales range from 3.4 to 6.6 (depending on age).
Participant recruitment and data collection took place across all six states during spring 2007. Educators evaluated all participants in the study by using the state's AA-AAS, the ACES, and the VABS-II. Trained raters in their home schools scored the performances of the SWD-NEs on their states' AA-AASs. In most cases, the scorer was the actual teacher of the students, but if that teacher had not received training to use the alternate assessment, a special education assessment coordinator responsible for scorer training completed the scoring. In all states, the educators followed the steps required for reliable ratings. Educators administered the ACES and the VABS-II during the week before or just after the completion of each state's AA-AAS, and the member of the research team who represented each state's department of education scored the ACES and the VABS-II. Only SWD-NEs participated in each state's general assessment, as part of their regular statewide achievement testing.
Because of the nature of the research questions, the data analyses were correlational and exploratory, intended to provide quantitative indexes of the relationships among various academic skills and related behaviors. Calculations of the means and standard deviations on the ACES and VABS-II facilitated comparison of the SWD-Es and SWD-NEs. Pearson correlations between each state's AA-AAS and the established, norm-referenced assessments characterized the validity evidence on the basis of relations with other variables. Because of variability in recruiting success, not all states had large numbers of participants available for each analysis. We excluded from these analyses cells that would have had fewer than 10 participants. We examined correlations between the same scales for similarity across grade clusters, as well as between eligibility groups, providing a form of within-study replication. In addition to the magnitude of the correlations, we provide information about statistical significance to facilitate comparisons and clarify the probability that such correlations could occur by chance. Significance tests were conducted at [alpha] = .05 and were one-tailed, based on the general prediction that all of the relationships would be positive, regardless of magnitude.
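As one illustration of this kind of test, the sketch below computes a one-tailed p value for a Pearson correlation. The study does not specify its computational procedure, so the use of the Fisher z approximation here is an assumption on our part, and the r and n values are placeholders rather than figures from the study's tables:

```python
from math import atanh, sqrt
from statistics import NormalDist

def one_tailed_p(r: float, n: int) -> float:
    """One-tailed p value for H0: rho = 0 against rho > 0, using the
    Fisher z approximation (appropriate for n > 3 and |r| < 1)."""
    z = atanh(r) * sqrt(n - 3)          # Fisher z, scaled by sqrt(n - 3)
    return 1 - NormalDist().cdf(z)      # upper-tail probability

# Illustrative values only (not taken from the study's tables).
for r, n in [(0.67, 54), (0.15, 54), (0.15, 12)]:
    p = one_tailed_p(r, n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"r = {r}, n = {n}: p = {p:.4f} ({verdict})")
```

The loop illustrates why sample size matters for these comparisons: the same small correlation can be significant with a large cell but not with a small one, which is consistent with the study's decision to exclude cells with too few participants.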
Descriptive analyses of established measures of academic skills, academic enablers, and adaptive behavior indicated that SWD-Es and SWD-NEs were distinct groups in the current sample. Among SWD-Es, the mean scores for ACES Academic Skills at both grade bands were in the developing range. The standard deviation among scores for SWD-Es was about half the standard deviation among scores for SWD-NEs, reflecting a restriction of range, because almost all the SWD-Es scored at the very low end of the ACES. The mean Academic Skills scores for SWD-NEs were in the developing range at both grade bands but were higher than the mean scores for SWD-Es. Using the pooled standard deviation at each grade band, we found that SWD-NEs in elementary school scored 1.39 standard deviations higher on Academic Skills and that SWD-NEs in middle school scored 1.03 standard deviations higher. Table 3 depicts means and standard deviations on the established measures of academic skills, as well as academic enablers and adaptive behavior, across grade bands and eligibility groups.
The pattern between groups on ACES Academic Enablers was similar. Among SWD-Es, the mean scores for Academic Enablers at both grade bands were in the developing range. These scores were higher in the developing range than were the scores for Academic Skills; and the standard deviations were consistent across groups, indicating that range restriction was not an issue when considering Academic Enablers. The mean Academic Enablers scores for SWD-NEs were in the competent range and were higher at both grade bands than the mean scores for SWD-Es. Using the pooled standard deviation at each grade band, we found that SWD-NEs in elementary school scored .95 standard deviations higher on Academic Enablers and that SWD-NEs in middle school scored .49 standard deviations higher.
The pattern between SWD-Es and SWD-NEs on the VABS-II also indicated nonoverlapping groups. Among SWD-Es, the mean scores on the Adaptive Behavior composite at both grade bands were at the low level, nearly three standard deviations below the normative mean of 100. The mean Adaptive Behavior scores for SWD-NEs were at the adequate level and were higher at both grade bands than were the mean scores for SWD-Es. Using the normative standard deviation of 15, we found that SWD-NEs in elementary school scored 2.18 standard deviations higher on Adaptive Behavior than SWD-Es and that SWD-NEs in middle school scored 1.51 standard deviations higher than SWD-Es.
RELATIONSHIPS AMONG AA-AAS SUBSCALES AND WITH GENERAL ACHIEVEMENT TESTS
The correlations between language arts or reading and mathematics within AA-AASs tended to be in the very large range or higher (11 of 14 coefficients) across grade bands and states. The only exceptions to this trend occurred among Nevada students at both grade bands (medium range) and Idaho students at the middle school grade band (large range). Table 4 provides a detailed account of the correlations between the reading and mathematics subscales within each state's AA-AAS at the various grade bands. In Idaho and Indiana, where samples were large enough to meaningfully disaggregate by group, this trend was prevalent for both SWD-Es (four of six coefficients in the very large range or higher) and for SWD-NEs (five of six coefficients in the very large range or higher). Idaho was the only state that had scores for both reading and language arts; correlations were in the very large range or higher across groups and grade bands, except for the middle school SWD-E sample, in which this correlation was in the medium range.
Idaho and Indiana also had samples of SWD-NEs that were large enough for interpreting correlations between alternate assessment scores and general assessment scores. The strength of these correlations varied, and the coefficients between content areas that were intended to represent the same construct were not systematically different from the coefficients between content areas that were intended to represent distinct constructs. Among 10 correlations between tests that were designed to measure the same construct, 6 coefficients were in the medium range and 4 coefficients were in the small range. Among 16 correlations between tests that were designed to measure distinct constructs, 8 were in the medium range or higher, 5 were in the small range, and 3 were in the nonexistent range. Table 5 depicts correlations between AA-AASs and general assessments in Idaho and Indiana.
RELATIONSHIPS AMONG AA-AAS SCORES AND SCORES FROM ESTABLISHED MEASURES OF RELATED CONSTRUCTS
Correlations between AA-AAS scores and ACES Academic Skills scores tended to be in the large range or higher (17 of 25 coefficients). This trend held for reading and language arts (9 of 14 coefficients in the large range or higher) and for mathematics (8 of 11 in the large range or higher). It also held at the elementary school grade band, in which 10 of 13 correlations were in the large range or higher. The correlations were spread more evenly at the middle school grade band, with 4 of 12 coefficients in the medium range, 3 in the large range, and 4 in the very large range. Exceptions to these trends included Nevada in both content areas at the elementary school grade band, one of two Arizona reading assessments at the middle school grade band, and Indiana language arts at the elementary school grade band. These exceptions were in the small or nonexistent ranges. Table 6 depicts the correlations between AA-AAS subscales and established measures. Sample sizes were large enough to disaggregate these correlation coefficients by eligibility status in Idaho and Indiana, but no consistent differences between SWD-Es and SWD-NEs from these states were observable. For SWD-Es, these correlations were spread, with 4 of these 10 coefficients in the medium range and 3 in the small range. For SWD-NEs, 5 of 10 coefficients were in the small range, with the other coefficients spread across ranges.
A great deal of variation was observed for correlations between AA-AAS scores and ACES Academic Enabler scores. In language arts and reading, 5 of 14 coefficients were in the medium range, 3 were in the large range, and 4 were in the very large range. In mathematics, correlations tended to be in the medium (4 of 11 coefficients) or large (3 of 11 coefficients) ranges. Variation was great within grade bands. At the elementary school grade band, 5 of 13 coefficients were in the very large range, but another 4 coefficients were in the medium range. Correlations at the middle school grade band tended to be in the medium or large ranges (9 of 12 coefficients). Separate trend analyses were done by state when considering the relationship between AA-AAS scores and ACES Academic Enablers scores by eligibility group. In Idaho, correlations at the elementary school grade band were in the large range (3 of 3 coefficients) for SWD-Es but were in the medium range (3 of 3 coefficients) for SWD-NEs. All six correlation coefficients at the middle school grade band across eligibility groups and content areas were in the small or medium ranges. In Indiana, correlations were in the small range (5 of 8 coefficients) across eligibility groups, with some correlations in the medium (2 of 8 coefficients) or nonexistent (1 of 8) ranges, but without any clear trend between eligibility groups.
Correlations between AA-AAS scores and VABS-II Adaptive Behavior composite scores tended to be in the very large range or higher (15 of 25 coefficients). This trend held at the elementary school grade band (8 of 13 coefficients in the very large range or higher) and at the middle school grade band (7 of 12 coefficients in the very large range and the remaining 5 in the large range). This trend also held for reading and language arts (9 of 14 coefficients in the very large range) and for mathematics (6 of 11 coefficients in the very large range or higher). Exceptions to this trend were all at the elementary school grade band: Indiana language arts (medium range), Nevada language arts (small range), and Nevada mathematics (medium range).
In Idaho and Indiana, correlations between AA-AAS scores and the VABS-II Adaptive Behavior Composite were higher for SWD-Es than for SWD-NEs in 11 out of 12 comparisons. These correlations for SWD-Es were similar to those for the sample aggregated across groups, with 11 of 12 coefficients in the large range or higher. The correlation coefficients among SWD-NEs were primarily in the medium range (10 of 12 coefficients).
Correlations were also calculated between the ACES scales and the VABS-II. The correlation between the VABS-II Adaptive Behavior Composite and the ACES Academic Enablers scores was in the very large range for SWD-Es (r = .70) and in the large range for SWD-NEs (r = .67). The correlation between the Adaptive Behavior Composite and ACES Academic Skills scores was in the large range for SWD-Es (r = .50) and in the medium range (r = .44) for SWD-NEs. The correlations between ACES Academic Enablers and ACES Academic Skills scores were in the medium range for both groups (r = .40 for SWD-Es and r = .37 for SWD-NEs).
Very little published research examines the constructs measured by AA-AASs. This is partially attributable to the challenges of assessing the student population for whom alternate assessments are intended and, in some states, to the lack of adequate sample sizes for conducting MTMM studies. The current study addresses these limitations and represents a response to the call by Towles-Reeves et al. (2009) for empirical research that examines the relationships among student scores on AA-AASs and other established measures of student achievement. This study generally replicated the Elliott et al. (2007) study and extended its findings to alternate assessments in five additional states. The study used a known-groups sample to examine the relationships between these states' alternate assessments and grade-level tests of general content standards, along with an established measure of achievement, the ACES, that yields scores based on a large national normative sample. These results go beyond the call by Towles-Reeves et al. by providing an indication of the magnitude of relations among the constructs measured by AA-AASs and the construct of adaptive behavior, as measured by the VABS-II. To the extent that the current accountability legislation demands assessments that clearly measure the core academic domains, validity studies of AA-AASs should result in the refinement of assessment instruments, with the ultimate intent of measuring constructs that strongly correlate with measures of academic skills and less strongly correlate with measures of adaptive behavior. As a means of reviewing and summarizing the key findings of this multistate validity study, we revisit and furnish data-based answers to our two motivating questions.
DO THE AA-AAS SUBSCALE SCORES MEASURE DISTINCT CONTENT AREAS THAT CORRELATE WITH THE SAME CONTENT AREAS ON EACH STATE'S GENERAL ASSESSMENT, WHEN BOTH ARE USED ON A COMMON SAMPLE?
In most states, the relationships among content areas (typically the correlation between reading and mathematics) within the AA-AAS are in the range that would be acceptable reliability coefficients for a single, unitary construct. This finding could indicate that some degree of success in one of the content areas (likely reading) is a prerequisite for success in the other content area (mathematics) and that for students in this sample, variation in the prerequisite skill explains all the variation in both sets of scores. The pattern of correlations between reading and mathematics scores being lower at the middle school band than at the elementary school band supports this explanation. A second possibility is that both reading and mathematics are measuring a third construct, such as adaptive behavior. High correlations between reading and mathematics scores support this explanation not only for SWD-Es but also for SWD-NEs, a group of students who exhibited academic skills, academic enablers, and adaptive behavior much closer to the normative mean on validated measures. One important caveat to this interpretation is that regardless of the magnitude of a correlation, it is only one of multiple pieces of evidence required to show that a single construct is being measured.
When considering SWD-NEs, the general assessment scores and AA-AAS scores in the current sample do not share much variance. In only one case does any combination of the scores from Idaho or Indiana share more than 25% of the variance, and that is the relationship between language arts on the general assessment and mathematics on the AA-AAS. There is no pattern of scores from within content areas (e.g., general assessment reading and AA-AAS reading) sharing any stronger relationships than do combinations across constructs (e.g., general assessment reading and AA-AAS mathematics). This finding is consistent with the high correlations observed between AA-AAS reading and mathematics scores. Although relationships between general assessment scores and AA-AAS scores in the Elliott et al. (2007) study were stronger (8 of 9 coefficients in the medium range or higher), there likewise existed no pattern of same content areas sharing stronger relationships. These results indicate that when taken by SWD-NEs, the general assessment scores reflect a very different construct than the one reflected in the AA-AAS scores. It is important to remember that the alternate assessments are not designed to be used with SWD-NEs, and these correlations may be limited by a restriction of range, as students in this group generally have high scores when evaluated on AA-AASs.
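The shared-variance language above refers to the squared correlation. A minimal sketch, using the one coefficient from Table 5 that exceeds the 25% threshold:

```python
# Shared variance between two measures equals the squared correlation.
# r = .68 is the Idaho middle school correlation between general-assessment
# language arts and AA-AAS mathematics (Table 5), the only coefficient
# in that table whose square exceeds .25.
r = 0.68
shared_variance = r ** 2          # 0.4624, i.e., about 46% shared variance
exceeds_threshold = shared_variance > 0.25
```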
WHICH CONSTRUCTS (ACADEMIC SKILLS, ACADEMIC ENABLERS, ADAPTIVE SKILLS) DO AA-AASs MEASURE WHEN USED WITH SWD-Es AND SWD-NEs ACROSS SIX STATES?
Scores from the AA-AASs and the ACES Academic Skills scale shared a degree of variance, typically in the range representing related but distinct constructs. This relationship is somewhat stronger at the elementary school band and does not vary by eligibility group. This latter finding contrasts with results from Elliott et al. (2007), in which the correlations between scores from AA-AASs and the ACES Academic Skills scale were much larger for SWD-NEs than for SWD-Es. In the current study, the AA-AASs appear to reflect academic skills for SWD-Es to the same degree that they would reflect these skills if used with SWD-NEs.
Relationships between AA-AAS scores and ACES Academic Enablers scale scores vary greatly in strength, an indication that states' AA-AASs reflect enabling behaviors to different degrees. When disaggregated by eligibility group, grade, and state within subsamples from Idaho and Indiana, these constructs share relatively little variance, which indicates that they measure two constructs that are not highly related. The one exception to this rule is the Idaho fourth-grade SWD-E group, in which the two scores share substantial variance and may represent a common construct. In the Elliott et al. (2007) study, the scores of the SWD-NEs on the alternate assessment shared a great deal of variance with ACES Academic Enablers scale scores; however, both groups yielded correlations between these scores that were indicative of related constructs. It is likely that the IAA measures academic enablers to a greater degree than most alternate assessments included in the study, largely because its scoring rubric incorporates a progress level during the year along with performance. That is, the rubric implicitly emphasizes sustained engagement and persistence, which is akin to two academic enabling subscales, engagement and motivation.
Scores for AA-AASs in both reading and mathematics share a great deal of variance with adaptive behaviors across grades, states, and eligibility groups. These two types of measures appeared to reflect similar constructs across all states, with the exception of Nevada. Some educators would call these academic and workplace survival skills, or functional skills. Many students with presymbolic and concrete symbolic communication skills need to develop very basic reading and mathematics skills. Such skills are part of some objectives in the extended content standards with which AA-AASs are aligned. Elliott et al. (2007) found that correlations between academic skills, as measured by the IAA, and adaptive behaviors were in this same range, although in the corresponding analysis SWD-NEs were not evaluated with the VABS-II.
The final piece to this discussion on shared variance is the degree to which academic skills, academic enablers, and adaptive behavior are overlapping constructs, even when measured by established, norm-referenced tools. In the current sample of students with disabilities, the strongest relationship is between adaptive behavior and academic enablers, which appear to be highly related constructs. Adaptive behavior and academic skills were related but distinct constructs, as were academic skills and academic enablers. These relationships likely explain some of the shared variance between adaptive behavior and AA-AASs in this sample. Given the conceptual relationships among these constructs, it seems unlikely that any good measure of academic achievement in this population will be entirely independent of adaptive behavior. Scores from all three measures represent constructs that are distinct but related.
Historically, developing appropriate accountability standards for the unique population of students who are eligible for AA-AASs has been a challenging endeavor from several perspectives. Although the students are a very small proportion of the entire population, they represent an extraordinarily broad range of abilities and needs. Thus, teachers who work with them often must develop curricula and individualized supports to provide appropriate instruction in their classrooms. Although legislative efforts focus on accountability for student learning in the core academic subjects, many teachers opt to maintain a difficult balance between academic skills and nonacademic skills in their classroom instruction, in a good-faith effort to provide what they deem to be essential tools for these students to live successful lives outside school. For some teachers, the ideal alternate assessment is a test that reflects this balance; other stakeholders believe that the AA-AASs should measure only the academic skills included in the content standards. The incompatibility of the two views in the collaborative development of these tests may result in assessments that do not measure what they purport to measure with the same level of precision and focus that has come to be expected of large-scale assessments.
The results of this study indicate that alternate assessments often measure a number of constructs. The findings are important in that they begin to disaggregate the complexities in measuring academic achievement in the subpopulation of students with the most significant cognitive disabilities. Interwoven in these measures for students performing at the extreme extensions to the grade-level standards are features of academic readiness and functional skills. Although an appropriate educational program for this unique group of students arguably represents a number of constructs, they may not be reflected in the inferences that educators ultimately make from the reported scores.
Research on the academic achievement of students with significant disabilities is challenging. Researchers generally must deal with relatively small, but heterogeneous, samples of students. In the present study, teachers assessed these students with relatively new, evidence-based rating scales, and trained scorers scored the results. These assessments were designed to be aligned with state content standards and expected classroom instruction. Even when educators furnish excellent instruction, many students with significant disabilities and a limited number of years of exposure to academic skills still perform at the lowest end of the assessments.
In the current assessments, a disproportionate number of students were near the lower end of the score distribution for the ACES Academic Skills. This distribution resulted in a restriction of range that may have reduced correlations involving these scores.
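The attenuating effect of range restriction noted above can be illustrated with Thorndike's Case II correction, which estimates what an observed correlation would be if the restricted group's standard deviation were expanded to the unrestricted value. The inputs below are invented for illustration; the study did not report corrected coefficients.

```python
# Illustration of how restriction of range attenuates correlations
# (Thorndike's Case II formula). Values are hypothetical, not the study's.
import math

def correct_for_restriction(r_restricted, u):
    """Estimate the unrestricted correlation.

    r_restricted: correlation observed in the range-restricted sample.
    u: ratio of the unrestricted SD to the restricted SD (u > 1).
    """
    return (r_restricted * u) / math.sqrt(1 + r_restricted**2 * (u**2 - 1))

# A correlation of .50 observed in a sample whose SD is half the
# unrestricted SD (u = 2) corresponds to an unrestricted estimate of ~.76.
r_unrestricted = correct_for_restriction(0.50, 2.0)
```

Read in reverse, the formula shows why a correlation that would be ~.76 in an unrestricted sample can shrink to .50 when, as here, almost all scores sit at one end of the scale.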
A second challenge with samples from the population of students featured in this study is the impracticality of using direct assessments. Thus, to collect information on concurrent measures of related constructs, we had to use a number of indirect measures (i.e., teacher rating scales). The ACES and VABS-II are psychometrically sound measures of academic competence and adaptive behavior, respectively, but both are completed by the same teacher who furnishes ratings on the AA-AAS. This limitation may have increased similarities among the various ratings on different instruments and therefore inflated correlations across constructs.
Because the current study only assessed participants with AA-AASs and general assessments from their own states, the researchers applied the MTMM methodology to the sample from each state rather than to a large sample from several states. A logical extension of this methodology would be to assess students from multiple states with AA-AASs from each, as well as assessing SWD-NEs with general assessments and AA-AASs from each state, so that researchers could use an MTMM framework to draw conclusions about trait, method, state, and group factors. Such a study could address the issue of whether AA-AASs are more similar to general assessments from the same state than to general assessments from other states, as well as whether AA-AASs and general assessments from one state measure constructs similar to AA-AASs and general assessments from another. The research would also enhance evidence in the current study about the relationship between AA-AASs and known measures, as well as the differences among these relationships for SWD-NEs versus SWD-Es.
The current data are the result of an MTMM study that incorporated two eligibility groups, two grade bands, six states, and four criterion measures. The following five main trends are apparent:
1. Reading and mathematics scores from AA-AASs may reflect a unitary construct.
2. When these measures are used with SWD-NEs, the scores only moderately relate to scores from states' general achievement tests.
3. The AA-AAS scores reflect a construct that is related to, but also distinct from, academic skills.
4. The AA-AAS scores reflect a construct that is highly related to adaptive behavior.
5. Even when measured using established tools, academic skills, academic enablers, and adaptive behavior are related constructs.
These results collectively indicate that the constructs measured by AA-AASs share common ground with such related constructs as adaptive behavior, academic skills, and academic enablers. It is a positive finding that none of these relationships are so strong that the AA-AASs would be interpreted as measures of any of these three constructs rather than as measures of academic achievement for students with the most significant cognitive disabilities. However, the results of this study indicate that educators must make continued efforts to ensure that the constructs measured by AA-AASs are clear and distinct from other constructs, particularly adaptive behavior.
American Educational Research Association (AERA), American Psychological Association (APA), & National Council on Measurement in Education (NCME). (1999). Standards for educational and psychological testing. Washington, DC: Author.
Arizona Department of Education, & Elliott, S. N. (2006). Arizona instrument to measure standards--Alternate: Technical report. Retrieved from http://www.azed.gov/ess/SpecialProjects/aims-a/
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validity by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
DiPerna, J. C., & Elliott, S. N. (2000). Academic competence evaluation scales. San Antonio, TX: Psychological Corporation.
Elementary and Secondary Education Act, 20 U.S.C. 6301 et seq. (1965).
Elliott, S. N., Compton, E., & Roach, A. T. (2007). Building validity evidence for scores on a state-wide alternate assessment: A contrasting groups, multi-method approach. Educational Measurement: Issues & Practice, 26(2), 30-43.
Gresham, F. M., & Elliott, S. N. (1990). Social skills rating system. Circle Pines, MN: American Guidance Service.
Hawaii Department of Education. (2007). Frequently asked questions about the HSAA and alternate assessments. Retrieved from http://doe.k12.hi.us/nclb/
Hopkins, W. G. (2001). A scale of magnitudes for effect statistics. A New View of Statistics. Retrieved from http://www.sportsci.org/resource/stats/effectmag.html
Idaho Department of Education. (1999). Idaho alternate assessment. Boise, ID: Author.
Idaho Department of Education. (2005). Idaho alternate assessments. Retrieved from the Idaho Department of Education web site: http://www.sde.state.id.us/SpecialEd/AltAssessment/iaamanual.pdf
Idaho Department of Education. (2008). Idaho standards achievement tests. Idaho State Board of Education. Retrieved from http://www.boardofed.idaho.gov/saa/Technical-Reports.asp
Indiana Department of Education. (2009a). Indiana standards tool for alternate reporting. Retrieved from https://ican.doe.state.in.us/beta/istarinfo.htm
Indiana Department of Education. (2009b). ISTEP+ program manual. Retrieved from http://doe.state.in.us/istep/ProgramManual.html
Individuals With Disabilities Education Act, 20 U.S.C. § 1400 et seq. (1997).
Individuals With Disabilities Education Improvement Act of 2004, Pub. L. No. 108-446, 118 Stat. 2647 (2004).
Mississippi Department of Education. (2009). Technical manual for the Mississippi alternate assessment of the extended curriculum frameworks for students with significant cognitive disabilities. Retrieved from http://www.mde.k12.ms.us/maaecf/
Nevada Department of Education. (2009). Nevada alternate scales of academic achievement. Retrieved from http://nde.doe.nv.gov/Assessment_NASAA.htm
No Child Left Behind Act, 20 U.S.C. § 6301 et seq. (2001).
Roach, A. T. (2003, November). Alignment of Idaho academic standards with the Idaho alternate assessment. Boise, ID: Idaho Department of Education.
Sparrow, S. S., Balla, D. A., & Cicchetti, D. V. (1985). Vineland Adaptive Behavior Scales: Classroom edition. Circle Pines, MN: American Guidance Service.
Sparrow, S. S., Cicchetti, D. V., & Balla, D. A. (2006). Vineland-II teacher rating form manual. Circle Pines, MN: Pearson Assessment.
SRI International. (2009). National study of alternate assessments. Retrieved from http://policyweb.sri.com/cehs/projects/displayProject.jsp?Nick=nsaa
Towles-Reeves, E., Kleinert, H., & Muhomba, M. (2009). Alternate assessment: Have we learned anything new? Exceptional Children, 75, 233-252.
University of Iowa College of Education. (2009). Iowa tests of basic skills. Retrieved from http://www.education.uiowa.edu/itp/itbs/
U.S. Department of Education. (2005). Alternate academic achievement standards for students with the most significant cognitive disabilities. Retrieved from http://www.ed.gov/policy/elsec/guid/altguidance.pdf
U.S. Department of Education. (2009, January). Standards and assessment peer review guidance: Information and examples for meeting requirements of the No Child Left Behind Act of 2001. Retrieved from http://www.ed.gov/policy/elsec/guid/saaprguidance.pdf
RYAN J. KETTLER (CEC TN Federation), Research Assistant Professor; STEPHEN N. ELLIOTT (CEC TN Federation), Professor and Dunn Family Chair; and PETER A. BEDDOW (CEC TN Federation), Doctoral Candidate, Department of Special Education, Peabody College of Vanderbilt University, Nashville, Tennessee. ELIZABETH COMPTON (CEC ID Federation), Principal, Sweet Elementary School, Emmett, Idaho. DAWN MCGRATH (CEC IN Federation), Acting Coordinator for Special Education, Indiana Department of Education, Indianapolis. KRISTOPHER J. KAASE, Deputy Superintendent for Instructional Programs and Services, Mississippi Department of Education, Jackson. CHARLES BRUEN, Director of Data Analysis, Arizona Department of Education, Phoenix. LISA FORD (CEC NV Federation), Special Education Consultant, Nevada Department of Education, Carson City. KENT HINTON, Director of Assessment, Hawaii Department of Education, Honolulu.
Correspondence concerning this article should be addressed to Ryan J. Kettler, 410E Wyatt Center, Peabody #59, 230 Appleton Pl., Nashville, TN 37203-5721, (615) 343-5702 (e-mail: r.j.kettler@ vanderbilt.edu).
The current study was implemented as part of the Consortium for Alternate Assessment Validity and Experimental Studies (CAAVES) project, funded by the U.S. Department of Education (awarded to the Idaho Department of Education; #S368A0600012). The positions and opinions expressed in this article are solely those of the author team.
Manuscript received February 2009; accepted July 2009.
TABLE 1
Demographics by Grade Band and Group

                           Elementary School, n (%)     Middle School, n (%)
Demographic                SWD-Es      SWD-NEs          SWD-Es      SWD-NEs
Gender (a)
  Female                   79 (41)     73 (35)          66 (40)     56 (38)
  Male                     114 (59)    136 (65)         101 (60)    93 (62)
Ethnicity
  African American         16 (8)      29 (14)          12 (7)      26 (17)
  Asian American           13 (7)      5 (2)            8 (5)       1 (1)
  European American        99 (51)     141 (67)         95 (57)     98 (66)
  Latino American          42 (22)     24 (11)          36 (21)     13 (9)
  Native American          1 (1)       1 (0)            6 (4)       3 (2)
  Native Hawaiian          9 (5)       2 (1)            6 (4)       1 (1)
  Other/not identified     13 (7)      7 (3)            5 (3)       7 (5)
State
  Arizona                  46 (24)     23 (11)          49 (29)     13 (9)
  Hawaii                   26 (13)     6 (3)            12 (7)      2 (1)
  Idaho                    20 (10)     21 (10)          20 (12)     20 (13)
  Indiana                  42 (22)     122 (58)         52 (31)     92 (62)
  Mississippi              16 (8)      11 (5)           11 (7)      11 (7)
  Nevada                   43 (22)     26 (12)          24 (14)     11 (7)
Total                      193 (48)    209 (52)         168 (53)    149 (47)

Note: SWD-Es = students with disabilities who were eligible; SWD-NEs = students with disabilities who were not eligible.
(a) Gender was not reported for one SWD-E.
TABLE 2
Disability Status Frequencies by Grade Band and Group

                                   Elementary School, n (%)     Middle School, n (%)
Disability Category                SWD-Es      SWD-NEs          SWD-Es      SWD-NEs
Autism                             42 (22)     15 (7)           26 (15)     10 (7)
Deaf-blindness                     4 (2)       0 (0)            2 (1)       0 (0)
Deafness or hearing impairment     2 (1)       1 (0)            2 (1)       0 (0)
Developmental delay                1 (1)       1 (0)            0 (0)       0 (0)
Emotional disturbance              1 (1)       12 (6)           1 (1)       19 (13)
Mental retardation                 91 (47)     23 (11)          97 (58)     31 (21)
Multiple disabilities              34 (18)     2 (1)            22 (13)     0 (0)
Orthopedic impairment              1 (1)       1 (0)            2 (1)       0 (0)
Other health impairment            9 (5)       17 (8)           7 (4)       12 (8)
Specific learning disability       2 (1)       115 (55)         4 (2)       71 (48)
Speech or language impairment      0 (0)       18 (9)           1 (1)       4 (3)
Traumatic brain injury             0 (0)       1 (0)            1 (1)       0 (0)
Visual impairment                  2 (1)       1 (0)            1 (1)       1 (1)
Not known                          4 (2)       2 (1)            2 (1)       1 (1)

Note: SWD-Es = students with disabilities who were eligible; SWD-NEs = students with disabilities who were not eligible.
TABLE 3
Descriptive Statistics for Academic Skills, Academic Behavior, and Adaptive Behavior by Grade Band and Group

                         Elementary School, Mean (SD)       Middle School, Mean (SD)
Scale/Subscale           SWD-Es          SWD-NEs            SWD-Es          SWD-NEs
Academic Skills          38.81 (9.40)    64.86 (24.33)      40.16 (15.74)   63.36 (27.62)
  Reading                13.32 (3.83)    21.91 (7.72)       13.38 (12.12)   20.98 (13.78)
  Mathematics            9.37 (2.50)     15.96 (6.46)       9.35 (11.75)    17.49 (13.28)
  Critical thinking      16.18 (3.83)    27.24 (9.71)       17.03 (12.40)   28.25 (15.02)
Academic Enablers        93.21 (33.25)   129.00 (41.11)     105.92 (38.77)  123.28 (31.78)
  Interpersonal skills   32.19 (9.88)    39.16 (9.25)       34.74 (15.66)   38.15 (14.57)
  Engagement             17.90 (8.74)    27.30 (7.52)       19.79 (14.02)   24.65 (13.01)
  Motivation             21.70 (8.35)    30.86 (10.38)      24.18 (14.43)   29.52 (14.83)
  Study skills           23.41 (9.56)    34.35 (12.24)      27.85 (15.55)   32.11 (15.30)
Adaptive Behavior        60.44 (17.66)   93.12 (13.04)      64.03 (20.46)   86.65 (16.00)
  Communication          61.03 (17.07)   93.38 (12.61)      63.97 (19.97)   89.03 (15.32)
  Daily living skills    60.86 (18.98)   94.89 (12.49)      65.29 (21.05)   86.58 (15.66)
  Socialization          69.41 (13.06)   92.83 (15.96)      71.37 (18.84)   87.41 (18.50)
  Motor skills           71.73 (24.59)   115.17 (17.56)     81.47 (30.76)   114.90 (19.44)

Note: SWD-Es = students with disabilities who were eligible; SWD-NEs = students with disabilities who were not eligible.
TABLE 4
Correlations Between AA-AAS Subscales by Grade Band for Both SWD-Es and SWD-NEs

State         Comparison              Elementary School   Middle School
Arizona       Reading/mathematics     .97 *               .93 *
Hawaii        Reading/mathematics     .92                 .87
Idaho         Reading/mathematics     .94 *               .83 *
              Language/mathematics    .93 *               .89 *
              Reading/language        .97 *               .83 *
Indiana       Language/mathematics    .61 *               .93 *
Mississippi   Reading/mathematics     .87 *               .95 *
Nevada        Reading/mathematics     .43                 .34

Note: AA-AAS = alternate assessment of alternate academic achievement standards; SWD-Es = students with disabilities who were eligible; SWD-NEs = students with disabilities who were not eligible.
* p < .05 (one-tailed).

TABLE 5
Correlations of AA-AAS and General Assessment Scores for SWD-NEs

                                        General Assessment
Grade Band/State    AA-AAS Subscale     Reading   Language Arts   Mathematics
Elementary School
  Idaho             Reading             .31       .31             .12
                    Language arts       .24       .22             .07
                    Mathematics         .31       .40             .26
  Indiana           Language arts       --        .13             .19
                    Mathematics         --        .13             .18
Middle School
  Idaho             Reading             .37       .43             .07
                    Language arts       .33       .43             .07
                    Mathematics         .40       .68 *           .42
  Indiana           Language arts       --        .30 *           .37 *
                    Mathematics         --        .28 *           .48 *

Note: AA-AAS = alternate assessment of alternate academic achievement standards; SWD-NEs = students with disabilities who were not eligible.
* p < .05 (one-tailed).
TABLE 6
Correlations of AA-AAS Subscales With Established Measures by Grade Band for Both SWD-Es and SWD-NEs

Elementary School
State/Subscale       Academic Skills   Academic Enablers   Adaptive Behavior
Arizona
  Reading 1          .63 *             .66 *               .86 *
  Reading 2          --                --                  --
  Mathematics        .69 *             .69 *               .86 *
Hawaii
  Language arts      .61 *             .86 *               .78 *
  Mathematics        .61 *             .78 *               .87 *
Idaho
  Reading            .68 *             .72 *               .86 *
  Language arts      .61 *             .72 *               .84 *
  Mathematics        .74 *             .73 *               .90 *
Indiana
  Language arts      .29 *             .31 *               .47 *
  Mathematics        .52 *             .48 *               .75 *
Mississippi
  Language arts      .57 *             .32                 .57 *
  Mathematics        .66 *             .17                 .57 *
Nevada
  Language arts      -.07              .18                 .17
  Mathematics        .06               .33 *               .47 *

Middle School
State/Subscale       Academic Skills   Academic Enablers   Adaptive Behavior
Arizona
  Reading 1          .18               .80 *               .88 *
  Reading 2          .82 *             .18                 .70 *
  Mathematics        .82 *             .21                 .67 *
Hawaii
  Language arts      .55               .65 *               .85 *
  Mathematics        .47               .58                 .83 *
Idaho
  Reading            .67 *             .36 *               .59 *
  Language arts      .72 *             .52 *               .72 *
  Mathematics        .79 *             .42 *               .70 *
Indiana
  Language arts      .46 *             .45 *               .72 *
  Mathematics        .45 *             .41 *               .65 *
Mississippi
  Language arts      .42               .40                 .53 *
  Mathematics        .59               .61                 .58
Nevada
  Language arts      --                --                  --
  Mathematics        --                --                  --

Note: Correlations were not reported when the sample size was less than 10. AA-AAS = alternate assessment of alternate academic achievement standards; SWD-Es = students with disabilities who were eligible; SWD-NEs = students with disabilities who were not eligible.
* p < .05 (one-tailed).
Exceptional Children, June 22, 2010.