Integrating frequency-based mathematics instruction with a multi-level assessment system to enhance response to intervention frameworks.
Responsiveness to Intervention (RtI) refers to a recent innovation in education utilizing a multitiered service delivery model with two overlapping functions: first, to identify students who are struggling in the classroom and remediate academic deficits, and second, to distinguish between students who are behind due to a history of poor instructional experiences and those in need of special education services for remediation of an actual learning disability (Jenkins, Hudson, & Johnson, 2007). RtI promotes a new focus on teaching and learning, emphasizing how responsive students are to instruction. The term as originally coined, "Responsiveness," places the agency or label of special education on the teaching methodologies and measures student responsiveness to those procedures.
RtI was derived from the provisions outlined in the Individuals with Disabilities Education Improvement Act of 2004 (IDEA, 2004), which states that "in determining whether a child has a specific learning disability, a Local Education Agency may use a process that determines if the child responds to scientific, research-based intervention as part of the evaluation process" [Section 614 (b)(6)(B)]. As such, RtI offers an alternative to the traditional practice of diagnosing learning disabilities based on a pronounced dual discrepancy between intellectual capacity (as determined by intelligence tests) and academic proficiency in various subjects (as determined by achievement tests). RtI is not mandated, but IDEA 2004 now prohibits states from requiring this discrepancy model.
In many ways, RtI constitutes a profound paradigm shift in the way that students with educational problems are perceived and taught in the classroom. According to the traditional approach, if a significant dual discrepancy is observed between intelligence test scores and achievement scores, the problem is generally considered to exist within the student. The student is then labeled with a learning disability and committed to the special education system. If a significant discrepancy is not observed, the student returns to the general education classroom. Due to strict qualification guidelines related to the current provision of special education services, funding to provide additional support to students who are only marginally failing is not generally available. Yet it is clear that without an effective intervention, the deficits are only likely to increase. For this reason, the dual discrepancy model is often referred to as the "wait-to-fail" model and has come under increasingly widespread criticism as an ineffective and inadequate framework for special education (Francis et al., 2005). In contrast to the dual discrepancy approach, the RtI framework emphasizes identifying and supporting all students with pronounced academic deficits. This change in perspective on how to provide services has even led to a new term, "the enabled learner" (Tilly, 2006), and is challenging school psychologists to move from the use of traditional psychometric tests (i.e., intelligence and achievement tests) to an "edumetric" problem solving model focused on measuring changes in individual performance over time (Canter, 2006). In summary, "RtI is a set of scientifically research-validated practices that are deployed in schools using the scientific method as a decision-making framework" (Tilly, 2006, p. 22).
When a school-wide approach is adopted, the RtI framework most commonly utilizes what is referred to as the Standard Protocol Model (Shores & Chester, 2009). The model was based on the research in curriculum-based measurement (CBM) of reading skills conducted by Deno and Mirkin (Deno, 1985, 2003; Deno & Mirkin, 1977). CBM grew out of the need for educators to access more frequent performance data in the academic foundation skills of reading, spelling, writing, and mathematics (Deno, 1985; Shinn, 1989). Teachers can use these criterion-referenced assessments to compare student progress to a grade level standard as well as to analyze individual growth compared to previous performance. The Standard Protocol Model shown in Figure 1 is typically conceptualized as a pyramid or triangle with three tiers of intervention.
[FIGURE 1 OMITTED]
In Tier 1, risk status may be established with the use of universal screening measures using benchmark scores, standardized achievement test results, or median scores from several progress monitoring measures (Stecker, 2007). If an academic deficit is observed, the problem is initially assumed to reside within the instructional environment rather than within the student. For instance, if most students demonstrate poor performance, then the teacher may need additional training. If the data indicate that a small percentage of students are not responding to a high-quality evidence-based core education program, then smaller group and more time-intensive intervention is provided in Tier 2. At this level, progress is generally monitored more frequently. Students who do not demonstrate satisfactory progress in Tier 2 commensurate with peers become candidates for Tier 3 intervention, where even more time-intensive interventions are employed with even smaller group sizes (Vaughn & Linan-Thompson, 2003). Services can be provided by general education teachers or special education teachers. Students are ultimately identified as eligible for special education services when their response to effective instruction is significantly inferior to that of peers (Vaughn & Fuchs, 2003). More specifically, students are classified with a learning disability if their rate of growth based on progress monitoring data and their level of performance are more than 1 standard deviation below the mean level and slope of their classmates (Ardoin, Witt, Connell, & Koenig, 2005). Through this process it is possible to differentiate between students with a true learning disability and those that are under-achieving due to a history of poor instructional practices (Vaughn & Fuchs, 2003).
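The level-and-slope criterion described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used by Ardoin et al. (2005): the function name, the use of the sample standard deviation, and the choice to compute class statistics from whatever list of classmates is passed in are all assumptions made for the example.

```python
from statistics import mean, stdev


def is_dual_discrepant(student_level, student_slope,
                       class_levels, class_slopes, cutoff_sd=1.0):
    """Return True when BOTH the student's performance level and growth
    slope fall more than `cutoff_sd` standard deviations below the
    classmates' means (hypothetical helper; sample SD is an assumption)."""
    level_cut = mean(class_levels) - cutoff_sd * stdev(class_levels)
    slope_cut = mean(class_slopes) - cutoff_sd * stdev(class_slopes)
    return student_level < level_cut and student_slope < slope_cut


# Hypothetical classroom data: digits correct per probe and weekly growth.
class_levels = [40, 45, 50, 55, 60]
class_slopes = [1.0, 1.5, 2.0, 2.5, 3.0]

low_on_both = is_dual_discrepant(30, 0.5, class_levels, class_slopes)
low_level_only = is_dual_discrepant(30, 2.0, class_levels, class_slopes)
```

Requiring both conditions is what separates a student who is behind but growing at the class rate from one who is behind and also failing to respond to instruction.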
According to the Response to Intervention Adoption Survey of 2009, the RtI framework is becoming increasingly popular in mainstream education as an alternative to the dual discrepancy model. The survey, conducted with the support of the American Association of School Administrators, the Council of Administrators of Special Education, and the National Association of State Directors of Special Education, showed that 71% of respondents reported either piloting RtI, implementing RtI, or being in the process of a district-wide RtI implementation. These results compare with 60% in 2008 and 44% in 2007 (Pascopella, 2010). Yet the way in which these schools are implementing RtI remains unclear and likely varies widely.
RtI seems to hold great promise as an improved framework that by design appeases two legally mandated protocols within the American educational reform movement today. Those are, firstly, the identification and remediation of students with learning deficits and secondly, the district's reporting of the adequate yearly progress of its students with an emphasis on creating data systems that inform administrators and educators as to the progress of all students. However, RtI still lacks strong empirical support--especially in relation to (1) interventions that are yoked to the use of "technically sound instruments" as required by federal law (Kame'enui, 2007); (2) the procedures and criteria that should be followed to move students between tiers (Stecker, 2007); as well as (3) the optimal frequency of progress monitoring assessments used within different tiers (for discussion see Fletcher, 2006; Fuchs & Fuchs, 2006). Until the instructional practices, decision-making processes, and the consequences of various assessment schedules inherent within RtI frameworks are fully developed and empirically established, the quality of implementations and the associated outcomes will presumably continue to vary widely.
Therefore, the purpose of the present investigation was to broaden current research on RtI frameworks by analyzing the application of a multi-level system of assessment that uses Precision Teaching, a frequency building (1) instructional intervention designed to methodically improve student progress. We decided to focus our analysis on mathematics outcomes because the majority of RtI research studies to date have examined processes related to reading achievement. Much less research has been devoted to math applications, yet the prevalence of students identified as having a learning disability in math is similar to the incidence of those identified as having reading disabilities (Gross-Tsur, Manor, & Shalev, 1996).
Using Frequency-Based Performance as a Screening Measure in Mathematics
It is estimated that between 5% and 8% of all school-age children have a math disability (MD) as determined by the dual discrepancy model (Geary, 2003). A math disability typically manifests as problems in simple arithmetic such as number sense, number and operations, and word problem solving (D. P. Bryant, Bryant, & Hammill, 2000; Fuchs et al., 2004). These basic performance deficits can lead to long-term academic problems (Geary, 2003) and to the development of disturbing behavior (Pisecco, Wristers, Swank, Silva, & Baker, 2001).
Research suggests that the absence of fluency in basic math skills limits the ability to solve more complex problems and understand more advanced concepts (Geary, 2003; Gersten, Jordan, & Flojo, 2005). As such, the fluency metric can be used to uniquely discriminate between expert and novice performance. To illustrate this point, Fleischner, Garnett, and Shepherd (1982) compared the math fact computation skills of primary school students identified as having learning disabilities with those of average students and found that performance was essentially indistinguishable based on the measure of percentage correct. On timed assessments, however, students with learning disabilities completed only one-third as many math fact problems as their non-identified peers. On the basis of this research and similar findings highlighting the ability of frequency-based measures to uniquely distinguish between advanced and at-risk students, most universal screening and progress monitoring measures used within RtI frameworks are rate-based measures.
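A small worked example may help show why percentage correct can hide a difference that a rate measure exposes. The numbers below are invented for illustration and are not taken from Fleischner et al. (1982); the function names are likewise hypothetical.

```python
def percent_correct(correct, attempted):
    """Accuracy: share of attempted problems answered correctly."""
    return 100.0 * correct / attempted


def correct_per_minute(correct, seconds):
    """Frequency: correct responses scaled to a one-minute count."""
    return correct * 60.0 / seconds


# Two hypothetical students on the same 60-second math fact timing.
# Student A attempts 20 problems; Student B attempts 60.
accuracy_a = percent_correct(19, 20)
accuracy_b = percent_correct(57, 60)
rate_a = correct_per_minute(19, 60)
rate_b = correct_per_minute(57, 60)

# Ratio of the slower student's rate to the faster student's rate.
rate_ratio = rate_a / rate_b
```

Both hypothetical students are 95% accurate, yet Student A works at one-third of Student B's rate, which is the kind of gap that only a timed, frequency-based measure reveals.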
The Power of Frequency-Based Instruction: Precision Teaching
While frequency-based assessment can uniquely identify struggling students, frequency-based instruction can be used to drive learning outcomes. Fluent performance is defined as true mastery and is demonstrated when an individual can perform a task smoothly, accurately, and without hesitation (Binder, 1990). In the early stages of B.F. Skinner's study of human behavior, he identified continuous measurement and rate of performance as key metrics with which to study human performance (Skinner, 1953). In the 1960s, Dr. Ogden Lindsley created the Precision Teaching (PT) methodology and its visual graphic representational tool, the Standard Celeration Chart (illustrated in Figure 2). Precision Teaching adheres to Skinner's early laboratory findings highlighting the importance of rate as a critical measure of human performance and applies these findings in the educational arena (Lindsley, 1972).
[FIGURE 2 OMITTED]
"Precision Teaching is adjusting the curricula for each learner to maximize learning shown on the learner's personal standard celeration chart. The instruction can be by any method or approach" (Lindsley, 1991, p. 259). The system of PT therefore allows educators the freedom to present any instruction, specify objectives, state pinpoints, and analyze individual performance with a rate-based measure (Binder & Watkins, 1990; Johnson & Layng, 1992, 1994; Johnson & Street, 2004). Skill acquisition is observed and compared on an individual basis; students need not be compared with one another. Mastery is then defined by both accurate and fluent performance. As such, PT can be further used to evaluate the effectiveness of a particular instructional method or approach.
Within a PT protocol, teachers measure student performance in small increments of time, such as one minute. Instructional concepts are broken down into component curricular pieces that combine into composite skills. For instance, students may practice building frequency in the component skills of number writing and skip counting before learning the composite skill of basic multiplication. Students practice these component pieces of instruction to build high rates of correct responding and to reduce rates of incorrect responding. Correct and incorrect performance rates are tracked simultaneously on the individual student's Standard Celeration Chart. Consequently, "the chart" provides both snapshots of academic skills at one moment in time and a record of learning over long spans of time. While it is beyond the scope of this paper to describe the features of PT in detail, further description is provided by Johnson and Layng (1992), Binder (1988), Lindsley (1972), and Pennypacker, Koenig, and Lindsley (1972).
Many successful applications of PT to accelerate academic outcomes have been reported in special education and general education settings. Some of the earliest research was conducted by Eric Haughton, an early PT pioneer. In the area of math education, Haughton (1972, 1980) showed that a program of building tool skills such as math facts and number writing to a fluent rate improved underachieving students' math performance to the level of their competent peers. Beck and Clement (1991) extended these findings by demonstrating how frequency-based academic interventions can be successfully implemented in general education settings. In their study, referred to as the Great Falls Precision Teaching Project, public school teachers conducted daily frequency building sessions with their students for 20 to 30 minutes across a range of basic skills. Students (1) completed two 1-minute timings in two different academic skills, (2) recorded their best timed performance, and (3) monitored progress relative to the mastery-based rate criteria needed to advance through increasingly complex curriculum objectives. Within three years, students in the school district improved between 19 and 44 percentile points on the reading, writing, and math subtests of the Iowa Test of Basic Skills.
As a final example, for the past three decades, Morningside Academy, a private school and professional development provider, has employed PT methodologies in its Morningside Model of Generative Instruction (Johnson & Layng, 1992, 1994; Johnson & Street, 2004). With procedures that focus on building component skills to a fluent rate, this approach typically allows children and youth to gain two grade levels per school year. As is common in the initial assessments used within RtI frameworks, students are given precision placement tests to determine if a skill is (1) at an instructional level, (2) accurate but not fluent, or (3) accurate and fluent. With rigorous instruction based upon sound instructional design principles, students move from acquiring skills to achieving fluent levels of performance by setting and reaching daily goals targeted on the Standard Celeration Chart. Mathematics instruction employs practice in a range of component skills (e.g., digit writing and basic math fact computation) and composite skills (e.g., multi-digit multiplication computation and word problem solving).
Previous RtI Application Studies in Math
Several studies have examined RtI models developed and implemented for research purposes or otherwise described an RtI model already in place to develop student math skills. For instance, Fuchs et al. (2004) analyzed the effect of a 16-week Tier 2 math problem solving intervention for third grade students (n=301). The TerraNova Achievement Test was used as the universal screening measure to determine at-risk status for a math disability (MD), reading disability (RD) or both (MRD). Progress was monitored using rate-based probes included in the Monitoring Basic Skills Progress Assessment (MBSP) Math Computation and Math Concepts and Applications tests (Fuchs, Hamlett, & Fuchs, 1998). A complete description of these measures is provided in the methods section. The intervention consisted of problem solving instruction and practice. Controlled comparisons were made to students identified as not at-risk that either received or did not receive the treatment. Differential levels of responsiveness were observed as a function of risk status, whereby students not at-risk demonstrated the most improvement.
Bryant et al. (2008) examined the effects of a Tier 2 mathematics intervention with 1st grade students (n=42). The Texas Early Mathematics Inventories: Progress Monitoring (TEMI-PM) test scores were used as the universal screening measure in Tier 1. Students who scored below the 25th percentile were selected for Tier 2 intervention. The 20-minute intervention was delivered four days per week for 23 weeks. Similar to the Fuchs et al. (2004) study, instruction provided in the Tier 2 intervention was not differentiated to address individual academic deficits such as problems with basic number writing, subtraction, division, or answering math facts. Instead, instruction for all students emphasized basic number skills (e.g., counting, sequencing, and comparing numbers), place value, and addition/subtraction combinations. Results showed that posttest scores for the at-risk students were significantly higher than expected based on pretest scores, which resulted in a main effect for the intervention.
Ardoin and colleagues (2005) implemented a three-phase RtI model with 4th grade students in two classes. Universal screening measures used in Tier 1 consisted of curriculum-based measurement (CBM) probes in addition, subtraction, and multiplication, which were developed using the website www.interventioncentral.com. Students were instructed to answer as many problems as possible within 2 minutes on each probe. The probes were used as the dependent measures to monitor progress across each tier of intervention. The screening data indicated that class-wide deficits existed in certain types of subtraction problems (4-digit by 4-digit with regrouping); thus a class-wide intervention targeting related skills was implemented. In order to tailor instruction to instructional need, two more CBM probes were administered that broke the deficit composite problem down into some of its component parts (2-digit by 2-digit subtraction without regrouping and 2-digit by 2-digit subtraction with regrouping). Results indicated that intervention should begin with instruction in 2-digit by 2-digit subtraction without regrouping. As an interesting addition to this study, the authors controlled for motivation effects. This was done by setting daily "goals" for the students to exceed their previous scores in repeated practice. Upon meeting their daily improvement goal, students were allowed to select from a variety of rewards (e.g., note pads, small toys). If, as a result of the implementation of this motivation system, a student's second score exceeded baseline performance by at least 20% (Noell, Freeland, Witt, & Gansle, 2001) and met instructional criteria based upon the 4th grade rates of fluency suggested by Shapiro (1996) (i.e., 40-88 digits correct in 2 minutes), then that student was considered to have a motivational deficit ("won't do") rather than a skill deficit ("can't do").
For those students whose scores did not change within the motivation system, a Tier 2 class-wide intervention was implemented. In Tier 3, more intensive instruction was provided to the students (n=5) who did not respond adequately to the class-wide intervention. Results revealed that only one student did not respond to Tier 3 intervention.
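The "can't do" versus "won't do" decision described for the Ardoin et al. (2005) study can be expressed as a simple rule. The sketch below is an interpretation for illustration only: the function name is hypothetical, and treating the lower bound of the 40-88 range as the instructional criterion is an assumption, since the passage above does not spell out how the range was applied.

```python
def classify_deficit(baseline, incentive_score,
                     criterion_low=40, gain_percent=20):
    """Classify a student from two 2-minute probe scores (digits correct).

    "won't do": the score under the reward condition is at least
    `gain_percent` above baseline AND reaches the instructional criterion
    (assumed here to be the lower bound of the 40-88 range).
    "can't do": otherwise, suggesting a skill deficit.
    """
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    improved = incentive_score * 100 >= baseline * (100 + gain_percent)
    meets_criterion = incentive_score >= criterion_low
    return "won't do" if improved and meets_criterion else "can't do"


# Hypothetical students.
motivated = classify_deficit(baseline=30, incentive_score=45)
unchanged = classify_deficit(baseline=30, incentive_score=32)
improved_but_low = classify_deficit(baseline=20, incentive_score=30)
```

The third case shows why both conditions matter: a large relative gain that still falls short of the instructional range would be treated as a skill deficit under this reading.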
How the Present Study Extends the Literature
Although each of the studies described above utilized a frequency-based assessment system to identify struggling learners and to monitor progress, none provided differentiated instruction to address the academic deficits of individual students or utilized mastery-based criteria to facilitate differentiated Tier 2 instruction. The purpose of the present investigation is to illustrate how a multi-leveled system of assessment combined with PT interventions in mathematics can be incorporated into an RtI framework. The goal of this process is to better target individual student weaknesses and thereby further accelerate learning outcomes for all students.
Framework of the Multi-Level Assessment System
The hallmark feature of the multi-level assessment system is to align daily frequency building practice on deficit component skills with weekly standardized fluency probes on composite skills in order to monitor and ensure progress on annual normative assessments. To this end, three levels of assessment intensity are prescribed: Macro, Meta, and Micro. Figure 3 is designed to help teachers and administrators understand the relative importance of each assessment level. These levels of assessment are appropriate for any tier of RtI implementations.
[FIGURE 3 OMITTED]
Macro level assessment. The "Macro" level includes the utilization of annual standardized assessments that are most often norm-referenced, but teachers may also use criterion-referenced assessments. The specific types of tests employed are most likely determined by the adopted local, state or national requirement. As a result of the No Child Left Behind legislation, states are mandated to report scores to the federal government in 4th grade, 8th grade and high school. Consequently, most states have enacted a state specific assessment program with which to report the adequate yearly progress (AYP) of their students. In addition, some school districts administer supplemental tests more regularly in an attempt to monitor student progress annually. Examples of widely accepted annual normative assessments are the Iowa Test of Basic Skills (ITBS), the Woodcock-Johnson Tests of Achievement-III (WJ-III), the Stanford Achievement Test (SAT) and the Wechsler Individual Achievement Test-II (WIAT-II). Teachers may not need to administer assessments themselves if annual assessment scores required by the school district are available. At the minimum, parallel alternate forms of one assessment protocol should be utilized consistently year to year in order to better analyze student progress relative to previous performance on the same assessment.
The Macro level scores serve two main purposes. First, the aforementioned analysis of progress on repeated administrations of the same measure illustrates where student performance lies in comparison to same-age peers as well as relative to previous performance. The second major use of Macro level scores is placement in curricular sequences. For norm-referenced assessments, the grade equivalent scores are used to establish a present level of performance for precision placement in state adopted grade-leveled curriculum.
Meta level assessment. The "Meta" level of assessment is defined by the use of weekly Curriculum-Based Measurement (CBM) tools that closely align to the curricular content being taught daily in the classroom. In mathematics, these assessments may include the measures reviewed in the previous RtI application studies. In addition to progress monitoring, CBM measures can be used to set annual goals because the procedures allow for three types of referencing: curriculum, normed, and individual (Malmquist, 2004). Curriculum referencing occurs because the materials are intended to be representative of the classroom's curriculum. School administrators can thereby compare the scores of their own students within and across the district in order to establish local norms for comparison. Finally, the student's own scores can be tracked and analyzed for progress and need for intervention. Since the CBM measures are closely aligned with the district's curriculum, teachers can continually analyze scores to see if each student is placed in appropriate grade level material or if a student is ready to advance to the next curriculum level.
The Meta level assessment data can be used to make various intervention decisions. For instance, after reviewing the data, teachers may decide to provide additional opportunities for students to engage in frequency building practice on composite skill applications that the student can perform accurately, but not fluently. Conversely, the teacher may decide to break down composite skills that the student does not perform accurately into their component parts and provide frequency building practice opportunities in those component parts. Teachers can then analyze whether the daily practice sessions are having an effect on the student's ability to perform the tasks on the weekly probes.
Micro level assessment. The "Micro" level of assessment is the most sensitive measurement level. It reflects the frequency building work students engage in daily. At this level, students engage in deliberate practice of component and composite skills, which are measured as count-per-minute increments and recorded on the Standard Celeration Chart. PT interventions are provided when a student repeatedly does not reach a daily frequency goal. For example, a student whose written single digit math computation rate does not increase as projected may temporarily switch from written practice to oral practice. Once adequate oral responding is achieved, the student would resume written practice.
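The Micro level trigger for a PT intervention can be sketched as a check on daily count-per-minute data. The text above says only that an intervention is provided when a student "repeatedly" misses a daily frequency goal, so the threshold of three consecutive misses, along with the function names, is an assumption chosen for illustration.

```python
def count_per_minute(count, seconds):
    """Convert a raw count from a timing of any length to a per-minute rate."""
    return count * 60.0 / seconds


def needs_intervention(daily_rates, daily_goals, misses_required=3):
    """Return True once the daily goal has been missed on
    `misses_required` consecutive days (threshold is an assumption)."""
    streak = 0
    for rate, goal in zip(daily_rates, daily_goals):
        streak = streak + 1 if rate < goal else 0
        if streak >= misses_required:
            return True
    return False


# Hypothetical week of written single-digit computation timings.
goals = [30, 33, 35, 37]
stalled = needs_intervention([30, 31, 31, 31], goals)
on_track = needs_intervention([30, 34, 33, 38], goals)
```

Under this sketch, a flagged student might temporarily switch from written to oral practice, as in the example given in the paragraph above.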
Instructional Group Placement and Movement Between RtI Tiers
Homogeneous placement within and across classrooms is the most efficient way to utilize the multi-level system of assessment as described in this paper. In many American schools, there may be multiple students across classrooms whose skills are significantly behind those of their same-age peers. If they are grouped according to their academic skill level, and not their age, then they can be monitored and moved through the curriculum accordingly. When school-wide homogeneous placement criteria are used, educators should consider the academic level of the student as well as the social appropriateness of the placement. In general, a good rule of thumb is to have no more than four years of age difference within the group. Therefore, even if an 11 year old is performing significantly below grade level, that student would not be placed with a 6 year old even if the two students were functioning at the same academic skill level. Students who are one to three grade levels behind according to their Macro scores are placed in Tier 2, which consists of differentiated instruction and supplemental frequency building practice. Students who are four or more grade levels behind are placed in Tier 3, which also consists of differentiated instruction and supplemental frequency building practice. However, at this level, students receive instruction in smaller groups or on an individual basis with paraeducator support. In the current study, supplemental practice provided for students in Tier 2 and Tier 3 occurred in the general education classroom during small group instruction with the classroom teacher or a paraeducator under the direction of the classroom teacher.
All students are monitored using Meta level assessment tools, regardless of the tier in which the student is placed. In contrast to most RtI models, all students in the current study were assessed with the same frequency within each tier. The difficulty level of the assessment used was matched to the grade level of the curriculum placement. For instance, a 6th grade student who requires 2nd grade level instruction according to Macro level assessment data was assessed using 2nd grade level CBM materials. These Meta level assessments are used until the student's performance indicates readiness to move up in grade level according to the exit criterion embedded in the CBM protocol. Once students in Tier 3 close the gap to three or fewer grade levels, they are placed in Tier 2. Likewise, once students in Tier 2 reach the grade level material matched to their age, they move to Tier 1. In this way, the RtI framework used in conjunction with the multi-level system of assessment assures that even those students who are initially significantly behind their same-age peers will be afforded the possibility of "catching up" to their peers.
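The placement and movement rules in this section reduce to two small checks, sketched below. The function names are hypothetical, and measuring the gap in whole grade levels is an assumption; the text does not say how fractional grade-equivalent scores are handled.

```python
def assign_tier(grade_levels_behind):
    """Map the gap between age-matched grade and curriculum placement to a tier.

    Four or more levels behind -> Tier 3; one to three -> Tier 2;
    working in age-matched grade level material -> Tier 1.
    """
    if grade_levels_behind >= 4:
        return 3
    if grade_levels_behind >= 1:
        return 2
    return 1


def socially_appropriate(ages, max_span_years=4):
    """Rule of thumb from the text: no more than four years of age
    difference within an instructional group."""
    return max(ages) - min(ages) <= max_span_years


# A 6th grade student placed in 2nd grade material is four levels behind.
sixth_grader_tier = assign_tier(6 - 2)
# After closing the gap to three levels, the same rule yields Tier 2.
after_progress_tier = assign_tier(6 - 3)
# An 11 year old and a 6 year old would not be grouped together.
mixed_group_ok = socially_appropriate([11, 6])
```

Because the same function is applied at entry and after each curriculum advance, movement between tiers follows automatically from the student's current placement.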
Application of the Multi-Level System of Assessment in Mathematics
Participants & Setting
To illustrate the multi-level system of assessment, the mathematics progress of two 4th grade students placed in Tier 2 intervention was examined in the present study. Both students attended a small private school in Seattle, WA, where students with and without disabilities are educated side-by-side in an inclusion model. The specialized focus of the school is to educate students on the autism spectrum alongside their typically developing peers.
At the time of her entrance into the program, Sarah was 11 years old and was falling behind her same-age peers in math. Prior to 4th grade, Sarah was evaluated by private professionals in order to determine if she had a diagnosable condition. By all accounts, her test results did not indicate any learning disabilities. However, according to the public school records from her previous elementary school, she was qualified for special education services and given an Individualized Education Plan (IEP) under the eligibility category of Specific Learning Disability, which allowed her to access extra support in math. At the time, her public school district was not using the RtI model to identify and serve struggling learners within the general education classroom. Consequently, the only option the teachers had to provide additional support for Sarah was through the IEP system. Sarah entered 4th grade as a shy student displaying very little confidence in her math skills; however, she excelled in every other academic subject, testing at or above grade level. She displayed extraordinary empathy and willingness to support the other students and, despite her initial shyness, she developed some very impressive leadership skills.
Mason joined the program as a sweet natured, shy 10 year old boy whose previous educational experience consisted of a combination of home schooling with his mother and another private school placement for three days per week. Mason's diagnosis placed him on the higher functioning end of the autism spectrum. He was compliant, displayed no behavioral difficulties, and was eager to please adults. He had marked deficits in higher level language processing and reasoning skills which manifested themselves in difficulty with reading comprehension, oral language, and social interaction with peers. Upon entrance into the program, his math calculation and concrete thinking skills were in the average range overall for his grade level. However, his math fluency score was at least one grade level behind his peers. Moreover, his higher level thinking and language processing deficits were already impacting his ability to successfully complete math reasoning and word problem examples and he was becoming increasingly anxious when completing these tasks during math class.
Due to space limitations, only Sarah's educational program will be detailed. However, it should be noted that the instructional methods and decision-making processes were identical for both Sarah and Mason. The purpose of including the results of both students is to show how the same multi-level system of assessment integrated within an RtI framework can be applied to improve the academic outcomes of diverse students.
Several Macro, Meta, and Micro assessments were employed in the current investigation. The assessments used within each level are diagrammed in Figure 4. Note that the model for affecting Math Fluency subtest scores is distinguished from the model for impacting Math Applied Problems, Concepts, and Calculation subtest scores.
[FIGURE 4 OMITTED]
At the Macro level, four subtests of the Woodcock-Johnson Tests of Achievement-III (WJ-III; Woodcock & Johnson, 1989) were administered at the beginning of the school year to facilitate placement in the math curriculum. The Calculation subtest measures the ability to perform mathematical computations in addition, subtraction, multiplication, division, geometry, and trigonometry, as well as logarithmic and calculus operations. The Math Fluency subtest assesses the ability to solve simple addition, subtraction, and multiplication facts correctly within three minutes. The Applied Problems subtest requires the subject to analyze and solve math problems across a variety of genres, such as time, money, word problems, fractions, decimals, and percentages. Finally, the Quantitative Concepts assessment measures knowledge of mathematical concepts, symbols, and vocabulary. Progress at this level was measured at the beginning and end of the school year.
At the Meta level, the Computation CBM test and the Concepts and Applications CBM test from the Monitoring Basic Skills Progress (MBSP) assessment were used (Fuchs, Hamlett & Fuchs, 1998). The Computation test measures student ability to complete grade level whole number computation problems in a predetermined amount of time; the time limit varies with the level of CBM administered. The Concepts and Applications test measures student ability to complete grade specific mathematics concepts in addition to whole number computation, such as word problems, measurement, fractions, decimals, money, and other concepts depending upon grade level. This assessment is also timed according to the CBM grade level administered. The student's score is calculated by counting the number of responses written by the end of the prescribed time period. The score is then compared to a norm-referenced table indicating student rate of performance compared to the distribution of students in the normative sample. Accordingly, decision rules as utilized within the multi-level system of assessment suggest that students who are achieving scores at or above the 75th percentile are moved up to the next grade level of both Saxon math instruction and the accompanying grade equivalent CBM measures. Teachers typically share the value of this analysis with the student. Students like Sarah, whose skills are well below same-age peers, are afforded the possibility of catching up to peers because they are not required to stay in one level of curriculum for the entire academic year. Progress at this level was measured weekly using the Standard Celeration Chart.
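The Meta level decision rule can be illustrated with a minimal Python sketch. It assumes the 75th percentile rule is read as "at or above the 75th percentile" of the normative sample. The function names and the sample of norm scores are hypothetical and were invented for illustration; they are not the MBSP norm tables.

```python
from bisect import bisect_right


def percentile_rank(score, norm_scores):
    """Percent of the normative sample scoring at or below `score`."""
    ordered = sorted(norm_scores)
    return 100.0 * bisect_right(ordered, score) / len(ordered)


def should_advance(score, norm_scores, cutoff=75.0):
    """Decision rule (assumed reading): advance to the next grade level
    of curriculum and CBM probe when the weekly score reaches the cutoff
    percentile of the normative sample."""
    return percentile_rank(score, norm_scores) >= cutoff


# Invented placeholder sample, NOT actual MBSP norms.
placeholder_norms = [10, 12, 15, 18, 20, 22, 25, 28, 30, 35]

for weekly_score in (20, 28):
    rank = percentile_rank(weekly_score, placeholder_norms)
    print(weekly_score, rank, should_advance(weekly_score, placeholder_norms))
```

In practice the percentile would be read directly from the published norm table for the CBM level administered; the sketch only makes the comparison step explicit.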
At the Micro level, students built frequency on digit writing, basic math fact computation, and concepts and application in third grade Saxon math. Typical practice sessions require approximately 10 minutes of frequency building practice per Micro level activity. Progress at this level was monitored daily using the Standard Celeration Chart.
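Daily charting of this kind rests on two quantities that Precision Teaching practitioners conventionally read from a Standard Celeration Chart: frequency (count per minute) and celeration (the multiplicative change in frequency per week). The Python sketch below shows one way these might be computed. The least-squares fit on log frequency is an assumption, not the procedure reported in this investigation, and the function names and sample timings are hypothetical values invented for illustration rather than data from either student.

```python
import math


def frequency_per_minute(count, seconds):
    """Convert a timed count into responses per minute."""
    return count * 60.0 / seconds


def weekly_celeration(days, frequencies):
    """Estimate celeration as a times-per-week multiplier.

    Fits a least-squares line to log10(frequency) against calendar day
    and converts the daily slope into a weekly multiplier.
    Frequencies must be greater than zero.
    """
    logs = [math.log10(f) for f in frequencies]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    slope_per_day = sxy / sxx
    return 10 ** (slope_per_day * 7)


# Invented example: frequency doubles each week, so celeration is about x2.
example_days = [0, 7, 14]
example_freqs = [10.0, 20.0, 40.0]
print(frequency_per_minute(30, 60))
print(weekly_celeration(example_days, example_freqs))
```

A multiplier above 1 indicates growing frequency and a multiplier below 1 indicates declining frequency.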
Placement and Intervention
The pretest Macro level assessment results are provided in Table 1. The data show that Sarah's pretest grade equivalent scores ranged from approximately half a year behind in Applied Problems to approximately 2.5 grade levels behind in Quantitative Concepts.
Since Sarah scored one to three grade levels behind on the Macro level assessments, she received Tier 2 intervention. Her teachers chose the curriculum grade level that aligned most closely to her performance level on the WJ-III assessment: Saxon 3 math (Larson et al, 1994). This was an approved public school curriculum in the state of Washington where she resided. In addition, the teachers established her baseline performance on the 3rd grade level weekly curriculum based measurement probes for Computation and Concepts and Applications using the Monitoring Basic Skills Progress (MBSP) assessment (Fuchs, Hamlett & Fuchs, 1998). As per the protocol in the administration manual, Sarah was given 3 minutes to complete the 3rd grade Computation assessment and six minutes to complete the 3rd grade Concepts and Applications assessment.
As the school year progressed, Sarah and her teachers analyzed her CBM data as well as her progress within the Saxon curriculum. If the data indicated that Sarah needed additional practice in concepts due to error patterns in her lessons or in the weekly CBM probe, then Sarah resumed daily practice on those skills at the Micro level of assessment utilizing a Precision Teaching approach. The pinpoints for the daily practice targeted either component or composite skills. In Sarah's case, she needed additional practice in both types of skills. For the component skills, Sarah engaged in the daily frequency building practice of writing her numbers fluently so that she could write more quickly and thus answer more questions on the weekly computation probe. The teacher noted that she was solving computation problems accurately, but she hesitated on simple facts, which resulted in her answering fewer questions during the time allotted in the weekly CBM probe. Therefore, she began to practice building her frequency of responding on single digit math facts for all operations (addition, subtraction, multiplication, division). For the composite skills, Sarah was generally accurate in solving most concepts taught directly from the Saxon curriculum, but was very slow and frequently got "lost" in processes. This meant that during her daily math class, Sarah would build frequency in her responses on whole number computation problems which aligned to the type and difficulty of problems taught in the Saxon lessons on that day. In addition, Sarah was given what the teacher affectionately titled a "Saxon Dump Chart." The daily practice was analyzed on a Standard Celeration Chart. The instruction shifted between any and all concepts taught in the curriculum and assessed on the weekly CBM probe for those areas in which Sarah needed more frequency building practice.
The Micro and Meta level assessments showed she was not able to perform these concepts correctly either in the daily Saxon lesson (Micro) or in her weekly (Meta) probe. For example, during one CBM probe on Concepts and Applications, Sarah either skipped or made errors in all items dealing with reducing fractions. This resulted in a change of instruction in math class for the next week whereby the teacher provided deliberate timed practice examples on reducing fractions, which aligned with the concept as it was taught in Sarah's daily Saxon lessons. The data from these timed practices were graphed on her "Saxon Dump Chart." As soon as the next CBM measure yielded scores that showed Sarah correctly answered all reducing fraction problems, the teacher then moved onto another concept in which Sarah needed more help. The subsequent concept was targeted next on the "Saxon Dump Chart" in daily practice. This process of monitoring performance on the Standard Celeration Chart and examining CBM data continued throughout the year.
The pretest and posttest Macro level test scores for Sarah and Mason are provided in Table 2. The outcomes illustrate the effectiveness of integrating frequency-based instruction within a multi-level RtI system.
Both students made significant improvement over the course of the academic year. Specifically, Sarah gained more than two grade levels in one academic year on the Math Fluency and Applied Problems subtests, and on the Quantitative Concepts subtest she gained over four grade levels. At the end of 4th grade, Sarah had caught up to the average range of her peers in these skill areas. However, for the Calculation subtest, the gap between her skills and those of her peers had widened: she was 1.0 grade level behind at the beginning of the year and 1.6 grade levels behind by the end of the year. This downward trend was addressed intensely in her 5th grade year. Similarly, Mason's posttest scores showed impressive gains. Mason gained one grade level in two subjects (Calculation and Applied Problems). Meanwhile, he gained just over two grade levels on the Quantitative Concepts subtest while advancing approximately 6.8 grade levels on the Math Fluency subtest.
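The gains reported above follow directly from the grade equivalent (GE) scores in Table 2, as the short Python sketch below shows. The scores are taken from Table 2. The helper names are hypothetical, and the expected grade placements of 4.0 at the start and 5.0 at the end of 4th grade are assumptions inferred from the reported 1.0 and 1.6 grade level gaps in Calculation.

```python
# (pretest GE, posttest GE) pairs copied from Table 2.
scores = {
    "Sarah": {
        "Calculation": (3.0, 3.4),
        "Math Fluency": (2.2, 4.8),
        "Applied Problems": (3.5, 5.6),
        "Quantitative Concepts": (1.4, 5.5),
    },
    "Mason": {
        "Calculation": (4.4, 5.4),
        "Math Fluency": (3.0, 9.8),
        "Applied Problems": (4.2, 5.2),
        "Quantitative Concepts": (4.2, 6.3),
    },
}


def ge_gain_tenths(pre, post):
    """Gain in tenths of a grade level, rounded to avoid float error."""
    return round((post - pre) * 10)


def as_years_months(tenths):
    """Express a non-negative gain as (years, months), one tenth = one month."""
    return divmod(tenths, 10)


def gap_tenths(expected_grade, ge_score):
    """Tenths of a grade level behind (positive) the expected placement."""
    return round((expected_grade - ge_score) * 10)


for student, subtests in scores.items():
    for subtest, (pre, post) in subtests.items():
        years, months = as_years_months(ge_gain_tenths(pre, post))
        print(f"{student:6} {subtest:22} +{years} years, {months} months")

# Sarah's Calculation gap, assuming placements of 4.0 (start) and 5.0 (end).
print(gap_tenths(4.0, 3.0) / 10, gap_tenths(5.0, 3.4) / 10)
```

Working in whole tenths keeps the arithmetic exact, so the printed gains match the pre-post differences listed in Table 2.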
We described a model for integrating frequency-based instruction with a multi-level assessment system to enhance RtI frameworks in order to improve mathematics outcomes. The assessment system includes Macro, Meta, and Micro level measures. The Macro level measures were used to place students in the mathematics curriculum using the pretest scores and to evaluate the overall effectiveness of the instructional program based on posttest scores. The Meta level measures were used to monitor weekly learning progress and guide instruction at the Micro level. The Micro level frequency-based practice and assessment were, in turn, used to drive progress at the Meta and Macro levels.
The results revealed that both of the highlighted 4th grade students made significant mathematics progress over the course of the Tier 2 intervention. Of particular interest, despite being qualified for special education services under the eligibility category of Specific Learning Disability in the public school system, Sarah gained over four grade levels in quantitative reasoning skills and over two grade levels in applied reasoning and math fluency skills. These gains occurred in only ten months between her pre and post assessments and while she was placed in a small group general education class. She did not require the additional pull-out special education services assigned to her through her IEP. We believe that these outcomes challenge the legitimacy of using the dual discrepancy model to qualify students for special education services because, with high quality instruction and ongoing progress monitoring within her general education classroom, Sarah no longer needed an individualized education plan.
Additionally, the student diagnosed with Autism made dramatic progress in one academic year as well. Mason's scores on the Applied Problems and Calculation subtests maintained their normative status compared to his peers as evidenced by a one year gain in each skill area and his performance on the Quantitative Concepts test increased by more than two grade levels. However, Mason's gains on his Math Fluency scores dramatically highlight the power of the multi-level system of assessment. His performance yielded a score similar to a student almost 5 grade levels ahead of him with a gain of over 6 grade levels! This improvement is the direct result of daily frequency-based practice on basic math facts using PT instructional design and delivery methods which required approximately 10 minutes of his class time each day. It is especially noteworthy that this multi-level system compares Mason's scores to those of non-diagnosed students of the same grade level at both the Macro level (WJ-III) and the Meta level (MBSP). Here is a student with a diagnosis marked by developmental delays, monitored by general education standards within a general education class structure, and the results show that he caught up to his peers or surpassed them in certain areas. Policy makers, educators, and behavior analysts who work within and in conjunction with the field of autism intervention should take note of Mason's case. Further research into the efficacy of holding students with autism to the same academic standards as their neurotypical peers, which employs an RtI model using PT methods within a multi-leveled system of progress monitoring, is an exciting prospect.
In addition to the field of autism, more rigorous research is needed to determine the extent to which such outcomes can be consistently achieved from students labeled with mild to moderate learning problems given individualized frequency-based mathematics instruction using PT methods delivered within a multi-level assessment system. It is possible that if the same curriculum were used without the frequency building practice, the results would differ. Alternatively, it is possible that if the same curriculum and instructional methods were used, but assessment at the Meta level was less frequent, the results would not differ. Consequently, more research is needed to determine the essential instructional and assessment ingredients that rendered this particular recipe so successful.
The assumptions made about student success or failure are being challenged by a system that currently provides special education services to approximately 6.8 million children and youth served by law under IDEA (DOE, 2010). According to the United States Department of Labor Bureau of Labor Statistics, in order to serve these students, the U.S. employed 473,000 special education teachers (as of 2008) with an anticipated increase of 17 percent from 2008 to 2018, making special education teaching the most rapidly growing occupation. When federal funds are added to state and local funding, an estimate from nearly ten years ago stated "$35-$60 billion is spent annually on special education in this country, with possibly 40 percent of all new spending on K-12 education over the past 30 years spent on special education" (Finn, Rotherham, & Hokanson, 2001). Many school districts are adopting the RtI framework as a way to more easily satisfy the No Child Left Behind mandate to maintain current funding levels and to also be competitive for the new federally funded Race to the Top initiative (2009) with grants that range from $25-700 million. These facts and figures may be daunting, yet educators and psychologists who stand prepared in the classrooms with effective assessment tools and methodologies, and who have the commitment to serve all children, can view this challenge with optimism. One such proclamation is that RtI stands for "Really Terrific Instruction" (Tilly, 2008).
Given the growing acceptance and enthusiasm for RtI in the general education realm, and the fact that IDEA (2004) requires that schools must have procedures in place such "that special classes, separate schooling, or other removal of children with disabilities from the regular educational environment occurs only when the nature or severity of the disability is such that education in regular classes with the use of supplementary aids and services cannot be achieved satisfactorily," RtI provides yet another opportunity for nationwide educational improvements for both general and special education students. The least restrictive environment for a student previously identified as in need of special education services, may one day be our general education classrooms where we will find that effective instruction and regular formative assessment will be present for all learners. The goal of the system described in this paper is highly effective and inclusive education for all students in the same general education classroom.
More than ever before, the push for bringing data-based decision making into the hands of teachers directly impacting their students on a daily basis opens the door for a powerful partnership between the fields of education and Applied Behavior Analysis. Interestingly enough, behavior analysts might recognize how our contributions in education have been aligned with the goals of RtI for decades. Those who are familiar with formative evaluation will recognize its connection to the history of Responsiveness to Intervention and its use of formative assessment. In 1989, Susan Markle wrote an article entitled "The Ancient History of Formative Evaluation." Here she traces the work of behavior analysts from Skinner's work in the laboratory using shaping procedures with pigeons to the complex design of programmed instruction and reminds us, in words written over twenty years ago,
"The process now called "formative evaluation" is an old idea with many names. In those early days, we called it "developmental testing" or simply "tryout." Evaluation experts (Lumsdaine, 1965; Scriven, 1967) clarified our ideas by distinguishing between "formative" and "summative" evaluation" (p. 27).
In RtI protocols, it is the teacher who is now responsible for examining and shaping the responsiveness to instruction. What was once the responsibility assumed by the instructional designer of programmed instruction (at the Micro level) is now required of both the instructional material (evidence-based instruction) and the instructional leaders, including the principals and teachers. The curricula and their delivery system stand at the forefront of the system's accountability and greatly influence whether a student will be identified as eligible for special education services. In RtI, we might consider all the variables that contribute to efficient and effective learning, and refer to them as "the program." According to Markle (1969), when progress does not occur, "if the student errs, the programmer flunks" (p. 16).
Another article, "Dimensions of the Need," written by a behavioral scientist nearly 50 years ago, describes the very same variables that can be teased out in an RtI model. In his discussion of programmed instruction, Padwa says "the behavioral development and testing of instructional programs assure an unprecedented appropriateness of teaching methods to the backgrounds and actual preparation levels of the students taught, as well as a precise evaluation of its own effectiveness." Padwa's success with programmed instruction led him to affirm that programmed instruction had the "ability to guarantee high achievement" (Padwa, 1962, emphasis in original). These same principles of sound instructional design can once again be studied and brought to the classroom by behavior analysts and teachers for their students.
The power of using a multi-level system of assessment as a formative decision making tool for students and teachers is the alignment of curriculum to assessment and on-going progress monitoring in a general education classroom for all students, regardless of diagnoses or ability levels. Although our investigation briefly describes a system with case studies in mathematics, the procedures can be extended to the typical public school classroom when all resources are optimally employed. This assessment system can be utilized for any academic subject area where there exist normative standards for mastery. Creating a powerful alliance of the expertise of those who know how to teach (teachers) with those who know how to analyze (behavior analysts) is an exciting prospect which will likely result in substantially positive impact for future generations of students with and without disabilities.
Ardoin, S. P., Witt, J. C., Connel, J. E., & Koenig, J. (2005). Application of a three-tiered response to intervention model for instructional planning, decision making, and the identification of children in need of services. Journal of Psychoeducational Assessment, 23, 362-380.
Beck, R., & Clement, R. (1991). The Great Falls Precision Teaching project: A historical examination. Journal of Precision Teaching, 8, 8-12.
Binder, C. (1988). Precision Teaching: Measuring and attaining exemplary academic achievement. Youth Policy Journal, 10(7), 12-15. Available at http://www.binder-riha.com/publications.htm.
Binder, C. (1990). Doesn't everybody need fluency? Performance Improvement, 42(3), 14-20.
Binder, C., & Watkins, C. L. (1990). Precision teaching and direct instruction: Measurably superior instructional technology in schools. Performance Improvement Quarterly, 3(4), 74-96.
Bryant, D. P., Bryant, B. R., & Hammill, D. D. (2000). Characteristic behaviors of students with LD who have teacher-identified math weaknesses. Journal of Learning Disabilities, 33(2), 168-177.
Bryant, D. P., Bryant, B. R., Gersten, R. M., Scammacca, N. N., Funk, C., Winter, A., Shih, M., & Pool, C. (2008). The effects of tier 2 intervention on the mathematics performance of first-grade students who are at risk for mathematics difficulties. Learning Disability Quarterly, 31, 47-63.
Canter, A. (2006). Problem solving and RTI: New roles for school psychologists. NASP Communique, 34(5). Retrieved from http://www.nasponline.org/publications/cq/cq345RtI.aspx
Cummins, J. (2010). Putting language proficiency in its place: Responding to critiques of the conversational/academic language distinction. Retrieved from http://www.iteachilearn.com/cummins/converacademlangdisti.html
Deno, S. L. (1985). Curriculum-based measurement: The emerging alternative. Exceptional Children, 52, 219-232.
Deno, S. L. (2003). Developments in curriculum-based measurement. The Journal of Special Education, 37(2), 184-192.
Deno, S. L., & Mirkin, P. K. (1977). Data-based program modification: A manual. Reston, VA: Council for Exceptional Children.
Finn, C. E., Jr., Rotherham, A. J., & Hokanson, C. R., Jr. (2001). Rethinking special education for a new century. Washington, DC: Fordham Foundation.
Fleischner, J.E., Garnett, K., & Shepherd, M.J. (1982). Proficiency in basic fact computation of learning disabled and nondisabled children. Focus on Learning Problems in Mathematics, 4, 47-55.
Fletcher, J. M. (2006). The need for response to instruction models of learning disabilities. Perspectives, 32(1), 12-15.
Fletcher, J. M., Coulter, W. A., Reschly, D. J., & Vaughn, S. (2004). Alternative approaches to the definition and identification of learning disabilities: Some questions and answers. Annals of Dyslexia, 54, 304-331.
Francis, D. J., Fletcher, J. M., Stuebing, K. K., Lyon, G. R., Shaywitz, B. A., & Shaywitz, S. E. (2005). Psychometric approaches to the identification of LD: IQ and achievement scores are not sufficient. Journal of Learning Disabilities, 38(2), 98-108.
Fuchs, L. S., & Fuchs, D. (2006). Identifying learning disabilities with RTI. Perspectives, 32(1), 39-43.
Fuchs, L. S., Hamlett, C. L., & Fuchs, D. (1998). Monitoring basic skills progress manual--Second Edition. Austin, TX: PRO-ED.
Fuchs, L. S., Fuchs, D., Prentice, K., Hamlett, C. L., Finelli, R., & Courey, S. J. (2004). Enhancing mathematical problem solving among third-grade students with schema-based instruction. Journal of Educational Psychology, 96, 635-647.
Geary, D. (2003). Learning disabilities in arithmetic: Problem-solving differences and cognitive deficits. In H. L. Swanson, K. R. Harris, & S. Graham (Eds.), Handbook of Learning Disabilities (pp. 199-212). New York: Guilford.
Gersten, R., Jordan, N. C., & Flojo, J. R. (2005). Early identification and interventions for students with mathematics difficulties. Journal of Learning Disabilities, 38, 293-304.
Gross-Tsur, V., Manor, O., & Shalev, R. S. (1996). Developmental dyscalculia: Prevalence and demographic features. Developmental Medicine and Child Neurology, 38, 25-33.
Haughton, E. C. (1972). Aims: Growing and sharing. In J. B. Jordan & L. S. Robbins (Eds.), Let's try doing something else kind of thing (pp. 20-39). Arlington, VA: Council for Exceptional Children.
Haughton, E. C. (1980). Practicing practices: Learning by activity. Journal of Precision Teaching, 1, 3-20.
The Individuals with Disabilities Education Improvement Act of 2004, Pub. L. No. 108-446, § 632, 118 Stat. 2744 (2004).
Jenkins, J. R., Hudson, R. F., & Johnson, E. S. (2007). Screening for at-risk readers in a response to intervention framework. School Psychology Review, 36, 582-600.
Johnson, K. R., & Layng, T. V. J. (1992). Breaking the structuralist barrier: Literacy and numeracy with fluency. American Psychologist, 47, 1475-1490.
Johnson, K. R., & Layng, T. V. J. (1994). The Morningside model of generative instruction. In R. Gardner III, et al. (Eds.), Behavior analysis in education: Focus on measurably superior instruction (pp. 173-197). Pacific Grove, CA: Brooks/Cole.
Johnson, K. R., & Layng, T. V. J. (1996). On terms and procedures: Fluency. The Behavior Analyst, 19, 281-288.
Johnson, K. R., & Street, E. M. (2004). The Morningside Model of Generative Instruction: What it means to leave no child behind. Concord, MA: Cambridge Center for Behavioral Studies.
Kame'enui, E. (2007). Responsiveness to intervention. Teaching Exceptional Children, 39(5), 6-7.
Larson, N., et al. (1994). Math 3: An incremental development. Saxon Publishers, Inc.
Lindsley, O. R. (1972). From Skinner to Precision Teaching: The child knows best. In J. B. Jordan & L.S. Robbins (Eds.). Let's try doing something else kind of thing (pp. 1-11). Arlington, VA: Council for Exceptional Children.
Lindsley, O. R. (1991). Precision Teaching's Unique Legacy from B. F. Skinner. Journal of Behavioral Education, 1(2), 253-266.
Malmquist, S. (2004). Using a multi-level system of assessment to form instructional decisions and determine program effectiveness. In K. Johnson & L. Street (Eds.), The Morningside Model of Generative Instruction (pp. 52-93). Concord, MA: Cambridge Center for Behavioral Studies.
Markle, S. M. (1969). Good frames and bad: A grammar of frame writing (2nd ed.). New York: John Wiley & Sons, Inc.
Noell, G. H., Freeland, J. T., Witt, J. C., & Gansle, K. A. (2001). Using brief assessments to identify effective interventions for individual students. Journal of School Psychology, 39, 335-355.
Padwa, D. J. (1962). Dimensions of the need. Reprinted from The American Behavioral Scientist, 6(3), pp.40-42.
Pascopella, A. (2010). RTI goes mainstream. District Administration. Retrieved from http://www.districtadministration.com/viewarticle.aspx?articleid=2383
Pennypacker, H. S., Koenig, C. H., & Lindsley, O. R. (1972). The handbook of the standard behavior chart. Kansas City, KS: Behavior Research Co.
Pisecco, S., Wristers, K., Swank, P., Silva, P., & Baker, D. (2003). The effect of academic self-concept on ADHD and antisocial behaviors in early adolescence. Journal of Learning Disabilities, 34, 450-461.
Shapiro, E. S. (1996). Academic skills problems: Direct assessment and interventions. (2nd Ed.) New York: Guilford.
Shinn, M. R. (1989). Curriculum-based measurement: Assessing special children. New York: Guilford Press.
Shores, C., & Chester, K. (2009). Using RTI for school improvement: Raising every student's achievement scores. Thousand Oaks, CA: Corwin Press and Council for Exceptional Children.
Skinner, B. F. (1953). Science and human behavior. New York: Macmillan.
Stecker, P. (2007). Using progress monitoring with intensive services. Teaching Exceptional Children, 39(5), 50-57.
Tilly, D. W. (2006). Diagnosing the enabled learner: The promise of response to intervention. Perspectives, The International Dyslexia Association. 20-24.
Tilly, D. W. (2008). RtI stands for really terrific instruction. Presented at the North Carolina PBIS/RtI Conference, Greensboro, NC. Retrieved from http://www.ncpublicschools.org/docs/positivebehavior/resources/cometogether/tillyhandout.pdf
Vaughn, S., & Fuchs, L. S. (2003). Redefining LD as inadequate response to instruction: The promise and potential problems [Special issue]. Learning Disabilities Research & Practice, 18(3), 137-146.
Vaughn, S., & Linan-Thompson, S. (2003). What is special about special education for students with learning disabilities? Journal of Special Education, 37, 140-147.
U.S. Department of Education (2006). Assistance to states for the education of children with disabilities and preschool grants for children with disabilities; Final rule. Federal Register, 71 (156) 46786-46787.
U.S. Department of Education Washington (2009). Race to the Top Program Executive Summary. Retrieved from http://www2.ed.gov/programs/racetothetop/executive-summary.pdf
Woodcock, R. W., & Johnson, M. B. (1989). Woodcock-Johnson Psycho-Educational Battery-Revised. Allen, TX: DLM Teaching Resources.
Author Contact Information
Alison Moors, MA, BCBA
5977B Rainier Avenue South
Seattle, WA 98118
Amy Weisenburgh-Snyder, MA, ABD
3705 Scotts Lane Box #D-11
Philadelphia, PA 19129
Joanne K. Robbins, Ph.D.
4705 S. Dakota St.
Seattle, WA 98118
(1) As noted by Lindsley (1991), "Rate was the measure of operant behavior used in the animal laboratories (Ferster & Skinner, 1957; Keller & Schoenfeld, 1950; Skinner, 1938). In Precision Teaching the term frequency is used instead of rate because it is more readily understood by non-psychologists. Furthermore in one of his more general books, Skinner (1953, p. 62) himself used the term frequency when describing behavior" (p. 254).
Table 1. Macro Level Pretest Scores Used for Placement

Student   Grade   Eligibility Category
Sarah     4       Learning Disability
Mason     4       Autism

Student   WJ-III Subtest Title      Pre Test GE score
Sarah     Calculation               3.0
          Math Fluency              2.2
          Applied Problems          3.5
          Quantitative Concepts     1.4
Mason     Calculation               4.4
          Math Fluency              3.0
          Applied Problems          4.2
          Quantitative Concepts     4.2

Note. WJ-III = Woodcock-Johnson III Achievement Test; GE = Grade Equivalent.

Table 2. Macro Level Pretest and Posttest Scores Used for Program Evaluation

Student   Grade   Eligibility Category
Sarah     4       Learning Disability
Mason     4       Autism

Student   WJ-III Subtest Title      Pre Test GE score   Post Test GE score   Pre-Post Test (in 10 months)
Sarah     Calculation               3.0                 3.4                  +4 months
          Math Fluency              2.2                 4.8                  +2 years, 6 months
          Applied Problems          3.5                 5.6                  +2 years, 1 month
          Quantitative Concepts     1.4                 5.5                  +4 years, 1 month
Mason     Calculation               4.4                 5.4                  +1 year
          Math Fluency              3.0                 9.8                  +6 years, 8 months
          Applied Problems          4.2                 5.2                  +1 year
          Quantitative Concepts     4.2                 6.3                  +2 years, 1 month

Note. WJ-III = Woodcock-Johnson III Achievement Test; GE = Grade Equivalent.