
Too Much, Too Soon? Unanswered Questions From National Response to Intervention Evaluation


The report of the national response to intervention (RTI) evaluation study, conducted during 2011-2012, was released in November 2015. Anyone who has read the lengthy report can attest to the complexity of both the report and the design used in the study. Both these factors can influence the interpretation of the results from this evaluation. In this commentary, we (a) explain what the national RTI evaluation examined and highlight the strengths and weaknesses of the design, (b) clarify the results of the evaluation and highlight some key implementation issues, (c) describe how rigorous efficacy trials on reading interventions can help address several questions left unanswered by the national evaluation, and (d) discuss implications for future research and practice based on the findings of the national evaluation and reading intervention research.

Every decade or so, the field of special education seems to lurch abruptly in a new direction by adopting new policies and practices, often with a good deal of enthusiasm. These changes are typically based on interesting theory and occasionally combined with evidence from small-scale studies or results of pilot projects. Examples include the resource room in the early 1970s (Hammill & Wiederholt, 1972), the use of special educators as consultants to classroom teachers (Knight, 1981) and to co-teach classes with classroom teachers in the 1980s (Friend, Reising, & Cook, 1993), and inclusion in the 1990s (Wang & Reynolds, 1996). All these initiatives aspired to help students with learning disabilities and other serious learning problems succeed in the least restrictive environment, all attempted to address problems with then-current practice, and all are still used today.

In the view of many, one of the more promising initiatives has been to intervene early when students show signs of reading difficulties, with a tiered approach known as response to intervention (RTI) or multitier system of support (e.g., Gersten & Dimino, 2006; Vaughn & Fuchs, 2003). With RTI, schools utilize efficient screening measures to identify students who are most likely to experience difficulty learning to read with typical class-wide reading instruction. In addition to evidence-based core instruction, those students who are identified as having difficulty are provided early evidence-based reading intervention to prevent reading failure (typically referred to as Tier 2, which usually entails 20-40 min of reading intervention several days per week). Students who do not respond to this initial intervention then receive even more intensive instruction (typically referred to as Tier 3).

RTI was conceptualized as a means of helping address a major service delivery problem in the field of education. At that time, students with reading difficulties were not permitted to receive services for reading problems until the end of second or even third grade, when they were identified as having a learning disability. RTI essentially paved the way for early evidence-based reading intervention, the goal of which was to help students improve their reading well before they fell too far behind and were labeled learning disabled.

Although many states--such as Ohio, Oregon, Texas, and California--began experimenting with RTI and data-based decision making in beginning reading during the latter half of the 1990s (Fuchs, Mock, Morgan, & Young, 2003), widespread implementation of RTI for reading began in the early 21st century when Congress passed the No Child Left Behind Act (2006), which dedicated a good deal of funding to Reading First, an effort to enhance the quality of beginning reading instruction for all students, but especially for those in the at-risk category. Implementation of RTI was further bolstered with the passage of the reauthorization of the Individuals With Disabilities Education Act (2006), which allowed states to use RTI as a method to identify students with learning disabilities. Since the passage of this act, RTI has been strongly supported at the federal level, and virtually every state actively encourages schools to use a preventative/RTI approach, particularly for beginning reading in the primary grades.

With widespread adoption of RTI, a national evaluation seemed in order. Results of the national RTI evaluation (conducted by MDRC, SRI International, and Instructional Research Group) were released by the U.S. Department of Education late in 2015. See Balu et al. (2015) for the full report of this national evaluation. When the Education Week article "RTI Practice Falls Short of Promise" (Sparks, 2015) was published a little over a year ago, many felt that the results of the national evaluation indicated that RTI was not effective for all students receiving early intervention in reading.

In fact, though unsettling, the findings provide no sweeping judgment on the effectiveness of RTI as a system for preventing future reading failure. Thus, the purpose of this commentary is as follows:

* Explain what the national RTI evaluation (Balu et al., 2015) examined and highlight the strengths and weaknesses of the design

* Clarify the results of the evaluation and identify some key implementation issues

* Describe how rigorous efficacy trials on reading interventions can help address questions left unanswered by the national evaluation

* Discuss implications for future research and practice based on the findings of the national evaluation and reading intervention research

The National RTI Evaluation Study: How Was It Done, and What Does It Tell Us?

The national RTI evaluation study (Balu et al., 2015), conducted during 2011-2012, included an impact analysis and a descriptive analysis based on survey data. The descriptive component of the national study involved 1,300 schools randomly selected from 13 states. Note that this is a different sample from that of the veteran RTI implementers who served in the school impact evaluation. The descriptive study aimed to describe current RTI practice by comparing RTI implementation of veteran RTI implementers who served in the impact study with that of 1,300 randomly selected schools.

This article focuses on the sample used in the impact analysis. We use the findings from the descriptive component of the study only to provide context for the impact analysis. The impact study involved 146 elementary schools in 13 states that had been implementing RTI in reading for at least 3 years before the 2011-2012 school year. Over 20,000 students participated in the study. The 146 elementary schools that formed the impact sample were nominated by an expert panel familiar with RTI implementation. Evaluators used a 2-year multistep process to choose schools that had (a) implemented key RTI practices recommended in the Institute of Education Sciences' practice guide on RTI for elementary reading (Gersten et al., 2009) for at least 3 years, (b) used a quantitative data-based system to identify students in need of more intense reading assistance, (c) met thresholds for the number of students in each grade and in each tier of instruction, and (d) agreed to meet data collection requirements for the evaluation. The study team used site visits to verify that schools met these selection criteria.

Key Aspects of the Impact Evaluation

The impact evaluation used state-of-the-art regression discontinuity methods (Balu et al., 2015; Imbens & Lemieux, 2008; Nomi & Raudenbush, 2012; Reardon & Robinson, 2012) to evaluate the effectiveness of reading intervention for students at or slightly below the cut point on their schools' reading screening battery for Grades 1 through 3. Although the national RTI evaluation research team--and its technical advisory board--would have preferred to implement a gold standard randomized controlled trial that could have answered the question of whether RTI implementation is effective, this option was never a possibility. Once the federal government and legislation in all states actively encourage a policy such as RTI--and, in some cases, mandate that policy--random assignment of schools to either RTI or non-RTI conditions is not possible. To do so, the evaluation team would have had to ask personnel in control schools to knowingly violate state education regulations as well as, in many cases, discontinue newly implemented practices. An important concern of the research team remained understanding the impact of RTI as currently implemented in schools. This is often called an evaluation study as opposed to the efficacy studies discussed in the latter part of this article.

An interrupted time-series design with a control sample from each state was considered; however, this quasi-experimental design would require 6 years of longitudinal research. Conducting a study over such a long period is problematic for at least two reasons. States shift their assessments over time, thereby making comparisons of groups difficult. Also, privileged information would need to be obtained on specific special education referral and identification data for a large-scale study of this scope.

Thus, the only feasible option was the far more limited evaluation design of regression discontinuity. This design is considered to be the strongest quasi-experimental design, although it answers a different question than whether RTI is effective for targeted students. In this context, a regression discontinuity design assesses the issue of whether the combination of the cut score and interventions used in the 146 schools in the evaluation made a difference in students' reading performance. Essentially, the regression discontinuity design allows for a comparison of those students who are just below the cut point and receiving reading intervention with those students who are just above the cut point and not receiving reading intervention; that is, it examines how those below the cut point perform with the intervention in relation to how they would have been predicted to perform based on the performance of those not receiving intervention (see Imbens & Kalyanaraman, 2012). This distinction is subtle and not well understood. Even experts (e.g., Tang, Cook, Kisbu-Sakarya, Hock, & Chiang, 2016) are still discussing precisely what a significant impact with regression discontinuity means--for example, how generalizable results are for students, say, scoring 5 points below the cut point on the screening measure (typically AIMSweb or DIBELS in this study).
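The logic of the comparison described above can be illustrated with a minimal sketch. It is not the estimator used by Balu et al. (2015): it assumes a sharp design (every student below the cut point is treated and no student above it is), a fixed bandwidth chosen by hand rather than an optimal one, and simulated data. The function names, the simulated effect of -0.17, the slope, and the noise level are all hypothetical choices made for illustration only.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def rd_estimate(scores, outcomes, cut, bandwidth):
    """Sharp regression discontinuity estimate at the cut point.

    Fits one line to students just below the cut (treated) and another to
    students just above it (untreated), then compares the two fitted
    values at the cut itself.
    """
    below = [(s - cut, y) for s, y in zip(scores, outcomes)
             if cut - bandwidth <= s < cut]
    above = [(s - cut, y) for s, y in zip(scores, outcomes)
             if cut <= s <= cut + bandwidth]
    intercept_below, _ = fit_line(*zip(*below))
    intercept_above, _ = fit_line(*zip(*above))
    return intercept_below - intercept_above


def simulate(n=6000, cut=0.0, true_effect=-0.17, seed=0):
    """Simulated standardized screening scores and posttest outcomes.

    All numbers here are hypothetical; nothing is taken from the study data.
    """
    rng = random.Random(seed)
    scores, outcomes = [], []
    for _ in range(n):
        s = rng.uniform(-2.0, 2.0)
        treated = s < cut  # assumption: intervention only below the cut
        y = 0.5 * s + (true_effect if treated else 0.0) + rng.gauss(0.0, 0.2)
        scores.append(s)
        outcomes.append(y)
    return scores, outcomes


scores, outcomes = simulate()
estimate = rd_estimate(scores, outcomes, cut=0.0, bandwidth=1.0)
print(f"estimated effect at the cut point: {estimate:.2f}")
```

Note that the estimate describes only the gap between the two fitted lines at the cut point. Students far below the cut contribute nothing once they fall outside the bandwidth, which is why a design of this kind speaks to students "on the cusp" rather than to all students who receive intervention.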

All that we can conclude from this regression discontinuity study is whether the current combination of cut score and intervention programs used was helpful to the relatively small proportion of students slightly below the cut points used by the 146 schools in the evaluation sample. Note that these cut points varied from school to school, although focus on students "on the cusp" was constant in the analysis. For example, in one school, only students scoring below the 25th percentile (on a nationally normed screener) might receive intervention; in another school, students scoring at or below the 45th percentile would receive intervention. The national RTI evaluation did not answer the question of whether these Tier 2 reading interventions work for all students served in the participating schools who receive them (Balu et al., 2015; Shinn & Brown, 2016). However, it did document on average the impact on students at or near the various cut points used in the 146 schools, all of which were experienced RTI schools.

Sample. The impact sample was selected to include schools implementing all practices important for RTI no later than 2009-2010. Those practices included (a) use of three or more tiers of increasing instructional intensity to deliver reading services to students, (b) universal screening in reading at least twice a year, (c) use of progress monitoring (beyond universal screening) for students reading below grade level to determine whether intervention is working for students placed in Tier 2 or Tier 3, and (d) use of data for placing students in Tier 2 or Tier 3 and moving students to an appropriate tier.

Initially schools were recommended by experts in the field. Next the research team interviewed the schools. If they seemed to meet the criteria, site visits were conducted to verify presence of these elements. Schools in the impact sample provided information about the score on a screening test that they used to determine a student's placement in Tier 2 or Tier 3. These were later verified empirically and corrected to reflect the decision rule in actual practice.

To give a sense of the scope of the study, for first grade, data were analyzed for 8,342 students, and data from 6,049 students were used in the regression discontinuity analysis (because only about three quarters of the students fit into the optimal bandwidth). Specifically, the analysis excluded most students at the very high and very low ends of the distribution.

Measures. Schools used an array of screening batteries at the beginning of year, though by far the most commonly used measures were DIBELS and AIMSweb. These scores were standardized for each site. Student posttest data included performance on the Early Childhood Longitudinal Study--Kindergarten Class of 1998-1999 reading battery for Grade 1 (ECLS-K; Tourangeau, Nord, Le, Sorongon, & Najarian, 2009), Test of Word Reading Efficiency--Second Edition (TOWRE-2; a timed test of word list reading; Torgesen, Wagner, & Rashotte, 2012) for Grades 1 and 2, and state assessment scores for Grade 3.

Reading Intervention. Although a variety of reading interventions were used across the schools, most schools appeared to implement a "standard protocol" approach for intervention; that is, one intervention program was typically used for all students receiving Tier 2. Students received approximately 40 min of Tier 2 reading intervention 4 or 5 days a week in groups averaging 5.3 students in Grade 1, 5.9 in Grade 2, and 6.4 in Grade 3. Reading interventions were provided by teachers and paraprofessionals. More students in Grade 1 (than in Grades 2 and 3) were taught by paraprofessionals (see Balu et al., 2015, for details).

Key Findings From the National RTI Evaluation

For Grade 1 students, the intervention had a significant negative effect on their ECLS-K reading assessment (effect size = -0.17, p < .05) and no significant effect on TOWRE-2 (effect size = -0.11, ns; see Figure 1). For those in Grades 2 and 3, there was no statistically significant evidence of effectiveness but no negative effect either. In fact, for Grade 2, the effect is positive, albeit small.

However startling and disturbing the findings may be for the upper end of the "at risk" category of students (i.e., 35th-40th percentile), we suggest some caution in interpreting these findings. Because of the nature of the design used in the evaluation, the findings do not indicate that all students receiving reading intervention (i.e., those in the 35th-40th percentile as well as those below the 35th percentile) fared poorly as a result of receiving Tier 2 reading interventions. The findings should also be interpreted within the context of the implementation information obtained from the schools as part of the descriptive analyses. We highlight two issues that temper these findings.

The descriptive component of the study revealed that 60% of the classrooms reported that students receiving intervention missed some of their reading and language arts instruction, although this practice varied widely. The reading intervention, rather than being totally supplemental, instead supplanted some of the core classroom reading instruction (Shinn & Brown, 2016) for slightly over half of the students. It is unclear how much of core reading instruction and activities they missed (e.g., was it only independent seat-work? time devoted to writing?).

To facilitate a clean comparison of those just below the cut point with those just above the cut point, it is important that students above the cut point not receive any reading intervention and that students below the cut point receive reading intervention as a supplement to their core instruction. However, neither of these requirements was fully met by all students in the study, thereby raising questions regarding the validity of the comparison.

From the Balu et al. (2015) analysis, we believe that far too many Grade 1 students received reading intervention. In first grade, 41% of students received small-group reading intervention. This is twice the number recommended by experts (e.g., Vaughn & Fuchs, 2003), as represented in the famous triangle used to describe RTI, where a total of 20% are estimated to require intervention (approximately 15% receive preventative intervention in Tier 2 and 5% receive intensive support in Tier 3). Evidence indicates that the intervention seems to harm those students on the cusp (i.e., those students falling between the 35th and 40th percentile, on average, at their school). As such, there is good reason to recommend that schools select for intervention only those students with scores well below the benchmark, perhaps only those in the lowest quartile of their school. Although the proportions go down a bit in Grades 2 and 3, one cannot help but conclude that a great deal of resources are being wasted.


In summary, because of the nature of the design, the national RTI evaluation does not answer the question of whether these Tier 2 reading interventions work for all students who receive them. Rather, results suggest that RTI is not benefiting students whose scores are at or somewhat below the cut score. The results also do not suggest that reading interventions are ineffective in general. However, given the startling news reports that have oversimplified the results of the national evaluation (e.g., Loewenberg, 2015; Sparks, 2015), the field has been left with questions about whether the interventions used in RTI models are effective in improving student reading outcomes.

In the context of these issues, Gersten, Newman-Gonchar, Haymond, and Dimino (2017) reviewed all Tier 2 reading intervention research. We use the results from their research review to provide a context for interpreting the key findings from this evaluation and to examine the effectiveness of reading interventions in general.

Review of Research on Tier 2 Reading Interventions

Gersten et al.'s (2017) review aimed to clarify what can be inferred about the effectiveness of Tier 2 reading interventions based on high-quality scientific research (as specified by the What Works Clearinghouse). They conducted an exhaustive literature search using contemporary methods to locate all research on Grade 1-3 reading interventions published from January 2002 to June 2014.

The review team screened 1,813 studies to confirm that they included (a) a randomized controlled trial or a quasi-experimental design, (b) students in Grades 1-3 who scored below the 35th percentile on a normed standardized reading test or received scores on a valid screening test indicating the likelihood of performing below the 35th percentile on a reading measure at the end of the year without receiving further support, (c) a reading intervention that was longer than 8 hours, and (d) at least one relevant reading outcome. The search excluded whole-class interventions (Tier 1) as well as Tier 3 interventions.

Gersten et al. (2017) found 43 studies that met the screening criteria, and each study was analyzed by two certified reviewers using the What Works Clearinghouse standards (Version 3.0; U.S. Department of Education, Institute of Education Sciences, & What Works Clearinghouse, 2013). This is a large number of studies that meet the high bar for rigor established by the Institute of Education Sciences and a feat that the field should be proud of. Ultimately, 23 studies met the What Works Clearinghouse standards--an impressive number of studies for any area of educational research.

Reading Interventions Included in the Research Review

Interventions from the 23 studies, in general, covered multiple domains of reading. All but one of the 23 interventions focused on systematic work in decoding, and all addressed passage or sentence reading, with instruction in literal comprehension. A larger proportion of time was spent on vocabulary development, comprehension instruction, and encoding (spelling) in the Grade 2 and 3 interventions than in the Grade 1 interventions. Slightly fewer than half of the interventions included a writing component, sometimes as part of encoding-spelling instruction. However, the bulk of the time was spent on phonemic awareness, phonics, and word- and sentence-reading activities (especially in first grade). In some cases, vocabulary instruction seemed pro forma--that is, briefly defining words in the passage prior to reading.

The studies used one-to-one and small-group formats (with a slight prevalence of one-to-one) to provide Tier 2 reading interventions. In contrast, in the RTI evaluation, virtually all interventions were implemented in small-group settings.

Groups were also smaller in the efficacy intervention studies included in the research review (3.4 students for Grade 1, 2.8 for Grades 2 and 3) as compared with the RTI evaluation schools (5.3 students in Grade 1, 5.9 in Grade 2, and 6.4 in Grade 3). The larger group sizes may have been a factor in the minimal or nonexistent and even negative effects.

The majority of the interventions in the review were implemented 4 or 5 days per week, 30 to 45 min per day, for 12 to 25 weeks. This is somewhat similar to time spent in the national evaluation sample schools (40 min, 4 or 5 days a week). Thus, length of time spent in intervention is unlikely to be a factor leading to the nonsignificant or even negative effects in the national evaluation.

In virtually all the tightly controlled efficacy studies, interventionists (be they paraprofessionals, teachers, or volunteers) were frequently observed and provided with feedback and support related to fidelity of implementation. In all except two studies, interventionists received ongoing support in addition to initial training. In comparison, in the national evaluation sample, it appears unlikely that schools would have had the resources to provide the high level of ongoing training and support provided by the externally funded efficacy studies.

Key Findings From the Reading Intervention Research Review

When Tier 2 reading interventions are implemented in primary grades with fidelity and ample support for interventionists, they tend to result in improved outcomes for students at risk for or with reading difficulties (see Table 1 for the effect sizes), which aligns with the findings of a recent meta-analysis conducted by Wanzek and colleagues (2016) for reading interventions in Grades K-3. Effects were strongest and most consistent in the domain of word and pseudoword reading, typically representing a gain of 13 to 17 percentile points. So, a student at the 25th percentile at the onset of the intervention would now be at the 41st to 42nd percentile, whereas a student at the 16th percentile would be at the 29th to 30th percentile. Specifically, these interventions would help a student in the lowest quartile (i.e., at the high end of needing Tier 2 intervention) to end the year closer to grade level (i.e., the 41st-42nd percentile), whereas students at the lower end of Tier 2 eligibility (e.g., the 16th percentile) would end up close to the 29th to 30th percentile, appreciably improved but still requiring additional help.
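The percentile examples above follow from converting an effect size into a shift along a normal distribution. The sketch below shows that conversion under two stated assumptions: reading scores are normally distributed, and the effect size is 0.45 standard deviations. The value 0.45 is a hypothetical figure chosen only because it lands close to the percentiles quoted in the text; it is not taken from Table 1.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def shifted_percentile(start_percentile, effect_size):
    """Percentile reached after a gain of effect_size standard deviations.

    Assumes normally distributed scores; effect_size is in SD units.
    """
    z_start = STANDARD_NORMAL.inv_cdf(start_percentile / 100)
    return 100 * STANDARD_NORMAL.cdf(z_start + effect_size)


ASSUMED_EFFECT_SIZE = 0.45  # hypothetical value, not from the review

for start in (25, 16):
    end = shifted_percentile(start, ASSUMED_EFFECT_SIZE)
    print(f"{start}th percentile -> about the {end:.0f}th percentile")
```

Under these assumptions, a student starting at the 25th percentile ends near the 41st and a student starting at the 16th ends near the 29th. The same effect size produces a different percentile gain depending on where the student starts, which is one reason a range such as 13 to 17 percentile points is reported rather than a single number.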

Effects in reading comprehension were also moderately strong, indicating that, especially for Grade 1 students, these interventions improve their comprehension by, on average, 13 percentile points (e.g., a student at the 25th percentile might rise to the 38th and a student at the 16th might rise to the 29th), although, in our view, many of the Grade 1 comprehension measures focus heavily on literal comprehension and simple inferences.

Few of the reviewed studies included passage fluency or vocabulary measures. When fluency was included, the effects were smaller at Grade 1 than at Grades 2 and 3. Vocabulary was not assessed at all in Grade 1 and assessed in only two studies of Grade 2 and 3 interventions, with no significant results. Given that vocabulary was not a major focus of the reading interventions, it was not surprising to see no effects in vocabulary outcomes.

Data from the 23 studies also show that reading interventions tend to be effective when delivered one-on-one or in small-group settings. All of the individually administered interventions and all but one of the small-group interventions resulted in effects in at least one domain of reading. However, there was a tendency for the one-on-one interventions to produce more positive effects than the small-group interventions and to produce significant effects more frequently. Yet, in our view, the difference is not strong enough to advocate for one-on-one instruction in place of small-group instruction.

Implications for Future Research and Practice

Again, it is helpful to remember that the RTI national evaluation does not answer the question of whether early intervention in reading works for all students who receive reading intervention. Rather, results suggest that RTI is not benefiting students whose scores are somewhat below the cut score used in their school (i.e., on average, students whose scores are at or somewhat below the 40th percentile).

We do not know why students on the cusp of current screening practices did not benefit from intervention, but we can speculate on what the reasons might be: larger intervention group sizes, limited training and ongoing support for interventionists, some intervention students missing core instruction, and some students above the cut score receiving reading intervention. In addition, we highlight other possible reasons. It is likely that current screening batteries and procedures are identifying too many false positives--that is, too many students who do not need intervention. This evaluation indicates that students who do not need an intervention in reading are receiving one and that the services they receive are either harming them or not helping them enough to make a difference. As a society, we may have gone overboard in trying to ensure that "no child is left behind" and that all students learn to read by the end of Grade 3.

It may be that students who score only slightly below grade-level expectations may do fine with no intervention at all, rather than the "one size fits all" standard protocol approach used by many schools in this evaluation (Balu et al., 2015). Systematic review and practice may be time wasted, tedious, and counterproductive for students on the cusp. Those likely to fall somewhat below grade level by the end of the year may need a faster paced type of support, perhaps a bit of practice and review that supplements the core reading series.

Most of the specific intervention curricula used in the RTI evaluation appear to be programs based on principles from rigorous research, as articulated, for example, in the National Reading Panel report (National Institute of Child Health and Human Development, 2000). However, few were actually subjected to rigorous clinical trials. Although curricula are, ideally, chosen because they are evidence based (e.g., Vaughn & Fuchs, 2003), most of the programs do not have rigorous evidence to support their use, as seen from a cursory look at the What Works Clearinghouse database. Instead, they may have used research-based principles to serve as a basis for curriculum development. This is, indeed, a loose criterion to use, yet one espoused during the Reading First years and, many argue, a sensible way to select curricula, given the paucity of curricula subjected to rigorous evaluations.

Finally, it is unclear whether--or to what extent--the intervention curricula were aligned with the reading instruction going on in the students' classrooms (Balu et al., 2015). The intervention program and the core curriculum may have addressed very different aspects of reading during the same week. Students may have received no assistance in transferring what they learned during intervention to what they were covering in the classroom. Serious observational work has to be undertaken to determine how the intervention and Tier 1 core reading instruction are aligned and what areas of reading instruction are neglected.

Rather than continue to speculate on causes for the lack of effects in the national evaluation, what can be done to move forward as a field? Clearly, there need to be more field evaluations of RTI. All evidence of effectiveness of interventions is from highly controlled research studies. Small field studies examining the effectiveness of RTI practices will help answer some of the questions left unanswered by the current national evaluation.

We call on districts or states (perhaps in collaboration with researchers) to take a serious look at current RTI practice and evaluate its impact. Districts or states should use random assignment of students to either intervention or no-intervention conditions for a year or even a semester and assess the impact of current RTI practice on student outcomes. This could be as simple as varying group size from, for example, 6 to 3 or randomly assigning half the interventionists to a treatment condition where they receive ongoing support and training and the other half to a business-as-usual condition. They should document what happens in the business-as-usual condition, so they can clearly delineate the differences in the instruction that each group received and attribute student gains to the right instructional differences. Most important, however, they should publish their findings to help the field begin to fill in answers to whether RTI practices are effective in helping prevent reading failure.

Although fidelity of implementation was not assessed in the national evaluation, the individual researchers conducted tightly controlled intervention studies. They monitored implementation on a weekly (if not more frequent) basis and provided interventionists with feedback on their practice. Such monitoring and support may be the solution for improving reading outcomes for primary grade students. Schools need to spend more time monitoring fidelity of implementation and providing additional training or support to those providing reading interventions. We realize that schools may not have the resources to do this. However, we feel that only with high fidelity of implementation will RTI work. Such sentiments are being echoed by other researchers in the field (e.g., VanDerHeyden et al., 2016).

Most reading interventions examined in the synthesis, in general, covered multiple areas of reading. Although all interventions focused on systematic work in decoding, the same cannot be said for the coverage of other areas of reading. Reading interventions in primary grades should focus on including quality comprehension and vocabulary instruction.

We conclude this commentary with another reminder that the findings of the national RTI evaluation do not indicate that early intervention in reading is not working for all students receiving reading interventions or that reading interventions in general are not effective. Additional studies are needed to make that determination.


Balu, R., Zhu, P., Doolittle, F., Schiller, E., Jenkins, J., & Gersten, R. (2015). Evaluation of response to intervention practices for elementary school reading (NCEE 2016-4000). Washington, DC: U.S. Department of Education, Institute of Education Sciences. Retrieved from

Friend, M., Reising, M., & Cook, L. (1993). Co-teaching: An overview of the past, a glimpse at the present, and considerations for the future. Preventing School Failure: Alternative Education for Children and Youth, 37(4), 6-10. doi:10.1080/1045988X.1993.9944611

Fuchs, D., Mock, D., Morgan, P. L., & Young, C. L. (2003). Responsiveness-to-intervention: Definitions, evidence, and implications for the learning disabilities construct. Learning Disabilities Research & Practice, 18, 157-171. doi:10.1111/1540-5826.00072

Gersten, R., Compton, D., Connor, C. M., Dimino, J., Santoro, L., Linan-Thompson, S., & Tilly, W. D. (2009). Assisting students struggling with reading: Response to intervention and multi-tier intervention for reading in the primary grades. IES practice guide (NCEE 2009-4045). Washington, DC: National Center for Education Evaluation and Regional Assistance. Retrieved from

Gersten, R., & Dimino, J. (2006). RTI (response to intervention): Rethinking special education for students with reading difficulties. Reading Research Quarterly, 41, 99-108. doi:10.1598/RRQ.41.1.5

Gersten, R., Newman-Gonchar, R. A., Haymond, K. S., & Dimino, J. (2017). What is the evidence base to support reading interventions for improving student outcomes in Grades 1-3? Washington, DC: Regional Educational Laboratory Southeast.

Hammill, D., & Wiederholt, J. L. (1972). The resource room: Rationale and implementation. Philadelphia, PA: Buttonwood Farms.

Imbens, G., & Kalyanaraman, K. (2012). Optimal bandwidth choice for the regression discontinuity estimator. The Review of Economic Studies, 79, 933-959. doi:10.1093/restud/rdr043

Imbens, G. W., & Lemieux, T. (2008). Regression discontinuity designs: A guide to practice. Journal of Econometrics, 142, 615-635. doi:10.1016/j.jeconom.2007.05.001

Individuals with Disabilities Education Act, 20 U.S.C. §§ 1400 et seq. (2006 & Supp. V 2011)

Knight, M. (1981). IMPACT: Interactive model for professional action and change for teachers. Journal of Staff Development, 2, 103-113.

Loewenberg, A. (2015, December). New study raises questions about RTI implementation. New America. Retrieved from

National Institute of Child Health and Human Development. (2000). Teaching children to read: Reports of the subgroups (Publication No. 00-4754). Washington, DC: National Institutes of Health.

No Child Left Behind Act of 2001, 20 U.S.C. §§ 6301 et seq. (2006 & Supp. V 2011)

Nomi, T., & Raudenbush, S. W. (2012). Understanding treatment effects heterogeneities using multi-site regression discontinuity designs: Example from a "Double-Dose" algebra study in Chicago. Evanston, IL: Society for Research on Educational Effectiveness.

Reardon, S. F., & Robinson, J. P. (2012). Regression discontinuity designs with multiple rating-score variables. Journal of Research on Educational Effectiveness, 5, 83-104. doi:10.1080/19345747.2011.609583

Shinn, M. R., & Brown, R. (2016). Much ado about little: The dangers of disseminating the RTI outcome study without careful analysis. Retrieved from

Sparks, S. D. (2015). Study: RTI practice falls short of promise. Education Week, 35(12), 1, 12. Retrieved from

Tang, Y., Cook, T., Kisbu-Sakarya, Y., Hock, H., & Chiang, H. (2016). The comparative regression discontinuity (CRD) design: An overview and demonstration of its performance relative to basic RD and the randomized experiment. Advances in Econometrics, 38.

Torgesen, J., Wagner, R., & Rashotte, C. (2012). Test of Word Reading Efficiency, Second Edition (TOWRE-2). San Antonio, TX: Pearson Assessment.

Tourangeau, K., Nord, C., Le, T., Sorongon, A. G., & Najarian, M. (2009). Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K), Combined User's Manual for the ECLS-K Eighth-Grade and K-8 Full Sample Data Files and Electronic Codebooks (NCES 2009-004). Washington, DC: National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Retrieved from

U.S. Department of Education, Institute of Education Sciences, & What Works Clearinghouse. (2013). What Works Clearinghouse: Procedures and standards handbook (Version 3.0). Retrieved from

VanDerHeyden, A., Burns, M., Brown, R., Shinn, M. R., Kukic, S., Gibbons, K., ... Tilly, W. D. (2016). Four steps to implement RTI correctly. Education Week, 35(15), 25. Retrieved from

Vaughn, S., & Fuchs, L. S. (2003). Redefining learning disabilities as inadequate response to instruction: The promise and potential problems. Learning Disabilities Research & Practice, 18, 137-146. doi:10.1111/1540-5826.00070

Wang, M. C., & Reynolds, M. C. (1996). Progressive inclusion: Meeting new challenges in special education. Theory Into Practice, 35(1), 20-25. doi:10.1080/00405849609543697

Wanzek, J., Vaughn, S., Scammacca, N., Gatlin, B., Walker, M. A., & Capin, P. (2016). Meta-analyses of the effects of Tier 2 type reading interventions in Grades K-3. Educational Psychology Review, 28, 551-576. doi:10.1007/s10648-015-9321-7

Zhu, P., Jacob, R., Bloom, H., & Xu, Z. (2011). Designing and analyzing studies that randomize schools to estimate intervention effects on student academic outcomes without classroom-level information. Educational Evaluation and Policy Analysis, 34(1), 45-68. doi:10.3102/0162373711423786

Authors' Note

The senior author, Russell Gersten, played a significant role in the national response to intervention evaluation, though hardly the leading role; perspectives shared here are solely our own.

We thank Rebecca Newman-Gonchar and Kelly Haymond for their assistance with the manuscript.

Manuscript received November 2016; accepted January 2017.

Russell Gersten (1), Madhavi Jayanthi (1), and Joseph Dimino (1)

(1) Instructional Research Group

Corresponding Author:

Russell Gersten, Instructional Research Group, 4281 Katella Ave, Suite 205, Los Alamitos, CA 90720.

Table 1. Impacts From the 23 Studies of Reading Interventions in Grades 1-3.

                                                               Expected scores for intervention students
                                                               who normally would perform at ...
Intervention: Area of reading    Studies, n    Weighted mean    16th percentile    25th percentile
                                               effect size
Grade 1                             15
  Word and pseudoword reading       14         0.452 (**)           29.5               41.3
  Reading comprehension              8         0.386 (**)           27.2               38.7
  Passage fluency                    4         0.226 (*)            22.2               32.8
Grades 2 and 3                       8
  Word and pseudoword reading        8         0.456 (**)           29.7               41.5
  Reading comprehension              6         0.327 (**)           25.3               36.5
  Passage fluency                    7         0.374 (**)           26.8               38.2
  Vocabulary                         2         0.176 (ns)           20.8               31.1

Note. Re-analysis of data from Gersten et al. (2017).
ns = not statistically significant.

(*) p < .05.  (**) p < .001.
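
The source does not state how the expected scores in Table 1 were derived. As a rough illustration only, the Python sketch below assumes they are percentile ranks obtained by shifting a student's baseline position on a standard normal distribution by the weighted mean effect size. The function name `expected_percentile` and the normality assumption are introduced for this illustration and do not come from the report. Under that assumption, the computed values fall within a few tenths of a percentile point of the tabled ones.

```python
from statistics import NormalDist


def expected_percentile(baseline_percentile: float, effect_size: float) -> float:
    """Estimate the percentile rank after an intervention.

    Assumption (not from the source): scores are normally distributed and
    the effect size is expressed in standard deviation units.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Convert the baseline percentile rank to a z-score.
    z_baseline = std_normal.inv_cdf(baseline_percentile / 100)
    # Shift by the effect size and convert back to a percentile rank.
    return 100 * std_normal.cdf(z_baseline + effect_size)


if __name__ == "__main__":
    # Effect sizes copied from Table 1 (Grade 1 rows).
    grade1_effects = {
        "Word and pseudoword reading": 0.452,
        "Reading comprehension": 0.386,
        "Passage fluency": 0.226,
    }
    for area, g in grade1_effects.items():
        p16 = expected_percentile(16, g)
        p25 = expected_percentile(25, g)
        print(f"{area}: {p16:.1f} (from 16th), {p25:.1f} (from 25th)")
```

Small differences from the table are expected, for example if the original calculation used unrounded effect sizes.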
COPYRIGHT 2017 Sage Publications, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2017 Gale, Cengage Learning. All rights reserved.

Article Details
Author: Gersten, Russell; Jayanthi, Madhavi; Dimino, Joseph
Publication: Exceptional Children
Geographic Code: 1USA
Date: Apr 1, 2017