African-American students struggle on Ohio's high-stakes test.
Advantages and Disadvantages of High-Stakes Testing
High-stakes testing programs have both advantages and disadvantages, and they appear to affect the curriculum. They ultimately affect what and how students study and learn and what they come to value in the educational process. Heyneman (1987) noted important advantages and disadvantages that have been attributed to high-stakes tests:
* They are relatively objective and a means of distributing educational benefits.
* Preparation for high-stakes tests often overemphasizes rote memorization and cramming by students and drill-and-practice as a teaching method.
Lieberman (1991) posited that developing a system of accountability could be an impetus for raising standards, reforming schools, and rethinking American education. What a challenge for students if these tests could be aligned with what is taught, what is learned, and what is needed. In a study of 19 teachers, Smith and Rottenberg (1991) found that testing reduced the time available for ordinary instruction.
Schools were also neglecting material not on the tests, while encouraging the use of instructional methods resembling testing, such as multiple-choice exams. Teachers also changed the content and sequence of instruction throughout the year to accommodate the high-stakes test. Should the curriculum change to improve standardized test scores? Herman and Golan (1990) also found that teachers spent an inordinate amount of time preparing for tests. Similar results were reported by Shepard and Dougherty (1991), who found that teachers were spending class time on worksheets covering test content and format. They also found that teachers gave greater emphasis to basic skills instruction and that non-tested content suffered because of the focus on standardized tests. In general, high-stakes testing affects both the content and the sequence of instruction by encouraging a "teach to the test" approach in the classroom (Orfield & Wald, 2000).
* The use of examinations for the dual purpose of certifying the completion of a secondary education and for university admission puts those not bound for college at a disadvantage.
* Results for individual students are often used to serve a variety of purposes (e.g., tracking, promotion, and graduation) for which they may not be designed.
McLaughlin (1991) identified the following possible negative fallout of high-stakes testing schemes:
* discouraging classroom innovation, risk-taking, and invention;
* forcing out of the curriculum the very kinds of learning (higher-order thinking and problem solving) that educational theorists and others say are most important to "increase national competitiveness" and success in the world marketplace (pp. 250-251).
Natriello and Pallas (1999) examined racial and ethnic disparities in student performance on the Texas Assessment of Academic Skills (TAAS) between 1996 and 1998. The authors concluded, "These tests are, and will remain for some time, an impediment for the graduation prospects of African-American and Hispanic youth." In another study, Heubert (1998) pointed out that students of color are almost always overrepresented among those who are denied diplomas on the basis of test scores. Historically, high-stakes testing in the United States has been used to diagnose and classify students, and to assign them to educational treatments (Madaus, 1991). Heubert and Hauser (1999) noted that proponents and opponents of graduation testing agree there exists relatively little research that addresses the consequences of such testing. Although the discussions continue, increasing numbers of students continue to suffer in the high-stakes movement. For example, Johnston (1997) found that some Michigan parents in an affluent school district refused to allow their children to take a high-stakes standardized test when they perceived that it might be harmful to their children in terms of future college admissions. Unfortunately, urban parents do not exercise this option, perhaps because they are misinformed or ill-informed about the severity and consequences of the tests.
Research on high-stakes testing has included perspectives from countless vantage points (Heubert & Hauser, 1999; Kohn, 2001; Natriello & Pallas, 1999). Teachers, school administrators, university researchers, policymakers, and politicians have all contributed to the myriad of "what students need," "best practices," or "effective solutions" in conversations on high-stakes tests (Hilliard, 2000; Kohn, 2000; Madaus, 1998).
Among its purported advantages, testing provides a strategy to "hold schools accountable" by using test scores to trigger rewards, sanctions, or remedial actions (Darling-Hammond, 1991; Lieberman, 1991). Darling-Hammond (1991) has suggested that the negative effects of high-stakes tests stem partly from the nature of the tests and partly from the way in which the tests have been used for educational decision-making. For example, Randall (2001) reported that results on state tests are also used to determine when schools are in need of improvement and qualify for special funding for their disadvantaged students.
Ohio's High-Stakes Test
A review of the legislative objectives relating to the contemporary statewide testing trend reveals a plethora of hopes and expectations that diverge somewhat across states but have many elements in common. Legislation in the state of Ohio is one such example, which can help to alleviate the problems already posed by testing. Ohio's Statewide Proficiency Test went into effect at the beginning of the 1990-91 academic year. By an action of the State Board of Education, all students who entered the ninth grade prior to the 1990-91 school year, even those who dropped out of school and reentered after September 1990, must take the Ohio Proficiency Examination. The Ohio Proficiency Examination, commonly called the Ninth Grade Proficiency Test, is a criterion-referenced test designed to assess competence at a designated minimum level in citizenship, mathematics, reading, science, and writing. In order to receive a high school diploma, students must pass all five areas of the examination in addition to fulfilling the regular graduation requirements. The test is initially administered during the fall of a student's ninth-grade year. Students who fail the first attempt are retested in the spring and, if they continue to fail, are retested in the fall and spring of their tenth-, eleventh-, and twelfth-grade years. Thus, each student receives eight opportunities to pass the test prior to the time of his or her graduation. In the next section, three students and their preparation experiences for a high-stakes mathematics test will be discussed.
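The count of eight opportunities follows directly from the schedule described above: one fall and one spring administration in each of grades nine through twelve. The following is only a minimal sketch of that arithmetic and of the all-five-areas rule; the names `list_opportunities`, `remaining_areas`, and `meets_test_requirement` are hypothetical and do not come from the Ohio policy itself.

```python
# Illustrative sketch only; all names here are hypothetical.
GRADES = (9, 10, 11, 12)
SEASONS = ("fall", "spring")
REQUIRED_AREAS = frozenset(
    {"citizenship", "mathematics", "reading", "science", "writing"}
)


def list_opportunities():
    """List every (grade, season) administration in chronological order."""
    return [(grade, season) for grade in GRADES for season in SEASONS]


def remaining_areas(passed_areas):
    """Return the required areas not yet passed, sorted by name."""
    return sorted(REQUIRED_AREAS - set(passed_areas))


def meets_test_requirement(passed_areas):
    """True only when all five areas are passed.

    Regular graduation requirements are outside this sketch.
    """
    return not remaining_areas(passed_areas)


schedule = list_opportunities()
print(len(schedule))  # 8
print(schedule[0])    # (9, 'fall')
print(remaining_areas({"citizenship", "reading", "science", "writing"}))
# ['mathematics']
```

A student who has passed every area except mathematics, as in the cases discussed in this article, still fails the test requirement under this rule.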
The Three Fallen Angels
Below are descriptions of three African-American tenth graders from a large urban high school in the Midwest. The students are Wanda, Art, and Boo (not their real names). These students did not pass mathematics, but passed the other parts of the test, which were citizenship, reading, science, and writing. All were active in extracurricular activities such as music, theater, and school sports. They also had stated goals for their future. Wanda wants to be a law enforcement officer, Art wants to be a pharmacist, and Boo wants to be an architectural engineer. They have done everything that school administrators, teachers, and parents required, but their mathematics test performance does not reflect this.
Wanda is a 15-year-old African-American female who is growing up in a family in which a college education is not valued; neither of her parents graduated from high school. Her hobbies consist of reading African-American literature, singing, dancing, playing the saxophone, and talking on the telephone. Wanda enjoys listening to jazz, rhythm and blues, and rap. She admits reading for school purposes has been more or less a turn-off because as a child she was forced to read and memorize everything.
Art is a 16-year-old African-American male who wishes to become a pharmacist. His hobbies are drawing, riding his bicycle, weightlifting, and repairing old cars given to him by his father. The collection so far consists of a 1987 Escort G.T. and a 1986 Buick Regal. He describes himself as not being spoiled, but doing productive things. Art's preference is the 1986 Buick Regal because it feels more luxurious and roomier.
Boo is a 16-year-old African-American female. She lives with her mother and four siblings (2 boys, 2 girls). Her hobbies include drawing, singing, dancing, and acting. She is a member of the cheerleading squad, captain of the Army ROTC drill team, and a member of her school's softball team.
Qualitative methodologies (Bogdan & Biklen, 1992; Miles & Huberman, 1994) were used in this study. Qualitative approaches involve strategies that allow researchers to "consider experiences from the informants' perspectives" (Bogdan & Biklen, 1992, p. 32). The strategies used in this study included participant observations and interviews. These two data collection techniques were used for "triangulation," that is, using data collected in one way to "cross-check" the accuracy of data gathered in another way (LeCompte & Preissle, 1993).
Setting and Participants
The study took place in an urban high school in central Ohio with an enrollment of 914 students (90% African-American). The participants were three African-American students (2 girls and 1 boy) selected from a cumulative list. I requested a list of all the 9th, 10th, 11th, and 12th graders who had failed part or all of the test. After looking at each grade level, the ninth graders who would be tenth graders the following school year appeared to show the most promise. Because the upcoming tenth graders were at a point in the testing process where substantial improvement on the test could take place without penalizing the students unfairly, I decided that this would be the population for the study. After reviewing the number of ninth graders who did not pass any parts of the test, those who had passed three parts of the test, and those who had failed only two parts of the test (e.g., reading/mathematics or writing/mathematics), I decided to focus on those students who had failed only the mathematics portion. On the compiled list of ninth graders who had failed only the mathematics portion, eighteen of the students were African-American. The parents of each of the eighteen prospective participants were sent an explanation of the study and permission slips to be signed and returned. Of the eighteen permission slips sent out, only six were returned. Three of those six students were selected for the study. These students were also in the same mathematics class.
Data Collection and Analysis
Each student was interviewed six times, with each interview lasting approximately one hour. Six interviews were conducted because six interview protocols had been developed and selected for use with the participants. All interviews were taped and took place on the school premises. The interviews involved open-ended questions regarding the students' preparation for a high-stakes mathematics test. The interviews usually began with the following question: "What mathematics classes stand out in your memory?" The purpose of this first question was to get the students talking about their mathematical experiences. As the interviews progressed, the students were eventually asked questions such as: "How did you prepare for the test?" or "Would you provide examples of the work you did in preparing for the test?" or "What type of assistance did you receive from the teacher?"
All students in the study were observed three times weekly for sixteen weeks. My routine was to enter the classroom, sit down, and take attendance, noting those students who frequently arrived after class discussion had started or the day's assignments had been given. The observations included taking notes on the students' behavior, attitudes, and work habits, and recording whether the teacher went over homework, the topics that were covered, and whether students worked individually or in groups. I also recorded the exercises the teacher wrote on the overhead projector for the students (which included a plethora of worksheets) and the jokes he told about great mathematicians that he had discovered during his formative years. The students' comments, questions, answers, vocabulary, and the content of their conversations (sometimes not pertaining to test preparation) were written in the field notes. Field notes from participant observation of classrooms, hallways, cafeteria, school grounds, and extracurricular activities were used to build a rapport with the participants.
Data analysis involved standard methodology in naturalistic inquiry (Bogdan & Biklen, 1992; Miles & Huberman, 1994). After the interview tapes were transcribed, the transcripts and field notes were read several times to identify regularities and patterns, which were used to develop coding categories (Bogdan & Biklen, 1992). Further analysis involved "member checking" and "triangulation" before the final categories were decided (LeCompte & Preissle, 1993; Lincoln & Guba, 1985).
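The scale of the data collection can be tallied from the figures given above. The sketch below is only illustrative arithmetic: the constant and function names are hypothetical, interview length is treated as exactly one hour although the text says "approximately," and the observation sessions are counted once for the whole group on the assumption that the three participants were observed together in their shared mathematics class.

```python
# Illustrative tally of the fieldwork described in the text.
# All names are hypothetical; durations are approximate.
STUDENTS = 3
INTERVIEWS_PER_STUDENT = 6
HOURS_PER_INTERVIEW = 1.0      # "approximately one hour"
OBSERVATIONS_PER_WEEK = 3
WEEKS_OBSERVED = 16


def fieldwork_totals():
    """Return interview and observation counts implied by the design."""
    interviews = STUDENTS * INTERVIEWS_PER_STUDENT
    return {
        "interviews": interviews,
        "interview_hours": interviews * HOURS_PER_INTERVIEW,
        # Assumes one shared session covers all three participants.
        "observation_sessions": OBSERVATIONS_PER_WEEK * WEEKS_OBSERVED,
    }


totals = fieldwork_totals()
print(totals["interviews"])            # 18
print(totals["interview_hours"])       # 18.0
print(totals["observation_sessions"])  # 48
```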
A Typical Mathematics Class Experience
Their teacher, Mr. H, is a European-American man in his mid-40s, who appeared to have had some minimal success in teaching. He began teaching high school mathematics in the late 60s, and for more than 20 years he has taught in the same way. During the study, he was teaching tenth-grade geometry, and although he was not regarded as a strict teacher, the students respected him for being caring yet not too demanding. As an example, the following episode occurred during a lesson about the hypsometer, a topic many of the students in the mathematics class found boring and difficult. As the mathematics teacher began the lesson, several students in the class immediately laid their heads on their desks. The lesson did not appear to provoke much engagement among the students. The following incident illustrates Mr. H's difficulty with transferring his passion for mathematics to his students.
Teacher: Today we are talking about the hypsometer. This will not be an easy activity, but to begin the topic, we will use some simple materials.
Boo: Can I pass out the materials?
Teacher: Yes, make sure each student gets a piece of manila folder, a pair of scissors, a washer, a piece of string and a straw. You will have to share the tape.
Art: Why do we have to do this? This is boring.
Teacher: Why do you think?
Art: I do not know.
Many students did not demonstrate a real interest and commitment to the mathematical tasks, did not exhibit on-task behavior, and displayed negative attitudes about the mathematics involved.
Mr. H often asked, "How do you know?", I believe to push students' thinking. When students asked questions, Mr. H was quick to say, "Who knows? Who can help him out here?" Mr. H often referred students to the textbook when he was obviously frustrated about students' progress, especially those with poor attendance. However, Mr. H did not shrink from his own responsibility as teacher. From time to time he worked individually with students who seemed puzzled or confused about the discussion. The busy hum of activity of a few students in Mr. H's classroom was directed toward mathematics while others remained bored or confused.
Student responses were categorized into two broad themes: test preparation (Kohn, 2000; Barton, 1999; Popham, 2001) and suggestions for improvement (Kellaghan, Madaus, & Raczek, 1996; Steele, 1997). Both themes dealt with the students' perceptions of high-stakes tests.
According to these students, teachers missed opportunities to attend to their self-esteem while preparing them for the test. The ways in which students were prepared for high-stakes testing are captured in the remarks of one of the three research participants:
Wanda: The teacher should give the practice materials earlier and stop waiting until the last minute to give the practice testing materials.
Another ineffective element of the preparation process was the students' own preparation strategy, which is an indicator of their perceptions of the value of the test. For example, another one of the research participants stated:
Art: I had begun to study weeks before the test rather than two weeks before the test or the night before the test, relieving the pressure of learning everything at once. I reviewed my math notes for a half-hour a day to help me prepare for the proficiency test.
From a very passionate discussion with these students, the need to implement alternate strategies or preparation techniques was voiced because of the ineffectiveness of the methods already in place in the school. For example, one of the participants said:
Art: If you try to cram everything in at the last minute, then you are going to forget something. It is a rush, and it is a rush on your mind.
The problems with these preparations are captured in the remarks of two participants:
Art explains: I admit the test is important because if I do not pass it, it will hurt me eventually because I want to have College Preparatory written on my diploma and if I do not pass the test, this will not happen.
Interviewer: How do you feel about not having passed the test so far?
Boo: I do not like it, but I cannot help it.
Interviewer: What keeps you motivated to keep taking the test?
Boo: I want to pass it so I can go to college.
The reality reflected in these statements suggests that these African-American students, who are in the best place to assess the impact of these tests, feel downtrodden at best. By all accounts, the testing process for these African-American high school students in a high-stakes test district is a complicated one that does not take into consideration the influence of their preparation process (Heubert & Hauser, 1999; Natriello & Pallas, 1999; New York Urban League, Inc. v. New York, 1995; Willingham & Cole, 1997). Nor does it consider how students who have failed feel when having to prepare again and again.
Student Suggestions for Improvement
Many of the students' responses illustrated a surprising lack of bitterness toward the test and mathematics. According to the students, the test provided an impetus to try harder in school. Students had positive attitudes about mathematics when problem solving in groups and working with manipulatives. The types of problems they engaged in had an observable impact on the participants' attitudes and reactions during the mathematics class. In addition, a good mathematics teacher is concerned about students on a personal level or, as another participant indicated, "willing to help interested students learn the subject of mathematics, has patience, and willing to find other possible ways to make students understand mathematics." Friendship appeared to be a very important characteristic among the teenagers. In the words of one of the participants, "almost every student has something negative to say, and almost never do they compliment you." One participant defined a good mathematics student as one "who has a unique way to learn mathematics, and wants to learn mathematics badly." According to another participant, a good mathematics student is a student "who gets mathematics easily and has an I-know-it attitude." When performing drill and practice, rote memorization, and out-of-context computation, students were most often bored with learning and felt low levels of engagement. As one of the participants explained, "the mathematics is like going through a maze and I got to get out of that maze just as I got to pass the test." To them the test seemed to be another obstacle, and since their schooling has been replete with obstacles and barriers, they see it as another barrier to cross. In some absolute sense, the test was hard for them; they failed, but they vacillated as to the difficulty of the test. The participants consistently commented that the content and the test changed each time it was taken.
The preparation of the students for the mathematics test by their teachers and the school, in particular, could serve as one reason for their failure.
The results of the study suggest that students experience some of the same "turn offs" to mathematics as the teachers who teach them. Teachers apparently convey the message that mathematics is difficult. It is detrimental to hear a considerable number of teachers (pre-service and in-service) say that they "are not good in math" while they are teaching the subject, passing this negative message on to the students, the people who most need a positive attitude.
Case studies such as these are useful, although they may not be generalizable. These case studies address an important phenomenon, the mathematical classroom experiences of students in preparation for a high-stakes test. The implication of the findings obtained here led to a consideration of the critical elements that determine success in urban mathematics classrooms. Is it appropriate motivation, attitude, or predisposition, or is it what happens to students once they begin to prepare for the test that leads to success? It is, most likely, some of both. It was concluded, however, from the evidence provided by Wanda, Art, and Boo's cases, that the assumptions about the students held by researchers and practitioners, and students' actions in preparation for the tests, can be a powerful indicator as to the student's success in the mathematics classroom. Certainly, many factors contributed to the inadequate preparation of the students, but one aspect of these complex and multidimensional processes of teaching and learning must be held accountable, and that is the pedagogy provided by the teacher. The kind of teaching implemented in this classroom is what Haberman (1991) calls "the pedagogy of poverty." The pedagogy of poverty includes such routine teaching acts as "giving information, asking questions, giving directions, making assignments, giving tests, assigning homework, reviewing homework, settling disputes, punishing noncompliance, marking papers, and giving grades" (p. 290). Haberman points out that taken separately these acts might seem "normal"; however, "taken together and performed to the systematic exclusion of other acts, they have become the pedagogical coin of the realm in urban schools" (p. 291). 
Lemlech's (1994) account is typical: In classrooms where students are given little opportunity to choose what they will learn, how they will learn, and the way in which they will be evaluated for learning, there is a greater likelihood that the learning will become joyless. There is also a tendency in these classrooms to overemphasize repetition, drill, and commercially produced dittos for practice materials. Some believe this to be prevalent in more socioeconomic and low achieving classrooms, and as a consequence it may be the cause of negative motivation patterns (p. 91).
Continuing to promote this kind of pedagogy is what I call the "pedagogy of mediocrity." Under this pedagogy, if mathematics is covered, it is covered in a very dry and watered-down way that prepares students ineffectively for high-stakes tests and prevents them from meeting the test requirements. The pedagogy restricts teaching to practices that focus on computations and procedures, which are myopic and prevent the development of the essential conceptual understanding needed to navigate high-stakes tests. This pedagogy of mediocrity played a critical role in preparing the students previously mentioned. An anomaly becomes apparent (Stone, 1994): ... schools [are encouraged] to spare neither effort nor resources in fitting instruction to students while expecting little from them in return. Student inattention and apathy are met with Herculean efforts to stimulate interest and enthusiasm. Deficient outcomes are countered by reducing expectations to levels of whatever the student seems willing to do. Even the practice of [motivating students by] affording ... accurate feedback about accomplishments is deemed questionable because of purported detrimental effects on intrinsic motivation and self esteem.... recurrent failure to attain even minimal achievement is accepted as lamentable but unavoidable and treated accordingly. In short, development requires only the teacher to work, not the student. (p. 62)
In summary, although these students perceived the test as a barrier and realized it was an impediment, they remained hopeful. It is particularly interesting that, according to these students, these imposed pressures at best tended to improve their sense of self and to instill a committed passion for learning and passing the test.
Implications for Research in Mathematics Education
These students' experiences create a space to narrate their stories so as to act on them. Educators can help improve mathematics performance on high-stakes tests by developing appropriate and effective teaching strategies. Students are not passive recipients of teacher instruction but are active interpreters of the classroom environment (Weinstein, 1983). Thus, success in mathematics for African-American students may need to be deeply embedded in a variety of social contexts (Ladson-Billings, 1997; Tate, 1994). Besides changing the names of story problem characters, teachers will also need to understand the deep structures of students' experiences. This may mean doing some things with students that have not been done in the traditional mathematics classroom. For example, it is suggested that interviewing students, having them write autobiographies, and discussing their interests in mathematics can assist students in learning and understanding mathematics (Ladson-Billings, 1997). Teachers provide the experiences that exert powerful influences on students' attitudes about mathematics. However, to learn mathematics, students must want to learn and feel good about learning (Kenney & Silver, 1997; Mullis, Martin, Beaton, Gonzales, Kelly & Smith, 1997; NCTM, 2000). Educators must be aware of situations that can cause low engagement, and work with students in ways that increase engagement levels by providing mathematics curricula and pedagogy that take full advantage of the "adaptive," "resilient," "complex" nature of learners in urban mathematics classrooms (Ladson-Billings, 1997, p. 706).
This discussion was not an effort to rationalize the poor performance of African-American students on high-stakes tests; there exists an abundance of literature that documents the mathematics failure of African-American students. Rather, the focus is to address important phenomena for understanding high-stakes testing from the students' perspective. The implications of the findings obtained here lead to a consideration for the implementation of the critical elements that determine the success of high-stakes tests in urban settings. These interviews provide evidence that the assumptions made about these tests can have a powerful influence on the learner's success. What can be seen is that teachers could teach better if they knew their students better. These, however, are only a few of the factors that contribute to successful teaching within the urban classroom.
Unfortunately, the three individuals in this study were not provided adequate instruction to successfully pass the test. These matters cannot be left to the vagaries of chance. Teachers need to eliminate the variety of untested and unproven practices in order to provide the needed assistance to those students who consistently fail high-stakes mathematics tests. Many mathematics teachers in the urban classroom are proponents of this system. But students today are faced with different challenges than their teachers faced, such as high-stakes tests. Therefore, teachers must provide a different way of delivering the same mathematics. The bottom line: what is happening in some urban mathematics classrooms, and with students preparing for high-stakes tests, just does not add up.
References
American Federation of Teachers (2001). Making Standards Matter 2001. Washington, DC: American Federation of Teachers.
Barton, P. E. (1999, Summer). "Tests, tests, tests," American Educator 23(2): 18-23.
Bogdan, R. C., and Biklen, S. K. (1992). Qualitative research for education. Boston, MA: Allyn and Bacon.
Carnine, D. (1995, May 3). "Is innovation always good?" Education Week: 40.
Darling-Hammond, L. (1991, November). "The implication of testing policy for quality and equality," Phi Delta Kappan 73: 220-225.
Firestone, W., Goertz, M., and Natriello, G. (1997). From cashbox to classroom. New York: Teachers College Press.
Fisher, D., Roach, V., and Kearns, J. (1998, March). "Statewide assessments systems: Who's in and who's out?" [On-line]. Available: http://www.ecs.org/ecs/ecsweb.nsf/f333f38a7982ee63872565c80058b824/.
Haberman, M. (1991). "The pedagogy of poverty versus good teaching," Phi Delta Kappan 73: 290-294.
Herman, J. L., and Golan, S. (1990). Effects of standardized testing on teachers and learning: Another look. Los Angeles: Center for Research on Evaluation, Standards, and Student Testing. (ERIC Document Reproduction Service No. ED 341 738)
Heubert, J. P. (1998). High-stakes testing and civil rights: Standards of appropriate test use and a strategy for enforcing them. Paper presented at the Conference on Civil Rights Implications of High-stakes Testing, sponsored by the Harvard Civil Rights Project, Teachers College and Columbia Law School.
Heubert, J. P., and Hauser, R. M. (Eds.). (1999). High Stakes: Testing for tracking, promotion, and graduation. Washington, DC: National Academy Press.
Heyneman, S. P. (1987). Uses of examinations in developing countries: selection, research, and education sector management. International Journal of Educational Development 7: 251-263.
Hilliard, A. (2000). "Excellence in education versus high-stakes testing," Journal of Teacher Education 51: 293-304.
Johnston, R. C. (1997, April). Just saying no. Education Week. [On-line]. Available: http://www.edweek.org/ew/1997/28test.h16.
Kellaghan, T., Madaus, G. F., and Raczek, A. E. (1996). The use of external examinations to improve student motivation. Washington, DC: American Educational Research Association, Monograph.
Kenney, P. A., and Silver, E. A. (Eds). (1997). Results from the sixth mathematics assessment of the National Assessment of Educational Progress. Reston, VA: National Council of Teachers of Mathematics.
Kohn, A. (2000). "Burnt at the High Stakes," Journal of Teacher Education 51: 315-327.
Kohn, A. (2001, January). "Fighting the test: A practical guide to rescuing our schools," Phi Delta Kappan 82(5): 348-357.
Ladson-Billings, G. (1997). "It doesn't add up: African American students mathematics achievement," Journal for Research in Mathematics Education 28: 697-708.
LeCompte, M. D., and Preissle, J. (1993). Ethnography and qualitative design in educational research (2nd ed.). New York: Academic Press, Inc.
Lemlech, J. (1994). Curriculum and instructional methods for the elementary and middle school (3rd ed.). New York: Macmillan College.
Lieberman, A. (1991, November). "Accountability as a reform strategy," Phi Delta Kappan 73: 219-220.
Lincoln, Y. S., and Guba, E. G. (1985). Naturalistic inquiry. Beverly Hills, CA: Sage Publications.
Madaus, G. F. (1991). "The effects of important tests on students," Phi Delta Kappan 73: 226-231.
Madaus, G. F. (1998). "The distortion of teaching and testing: High-stakes testing and instruction," Peabody Journal of Education 65: 29-46.
Marshall, J. (1993, December). "Why Johnny can't teach," Reason 25: 102-106.
McLaughlin, M. W. (1991). "Test-based accountability as a reform strategy," Phi Delta Kappan 73: 248-251.
McLeod, D. B. (1992). Research on affect in mathematics education: A reconceptualization. In D. A. Grouws (Ed.), Handbook of Research on Mathematics Teaching and Learning (pp. 575-596). New York: Macmillan.
Miles, M. B., and Huberman, A. M. (1994). Qualitative data analysis: An expanded sourcebook. Thousand Oaks, CA: Sage Publications.
Mullis, I. V. S., Martin, M. O., Beaton, A. E., Gonzales, E. G., Kelly, D. L., and Smith, T. L. (1997). Mathematics achievement in the primary school years: IEA's Third International Mathematics and Science Study (TIMSS). Chestnut Hill, MA: Boston College, TIMSS International Study Center.
National Council of Teachers of Mathematics (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: Author.
National Council of Teachers of Mathematics (2000). Principles and standards for school mathematics. Reston, VA: Author.
Natriello, G., and Pallas, A. (1999). The development and impact of high stakes testing. Paper presented at the Conference on Civil Rights Implications of High-stakes Testing, sponsored by the Harvard Civil Rights Project, Teachers College and Columbia Law School.
New York Urban League, Inc. v. New York. (1995). 71 F.3d 1031 (2nd Cir.).
Orfield, G., and Wald, J. (2000). "Testing, testing," [On-line]. Available: http://past.thenation.com/issue/000605/0605orfield.shtml.
Popham, W. J. (2001, March). "Teaching to the test," Educational Leadership 16-20.
Pritchard, I. (1998). "Judging standards in standard-based reform," [On-line]. Available: http://www.ecs.org/ecs/ecsweb.nsf/f333f38a7982ee63872565c80058b824/.
Randall, K. (2001). "Growing opposition to 'high-stakes' testing in US schools," [On-line]. Available: http://www.wsws.org/articles/2001/test-j25.shtml.
Shepard, L. A., and Dougherty, K. C. (1991). Effects of high-stakes testing on instruction. Paper presented at the Annual Meetings of the American Educational Research Association and the National Council on Measurement in Education. (ERIC Document Reproduction Service No. ED 337 468)
Shore, A., Madaus, G., and Clarke, M. (2000). "Guidelines for policy research on educational testing," Boston: National Board on Educational Testing and Public Policy 1(4): 1-7.
Smith, M. L., and Rottenberg, C. (1991). "Unintended consequences of external testing in elementary schools," Educational Measurement: Issues and Practice 10: 7-11.
Steele, C. M. (1997). "A threat in the air: How stereotypes shape intellectual identity and performance," American Psychologist 52(6): 613-629.
Stone, J. E. (1994). Developmentalism's impediments to school reform: Three recommendations for overcoming them. In R. Gardner III, D. M. Sainato, J. O. Cooper, T. E. Heron, W. L. Heward, J. W. Eshlemann, and T. A. Grossi (Eds.), Behavior analysis in education: Focus on measurably superior instruction (pp. 57-72). Pacific Grove, CA: Brooks/Cole.
Stone, J. E. (1996, April). "Developmentalism: An obscure but pervasive restriction on educational improvement," Education Policy Analysis Archives 4(8). Retrieved April 4, 2002 from the World Wide Web: http://epaa.asu.edu/epaa/v4n8.html.
Tate, W. F. (1994). "Race, reform, and retrenchment of school mathematics," Phi Delta Kappan 75: 477-485.
Weinstein, R. S. (1983). "Student perception of schooling," Elementary School Journal 83: 287-312.
Willingham, W. W., and Cole, N. S. (1997). Gender bias and fair assessment. Hillsdale, NJ: Erlbaum.
Wolf, R. (1998). "National standards: Do we need them?" Educational Researcher 27: 22-25.
Randy Lattimore is an assistant professor in Mathematics and Teacher Education at Wayne State University in Detroit, Michigan. He is grateful to Margaret Hoard for her assistance and helpful comments on preliminary drafts of this article.