Multiple Measures: The Common-Sense Approach to Education Assessment
Mention "assessment reform" to U.S. educators and one place they often think of is Kentucky. Beginning with the Kentucky Education Reform Act earlier in this decade and the assessment program it spawned, the commonwealth rapidly established itself as a national leader in educational testing. The Kentucky Instructional Results Information System, also known as KIRIS, gained the spotlight.
Now, Kentucky's testing program is changing again. This time the commonwealth is reintroducing an assessment format it abandoned five years ago: the norm-referenced, multiple-choice test. Beginning in spring 1997, Kentucky will again test public school students with a multiple-choice test as part of KIRIS. That means Kentucky students will be given both kinds of tests--performance assessments and a norm-referenced test.
Multiple Purposes
Kentucky's use of the multiple-measures approach is indicative of a national trend. What educators and policymakers nationwide are realizing is that no single test can do everything. This multiple-measures approach is not revolutionary. It is a common-sense approach that has been ignored in recent years as some education reformers have sought to find a single-test format to suit all of their needs. Today, however, educators better understand the real power and usefulness of creating testing programs that combine performance assessments, norm-referenced tests, and other evaluative measures. This approach puts the right kind of assessment to work for the right purpose.
Performance assessments, for example, are best suited for instructional purposes. Norm-referenced tests, on the other hand, are used to generate and provide data for evaluation and accountability purposes. It is important to realize, however, that performance assessment results can be used to evaluate programs and account for pupil progress, while traditional, multiple-choice tests also can provide valuable instructional information. Both types of tests continue to be in great demand as parents, the public, and policymakers seek to build better and more relevant measures into our education systems.
A majority of new multiple-measures programs are using two well-known formats: performance assessments and norm-referenced, multiple-choice tests. States as diverse as Indiana, Florida, Nevada, and Wisconsin now are taking this multiple-measures approach to assessment. In fact, in the past year, more than half of the new state testing programs have included both norm-referenced, multiple-choice components to respond to the public's need for comparative information about their students and schools, and performance measures to assess progress toward meeting state standards.
Unremitting Need
Prior to 1990, nearly all state and local assessment programs relied on multiple-choice tests. The foci of these programs were measurement and accountability. However, during the first half of the 1990s, many states added performance assessments to their testing programs as part of a movement that focused on standards-driven--not measurement-driven--reform and instructional improvement. In the process, many states eliminated traditional norm-referenced, multiple-choice tests. Yet while norm-referenced tests were eliminated, the educational policy that undergirded their use--measurement-driven accountability--was not. In fact, the need for accountability remains a cornerstone of most education policy. This policy guides the thinking of many governors, legislators, state education officials, local school administrators, and boards of education.
Oddly enough, as acceptance of performance assessments has grown, so has the public's and some educators' demand for the information provided by norm-referenced tests. In the same way that many educators feel that norm-referenced tests have not provided and cannot provide certain information, they have learned that performance assessments alone cannot generate the type of valid, reliable, and fair baseline data that norm-referenced tests do.
Unfortunately, a small but vocal group of education reformers is actively characterizing any new use of norm-referenced tests--even if they are included in a multiple-measures assessment program--as a retreat from progress. Many professionals engaged in educational assessment do not see the reintroduction or inclusion of norm-referenced tests in state and local testing programs as a retreat at all. Rather, it represents real progress in responding to the needs and demands placed on such assessment programs. Some of these same reformers have gone so far as to claim that the retreat from progress is being led by test publishers. Yet publishers would gain no advantage by doing so since they are in the performance-assessment market as well. Most, if not all, major test publishers already have successfully developed and marketed a variety of performance assessments in addition to the norm-referenced tests with which they are more commonly associated.
Appropriate Use
Building a multiple-measures program begins with a recognized need for various types of assessment data and an understanding of how those data are used by stakeholders. These stakeholders include parents, teachers, principals, superintendents, legislators, state education officials, governors, taxpayers and, yes, the news media. Each group of stakeholders holds expectations and places demands on our assessment programs. Given that, school administrators must successfully answer three questions: What information does each type of stakeholder need? How is it to be used? Is it of value? A school district's failure to appropriately answer these questions at the design stage of an assessment program can cost it dearly later on--not only in time and effort, but also in credibility and public support.
I suggest considering four additional questions relating to the use of assessment data:
1. How does a certain type of data relate to the many decisions that must be made regarding instruction and public accountability?
2. Is the use of assessment data understood by all stakeholders?
3. Are the data valued and trusted by stakeholders?
4. Finally, can stakeholders who make decisions about the lives of children and the health of our education systems translate the data into action?
These are tough but basic questions that should serve as the starting points for any school district or state looking to redesign its assessment program.
Objective Data
A multiple-measures approach that includes norm-referenced tests answers the accountability requirement so often placed on assessment systems. Accountability is a responsibility that educators cannot shirk. For without objective, consistent data, how will people make decisions regarding their schools? Educators hoping to compare students with like populations across the nation, over time, and from school to school and district to district will lack a valid mechanism to do so.
For instance, in several states that moved to a program based solely on performance assessment, parents, superintendents, and school boards demanded more precise and more understandable information. It was the lack of this information that led those states back to norm-referenced tests. Despite the merits of performance assessments, they had not provided parents, policymakers, and educators with the information needed to generate objective, comparable data for use in evaluating students, programs, and schools. As a result, many districts in those states continued to purchase norm-referenced tests. Realizing the demand for such information, the states changed their testing programs so they would be based on multiple forms of assessment.
Public Confidence
Finally, in addition to the simple, common-sense merits of the multiple-measures approach, we also must face the realpolitik of the issue: At a time when public confidence in our schools appears to be at an all-time low, it is not in our interest to shun measurements that provide objective baseline data sought by educators, policymakers, and the public. It is ironic that as the public calls for greater accountability in education, some education reformers continue to assert that we should forfeit the types of assessments and multiple-measures approaches capable of demonstrating whether our schools are succeeding. However, to the benefit of educators, our children, their parents, and other education stakeholders, an increasing number of school districts and states are using a multiple-measures approach to assessment to demonstrate how their schools are progressing.
Michael Kean chairs the Test Committee of the Association of American Publishers.
A Response to Kean
MONTY NEILL
Real Accountability Requires Helpful Assessment
Michael Kean appears to offer a reasonable, common-sense approach to assessment: use multiple measures. However, his particular formulation of multiple measures is a retreat from progress, which will undermine educational improvement in the name of accountability. True, no single test can do everything. But while multiple-choice is just one kind of measure, performance assessments include many kinds--portfolios, projects, exhibitions, essays, experiments, and more. We can have multiple measures and never use multiple-choice exams. So the question is: What are the benefits and drawbacks of different approaches?
According to Kean, multiple-choice, norm-referenced tests are best for accountability, while performance assessments are best for instruction, though each can be used for both purposes. In reality, norm-referenced tests are virtually worthless for instructional purposes, and while they appear efficient for accountability, they actually harm it. Multiple-choice questions are a poor tool for measuring more than factual recall, one-step procedures, or simple inferences. Even for assessing these, it is a narrow method.
Dual Problems
This creates two problems: (1) these tests do not provide much real information, and (2) because of their powerful influence, they narrow and dumb down curriculum and instruction. These well-documented problems are most severe in schools serving mainly students from low-income families or students of color. Thus, in the name of accountability, testing damages education.
Norm-referenced multiple-choice tests compound the problems. Norm-referenced tests compare test-takers with each other by placing them on a normal or bell curve. Such a process does not tell us what a student knows, only whether she knows more or less than other students about whatever limited thing is measured. Norm-referenced tests are easily corrupted by teaching to the test, which leads to score inflation and undermines accountability. Another instructional impact of the bell curve is subtle but pernicious: it tells educators that only a few students can learn a lot and most will not learn very much, so they can settle for mediocre performance from their students.
A Preferable Mix
For these and other reasons, the Campaign for Genuine Accountability, supported by several dozen education and civil rights groups, stated in 1990 that multiple-choice items should not be more than a quarter of an accountability assessment.
AASA endorsed that statement. More recently, the National Forum on Assessment's "Principles and Indicators for Student Assessment Systems," which has been signed by more than 80 education and civil rights organizations, concluded that multiple choice, if used at all, should comprise only a small part of the assessment package, and tests designed to rank order or compare students should not be a significant part of an assessment system. For accountability, the forum suggests states and local districts rely on a combination of sampling from classroom-based assessment information and performance exams, also used on a sampling basis.
In essence, the process could work as follows: Each teacher, using scoring guides, indicates where on a developmental scale or a performance standard each student should be placed and attaches evidence (e.g., portfolio material) to back up the decision. Substantial diversity can be allowed in the records and portfolios, but each one must provide evidence of learning in the area being assessed. A random sample of portfolios or learning records then is selected from each classroom. Independent readers (educators from other schools, community members, etc.) review the records as evidence of student learning and place students on the scale. The scores of teachers and readers then are compared to see whether the judgments correspond. If they do not, various actions, beginning with another independent reading, can be used to identify the discrepancy.
In addition, groups of schools can form networks to hold each other accountable and involve the communities served by the schools. This approach is being tried by a number of schools in New York City.
Public Willingness
Can all this provide the public with real accountability? Test scores actually tell people very little--just ask someone what they know about a student or a school after looking at the scores. The accountability methods FairTest proposes will provide much richer information and involve the public in more powerful ways. The public has expressed support for methods other than multiple-choice tests and, absent organized opposition from supporters of multiple-choice testing, largely has been willing to provide the time for improved methods of assessment to develop. In moving toward improved accountability with better teaching and stronger learning, we are not helped by taking the steps backward called for by Kean.