Are multiple-choice exams easier for economics students? A comparison of multiple-choice and "equivalent" constructed-response exam questions. (Targeting Teaching)

1. Introduction

Multiple-choice (MC) exams are very popular in economics, especially at the principles level. Becker and Watts (2001) use survey data to estimate that MC questions account on average for about 45% of total assessment at this level, while Siegfried and Kennedy (1995) use the Test of Understanding of College Economics (TUCE) data to place this figure at about 67%. Constructed-response (CR) exams, in which students are asked to construct an answer rather than choose from among a set of possible answers, are the main alternative to MC exams. Considerable research has investigated the question of whether MC exams are essentially measuring the same thing as CR exams; the methodology of this research can loosely be described in terms of whether MC and CR scores are highly correlated. High correlation implies that an instructor could use one type of exam and be confident that, had the other type of exam been used, the ranking of the students would be largely unaffected.
But it is possible for two exam results to be highly correlated, with scores on one exam much higher than scores on the other. Consequently, although the students would be ranked appropriately, the exam results may give a misleading impression concerning students' level of understanding of the material being examined. To our knowledge, no research has investigated the issue of whether the question format affects the level of student scores in economics exams. The purpose of this article is to ask whether, as many would speculate, students score higher on MC exams (after correcting for guessing) than on "equivalent" CR exams. For many instructors, it is performance on CR exams, not MC exams, that measures how well students understand economics and whether they can apply this understanding. Katz, Bennett, and Berger
(2000, p. 55) articulate this well: "Constructed response items are preferred over multiple-choice by many in the education community because the former are believed to measure more important skills, be more relevant to applied decision making, better reflect changing social values, and have more positive social consequences." If this view is accepted, and so CR exam scores are interpreted as being the true reflection of student understanding of and ability to apply economics, then if students in fact score higher on "equivalent" MC exams, both instructors and students could be unjustifiably complacent about their teaching success and understanding of economics, respectively, an undesirable state of affairs. By reporting our findings, we hope to sensitize instructors to this phenomenon.

Section 2 reviews the literature in this area and discusses the relevant educational theory. Section 3 describes the experiment we undertook to produce our data, and section 4 reports the empirical results. Section 5 reanalyzes the data to investigate whether our results are the same for males as for females, and section 6 summarizes results from a similar investigation of the performance of "good" versus "poor" students. Section 7 concludes.

2. Theory and Literature Review

Ever since Robert Yerkes tested a million World War I recruits with his multiple-choice Alpha Army Intelligence Test, there has been controversy concerning the relative merits of MC and CR tests. In economics, Walstad (1998) provides an excellent summary of the advantages of MC testing, arguing that these tests have low grading costs, provide timely feedback, are free from scoring bias, allow a wider sampling of course content, produce less measurement error, and are highly correlated with CR test scores. Welsh and Saunders (1998) defend the essay test for economics, noting that it can assess and develop higher level cognitive skills, encourage the development of writing skills, elicit students' opinions and attitudes, and lower test preparation costs. The Katz, Bennett, and Berger (2000, p. 55) quote given earlier summarizes the case for CR.

In the education literature, research on MC versus CR appears under the rubric of format effects. There are three main streams of research. The most prominent stream addresses the issue of whether MC and CR test scores are essentially measuring the same thing. Wainer and Thissen (1993, p.
116) summarize this work, concluding that "A natural conclusion to reach from the weightings associated with constructed-response tests versus multiple-choice questions is that the former take more examinee time and resources to measure essentially the same thing more poorly than the latter." Walstad and Becker (1994) endorse this view using data on economics students, whereas Becker and Johnston (1999) offer a contrary view, also using data on economics students. Kennedy and Walstad (1997) go beyond traditional statistical analysis of data on economics students to examine implications for an explicit objective function (minimizing grading errors), an approach they describe as an economist's view. They conclude that the statistical correlations upon which the standard literature in this area is based mask significant grading errors for a small number of students.

A second stream of research in this area is concerned with how to scale or link MC and CR scores so as to create a single total score on an exam consisting of both types of questions, or to compare scores from one type of exam with those of another. Sykes and Yen (2000) and Tate (2000) are recent examples; to our knowledge, there is no similar research in economic education.
The third stream of research in this area, the branch to which the research reported in this article belongs, addresses the issue of whether questions in one format are more difficult than "equivalent" questions in the other format. Research in this branch is limited, mainly because, as Traub (1993, p. 30) suggests, a meaningful comparison of difficulty requires that the true-score scales of the MC and CR instruments be equivalent, something he claims is difficult if not impossible to demonstrate. Katz, Bennett, and Berger (2000) is a recent example of work in this area and provides a good literature review. They begin (p. 39) by stating that "Researchers have frequently noted that some items are more difficult in the constructed response (CR) format than in the multiple-choice (MC) format, while format does not affect performance on other items. Yet little is known about the mechanism by which response format affects item difficulty."

We address this question by confining our analysis to two specific contexts, typical of the literature in this area. First, we examine only one type of CR question. A CR question is any question requiring the examinee to generate an answer rather than select from a small set of options. Such answers can range from producing a word or phrase to writing a lengthy essay. Snow (1993, p. 48) provides a taxonomy. Our CR questions all fall into Snow's most basic CR category, namely generation of a short sentence or phrase to answer a question.
Despite this apparent restriction, it could be argued that our study is more representative of the flavor of constructed response than are existing studies in the literature. The two most prominent studies in the literature, those of Bridgeman ...
Second, following the existing literature, we match MC and CR questions with identical stems so that students are faced with "equivalent" questions but in different formats. In the MC version, they choose from four suggested answers; in the CR version, they must produce the correct answer on their own. In both variants, scoring is on a right/wrong basis: no part marks are available, which arbitrarily rules out one of the possible advantages of CR. These two features of our data are the basis for our claim that we have created a fair comparison between MC and CR questions, allowing us to examine in unbiased fashion the influence of the format effect so long as suitable adjustment is made for the possibility of guessing in the MC format.

Snow (1993, p. 51) claims that, in general, students perceive MC tests to be easier than CR tests. Bridgeman (1992, p. 269) reports that 81% of the students he surveyed preferred MC, 11% preferred CR, and 8% were indifferent. Kennedy and Walstad (1997) report the results of a 1995 survey at the University of Nebraska, indicating that about 70% of economics principles students believe that MC tests are easier than CR tests. Why might this be the case? Several reasons why MC tests might be easier are possible, any of which could explain the empirical results we report later. One reason is that MC questions sometimes have options that are simply not credible, so that a student guessing will produce a reasonable score, even after standard penalties for guessing are imposed.
(1) We have made every effort in this study to use MC questions for which the distracters are all credible, but this is undoubtedly a drawback of the MC format, given that not all instructors are diligent in this respect. Indeed, one of Bridgeman's (1992, p. 269) conclusions is that "Format effects appeared to be particularly large when the multiple-choice options were not an accurate reflection of the errors actually made by students." In such a case, students get useful feedback when their initial answer does not appear among the MC options, rendering the MC questions "easier." A second reason why MC questions may be thought easier is that, in some MC questions, particularly mathematics questions, it may be possible to work backward from the MC answers to figure out the correct answer. This problem-solving strategy is not available in the CR format.

The theory of format effects in the education literature, as summarized by Traub (1993, pp. 39-42), provides further insight into this issue. Questions that test explicit knowledge
are such that the availability of response options in the MC format may facilitate recognition of the desired answer. Similarly, questions that require inference or application or evaluation of concepts may have associated with them MC answers that articulate thinking that the student would otherwise not have been able to produce despite being familiar with the concepts. This also could facilitate recognition of the correct answer and thus make the MC question easier. On the other hand, MC questions that require inference or application or evaluation of concepts could be such that the response options in the MC format provide no information or help to students, in which case format should have little influence on a student's ability to produce the desired answer. A test consisting of this type of MC question should be equal in difficulty to its corresponding CR test (after correction for guessing). Finally, some types of MC questions have options that are all obvious to students faced with their CR counterparts, so that guessing is possible in the CR format. In such cases, we would expect similar scores on both types of questions, regardless of MC guessing.

One important conclusion from this literature review is that format effects can vary across different types of questions: In some cases, MC questions do appear easier, but in others, MC and CR versions of the same question appear to be equally difficult. Our results are consistent with this general conclusion.

3. Experimental Design

Two versions of an economics final exam, exam A and exam B, were distributed randomly to 196 students, all of whom sat the exam at the same time. This exam determined between 65 and 100% of students' final grades for a principles of macroeconomics course taught by a single instructor. Each exam consisted of 36 MC questions, 12 special CR questions, and 7 other CR questions. Each MC question was worth one mark, with no marks deducted for incorrect answers. Each of the 12 special CR questions required students to produce a one-line answer worth one mark. The remaining CR questions consisted of a variety of short-answer questions that collectively were worth 52 marks. Total marks available for the exam were 100; there was no effective time constraint.

The main difference between exam A and exam B is that exam A's first 12 MC questions and first 12 CR questions are different from exam B's first 12 MC questions and first 12 CR questions, but in a special way.
In particular, the first 12 MC questions on exam A are "equivalent" to the first 12 CR questions on exam B, and the first 12 MC questions on exam B are "equivalent" to the first 12 CR questions on exam A. This difference is spelled out in the following two summaries.

(1) The first 12 MC questions in exam A, referred to hereafter as MCA1-MCA12, differ from the first 12 MC questions in exam B, referred to hereafter as MCB1-MCB12. However, the ith question in each version is on a similar topic and is of similar difficulty. The last 24 MC questions on both exams are identical.

(2) The first 12 CR questions in version A, referred to hereafter as CRA1-CRA12, are "equivalent" to MCB1-MCB12 except that no multiple choices are offered. (An example appears below.) Likewise, the first 12 CR questions in version B, referred to hereafter as CRB1-CRB12, are "equivalent" to MCA1-MCA12.

To illustrate this, the 11th MC question on version A, MCA11, and its corresponding CR question, CRB11, the 11th CR question on version B, are shown below.

(MCA11) While acknowledging that a weaker dollar initially favors exports, he said this advantage will only last as long as .... Complete this clipping.
(a) the exchange rate remains lower
(b) our prices do not rise to offset the advantage of the lower exchange rate
(c) exporters continue to market their products accordingly
(d) the monetary authorities keep the exchange rate fixed

(CRB11) While acknowledging that a weaker dollar initially favors exports, he said this advantage will only last as long as .... Complete this clipping.

The CR questions were all graded by the same person, who determined whether the answer merited one or zero marks; no part marks were given. A subjective element was unavoidable. In the case of this question, the preferred CR answer was answer b given in the MC options, but a full mark was given for "the price of our goods doesn't rise" and for "the real exchange rate remains low." No mark was given for "the exchange rate remains low."

In summary, ignoring the common MC questions, those writing version A of the exam produced answers to 12 MC questions, with those writing version B of the exam producing CR answers to 12 "equivalent" questions. Those writing version B of the exam produced answers to 12 different MC questions, with those writing version A of the exam producing 12 CR answers to 12 "equivalent" questions. Together, they produce 24 sets of MC answers matched by 24 sets of CR answers. The complete set of these 24 MC and CR questions appears at www.sfu.ca/~kennedy/sej.html.

In both versions A and B of the exam, some of the MC questions are such that they are in essence no different from their CR counterparts. There are eight such MC questions: MCA2, MCA4, MCA9, MCA10, MCB2, MCB4, MCB9, and MCB10. For these questions, the possible answers provided as choices are obvious in the CR format, as illustrated for example by MCB9/CRA9.

(MCB9) Because Canadian
inflation will continue to accelerate while inflation in the U.S. starts to decline, the Canadian dollar will probably --- this year. This will force the Bank of Canada to --- interest rates. Fill in the blanks.
(a) rise; decrease
(b) rise; increase
(c) fall; decrease
(d) fall; increase

(CRA9) Because Canadian inflation will continue to accelerate while inflation in the U.S. starts to decline, the Canadian dollar will probably --- this year. This will force the Bank of Canada to --- interest rates. What is the best way to fill the blanks?

For these questions, referred to hereafter as the eight no-expected-difference questions (to distinguish them from the other expected-difference questions), we would not expect students to be affected by format; they were included in our experimental design to serve as a check on our empirical results and are treated separately in the empirical analysis.

Assignment to exam versions A and B was done randomly. It was clear that some students knew none of the course material, producing ridiculously low scores on the exam. Looking at the histogram of total scores on the multiple-choice and constructed-response questions common to both exams, we identified nine students as outliers, having scores of 20 or lower out of 76.
These scores were spread out over the range 12-20, whereas higher scores occurred with noticeably greater frequency; there were five scores of 21 and four of 22, for example. To enhance the validity of our results, we omitted these students from our empirical analysis, lowering the sample size for group A from 99 to 94 and for group B from 97 to 93. None of the qualitative results reported below were affected by discarding these outliers. One of the questions (MCB8/CRA8) was badly designed in that the MC and CR versions in retrospect could not be considered "equivalent" because correct answers other than those provided as multiple choices were possible. This question has been deleted from our empirical analysis.

To check that the two groups were of comparable ability, we tested for group differences on the common MC questions and on the common CR questions. Group A averaged 14.21 on the 24 common MC questions and 27.90 out of 52 on the common CR questions. Group B averaged 14.63 and 30.70, respectively. Testing for equality across groups resulted in t-statistics of 0.70 and 1.88 with associated p-values 0.48 and 0.06, respectively. Tests for equality of variances resulted in p-values of 0.76 and 0.27. Although the difference between these groups' test scores tested insignificantly
different from zero at the traditional 5% significance level, it appears that Group B may be slightly better at CR questions, a fact that in most empirical studies would be worrisome. In our study, however, this turns out to be an advantage: despite this apparent difference, the A group MC scores exceed the B group CR scores on matching expected-difference questions, strengthening our results.

4. Empirical Results

The main empirical question addressed in this article is whether the average score on an MC question is insignificantly different from the average score on the "equivalent" CR question. We have analyzed our data using a straightforward approach that exploits the randomized nature of our data (the fact that students were assigned randomly to the two types of exams) and is less affected by specification error (because we do not presume a specific equation representing the probability of obtaining a correct answer).
(2) In short, we test for differences between MC and CR performances on each of the 23 sets of matching MC and CR questions and then consolidate these individual tests into an overall test. This analysis is broken into two parts, one addressing the 15 expected-difference questions and the other addressing the eight no-expected-difference questions.

For a given question, let $\theta$ denote the probability that the average student knows the correct answer. Then for the group answering the CR version of that question, the estimated value of $\theta$ is the fraction $\theta^*_{CR}$ of correct answers of that group, with estimated variance (the standard formula for estimating the variance of a sample proportion) $\mathrm{var}(\theta^*_{CR}) = \theta^*_{CR}(1 - \theta^*_{CR})/N$, where $N$ is the number of students in that group. Estimation of $\theta$ for the students answering the MC version of that question is complicated by the fact that students not knowing the answer can guess without penalty with a probability of 0.25 of being right.
Because of this, a correction must be made before any comparison can be undertaken. Suppose $\pi$ is the probability that the student will get the correct answer and, as above, $\theta$ is the probability that the student actually knows the correct answer, so that $\pi = \theta + (1 - \theta)(0.25)$. From the available data, $\pi$ can be estimated from the fraction $\pi^*$ of students who got the correct answer to the question, implying that $\theta$ can be estimated as $\theta^*_{MC} = (\pi^* - 0.25)/0.75$, with estimated variance (calculated by exploiting the result that the variance of a constant $k$ times a random variable $x$ is $k^2$ times the variance of $x$) $\mathrm{var}(\theta^*_{MC}) = (0.75)^{-2}\,\pi^*(1 - \pi^*)/N$.

The discussion above is relevant to the expected-difference questions because, for those questions, students can guess in the MC format but are not able to guess in the CR format. For the no-expected-difference questions, students can easily guess in the CR as well as the MC format, so comparisons for these questions should use the unadjusted MC score.

Before describing our formal statistical analysis, we offer some overall perspective by reporting average total scores of the two groups on the MC and on the corresponding CR questions that form the basis for the empirical analysis of this article. Group A averaged 0.65 on its eight expected-difference MC questions (after adjusting for the influence of guessing, as explained earlier), with group B averaging 0.49 on the corresponding CR questions. Group B averaged 0.73 on its seven expected-difference MC questions, with group A averaging 0.51 on the corresponding CR questions. For the no-expected-difference questions, for which no adjustment for guessing was undertaken, group A averaged 0.39 on its four MC questions and group B averaged 0.41 on the corresponding CR questions.
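As a minimal sketch of the two estimators described above, the following Python computes the CR estimate and the guessing-corrected MC estimate, each with its estimated variance. The function names and the counts in the usage example are hypothetical and are not taken from the study's data; the guessing probability of 0.25 follows from the four-option, no-penalty format described in the text.

```python
GUESS_PROB = 0.25  # four options, no penalty for a wrong answer


def cr_estimate(n_correct, n_students):
    """CR format: fraction correct and the variance of a sample proportion."""
    theta = n_correct / n_students
    return theta, theta * (1 - theta) / n_students


def mc_estimate(n_correct, n_students, guess_prob=GUESS_PROB):
    """MC format: estimate of theta corrected for guessing, and its variance.

    Uses pi = theta + (1 - theta) * guess_prob, solved for theta.
    """
    pi_hat = n_correct / n_students
    scale = 1 - guess_prob  # 0.75 for four options
    theta = (pi_hat - guess_prob) / scale
    # var(k * x) = k**2 * var(x), with k = 1 / scale
    variance = pi_hat * (1 - pi_hat) / n_students / scale ** 2
    return theta, variance


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(cr_estimate(50, 100))
    print(mc_estimate(75, 100))
```

In this hypothetical case, a raw MC score of 0.75 corresponds to a corrected estimate of about 0.67. Dividing by 0.75 squared also inflates the variance by a factor of roughly 1.78 relative to an uncorrected proportion, so the corrected MC estimate is less precise than the raw score would suggest.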
Group B averaged 0.72 on its four no-expected-difference questions and group A averaged 0.66 on the corresponding CR questions. These summary results suggest that students perform better on those MC questions for which there is an expected difference and perform equally well on those questions for which there is no expected difference. The remainder of the empirical analysis of this article examines these results by looking at the sets of questions individually and by conducting related tests of statistical significance.

Table 1 reports information concerning the 15 sets of expected-difference questions. Column 7 gives the value of the t-statistic for testing for zero difference between the adjusted MC score and its corresponding CR score. On the null hypothesis that students should obtain the same score on MC and CR versions of a question, these t-values should be random draws from a t-distribution. We have 15 such draws, consisting of one set of 8 draws and one set of 7 draws. The null hypothesis can be tested by testing if the mean of these draws is insignificantly different from zero using an overall t-statistic. (We perform a one-sided test against the alternative that the MC score is greater than the CR score.)
This overall t-statistic has a normal distribution because the variance of the means of these random draws is known rather than estimated. For the set of 15 draws, for example, this variance is calculated as the variance of the t-statistic (approximately one) divided by 15, so the overall statistic is the mean of the 15 t-values divided by $\sqrt{1/15}$. For the 15 draws, the overall t-statistic is 10.44 with p-value 0.00. For the set of eight draws and the set of seven draws, the overall t-values are 6.69 and 8.14, respectively, with p-values 0.00 and 0.00. In all cases, the null of equal scores on the MC and CR versions of the questions is rejected at the 5% significance level, and in all cases, we conclude that students score higher on the MC version, even after adjustment for guessing.

Although it is clear that, overall, students do better on the MC questions, in several specific questions, this difference is not statistically significant. This of course is not unusual in the presence of sampling error, but it is of interest to ask if these questions, particularly those with very low or negative t-values, have anything in common that could serve as an explanation for this statistical insignificance. Four questions have t-values less than unity. Three of these, MCA5/CRB5, MCA12/CRB12, and MCB5/CRA5, have a common feature.
(Readers are reminded that all the exam questions can be viewed at the website noted earlier.) In each of these questions, one of the distracters in the MC version referred to the monetarist rule of fixing the rate of growth of the money supply or to an incorrect version of this (fixing the money supply). This rule had played a prominent role in the course. We speculate that, because of this, students answering the MC versions were influenced by this distracter, whereas students answering the CR versions may not have thought of this possible answer and so were able to reason through to the correct answer more successfully.

The fourth question with a t-value less than unity is MCA3/CRB3. As for the three questions discussed above, we speculate that, in this question, a distracter played a prominent role, introducing students to a possibility that they would not otherwise have thought of, inhibiting their ability to think through to the correct answer. (For this question, the answer distribution was 22, 67, 3, and 8% on a, b, c, and d, respectively, suggesting that distracter a was influential.) If correct, this speculation suggests that the specific nature of the MC questions plays a crucial role in any comparison of MC and CR questions, with substantive implications for research studies in this area.

Table 2 reports similar results for the eight no-expected-difference questions, where unadjusted MC scores are employed. Here, the overall t-statistic is 0.72 with p-value 0.47.
For the first set of four draws, the t-statistic is -0.75, with p-value 0.45; for the second set of four draws, the t-statistic is 1.77, with p-value 0.08. We do not reject the null at conventional significance levels and so conclude that our expectation was correct: students perform equally well on these questions.

Two questions here appear to have anomalous results. We have no explanation for why students did markedly worse on MCA10 than on CRB10. The question is deficient in that it does not spell out that the answer should address the longer-run impact, not the very short-run impact, but this deficiency should handicap both sets of students equally. This question is a true anomaly, although a referee
has suggested that the first two MC answers are written slightly differently than the last two, which may have played a role. Regarding MCB4/CRA4, we suspect that some students answering the CR version may not have known that open market operations involved changing annual Fed bond purchases, information that was provided in the MC answers, and so perhaps this question should have been placed in the expected-difference category.

5. Male Versus Female

The role of gender in economic education has been extensively researched. Three questions of interest regarding gender can be addressed using these data. First, do the two genders score differently on the two types of tests? There exists empirical evidence that, in economics, men seem to perform better on multiple-choice questions than women. For example, of the 5815 men and 2164 women who wrote the Graduate Record Exam (GRE) in economics during 1989 and 1993, the average men's score was 651 while the average women's score was only 603 (Hirschfeld, Moore, and Brown 1995).
Further, some studies have found that females, on the other hand, perform better on constructed-response tests than males (Lumsden and Scott 1987; Anderson, Benjamin, and Fuss 1994). If indeed there were a gender bias in test performance, then it would seem unwise to test students using only one of the test formats. This first question is addressed by comparing male and female average scores on the 24 common MC questions and comparing the male and female average scores on the common CR questions. There are 111 male students and 76 female students in our truncated sample. The former averaged 14.68 on the 24 common MC questions and 28.46 on the common CR questions. The latter averaged 14.05 and 30.51, respectively. The t-statistic for testing if the male and female average scores are insignificantly different from one another is 1.03 (p-value 0.30) for MC and -1.35 (p-value 0.18) for CR. We conclude that there are no evident gender differences.

The second question of interest here is whether the results obtained earlier hold true for both males and females. This was investigated by looking at males and females separately. There are 55 male and 44 female students who wrote exam A, and there are 60 male and 37 female students who wrote exam B. After the truncation described earlier, there remained 52 male and 42 female students who wrote exam A and 59 male and 34 female students who wrote exam B.

Table 3 reports results for males only, corresponding to Table 1 results for the entire sample for the 15 expected-difference questions. The overall t-statistic is 7.78; for the set of eight draws and the set of seven draws, the overall t-statistics are 4.38 and 6.81, respectively. All p-values are 0.00. We conclude that our earlier results regarding expected-difference questions hold for males.
Table 4 reports results for females only, corresponding to Table 1 results for the entire sample for the 15 expected-difference questions. The overall t-statistic is 7.26; for the set of eight draws and the set of seven draws, the overall t-statistics are 5.46 and 4.78, respectively. All p-values are 0.00. We conclude that our earlier results regarding expected-difference questions hold for females.

Table 5 reports male results relating to the eight no-expected-difference questions, corresponding to Table 2 results for the entire sample. The overall t-statistic is 0.28 (p-value 0.78); for the first set of four draws and the second set of four draws, the overall t-values are -1.52 and 1.91, respectively, with p-values 0.13 and 0.06. We conclude that our earlier results regarding student performance on no-expected-difference questions hold for males.

Table 6 reports female results relating to the eight no-expected-difference questions, corresponding to Table 2 results for the entire sample. From Table 6, the overall t-statistic for females is 0.93 (p-value 0.35); for the first set of four draws and the second set of four draws, the overall t-values are 0.97 and 0.34, respectively, with p-values 0.33 and 0.73. We conclude that our earlier results regarding student performance on no-expected-difference questions hold for females.

The third question of interest relating to gender examined here is whether the difference between MC and CR performance on expected-difference questions is the same for males and females. This question was examined by calculating, for each of the 15 questions, the difference between the male and female differences between average (adjusted) MC score and average CR score. For MCA1/CRB1, for example, the value in column 7 of Table 4 was subtracted from the value in column 7 of Table 3.
The standard error for this difference of differences was calculated and used to produce a t-statistic for testing if this difference of differences is equal to zero. The resulting 15 t-values were used to produce an overall t-statistic value of 0.32 (p-value 0.75). We conclude that the extent to which students score better on MC questions of the expected-difference type is the same for males as for females.

6. Good Versus Poor Students

One last question we investigated was whether our results were different for good students versus poor students. We arbitrarily defined good students as those who scored in the top 40% of the (truncated) class as based on scores on the common part of the exam and poor students as those in the bottom 40% of the (truncated) class. This resulted in 39 good students and 45 poor students in version A and 45 good students and 35 poor students in version B. Our method of analyzing good versus poor students matched exactly our method of analyzing male versus female students. We find that both good and poor students perform better on expected-difference MC questions (p-values 0.00 and 0.00, respectively) and that both good and poor students perform equally well on no-expected-difference questions (p-values 0.71 and 0.42, respectively). Further, we find that, for expected-difference questions, the extent to which performance on MC questions exceeds performance on corresponding CR questions is the same for good students as for poor students (p-value 0.22).

7. Conclusion

In Table 7, we summarize the substance of the empirical results via the overall t-statistics reported above. In all cases, the expected-difference overall t-statistics indicate that students score higher on MC questions than on "equivalent" CR questions. And in all cases, the no-expected-difference overall t-statistics indicate that students score about the same on MC questions as on "equivalent" CR questions.
The main conclusion of this article is that, for certain types of MC questions, students score better than on "equivalent" CR questions, even after correction for guessing. In these questions, students are required to create a short verbal description, the content of which is not obvious from the context of the question. In such cases, the multiple-choice alternatives appear to provide students with help in remembering/deducing the answer or help in articulating the answer in an unequivocal fashion. An example of the former case is a question asking for a fact of some kind, such as the definition of a technical term or the method by which a number is calculated. An example of the latter case is a question asking the student to produce an explanation for an observed phenomenon, such as how to complete a news clipping or how to explain an anomalous result. These results were shown to hold true for male students, female students, good students, and poor students, with male versus female and good versus poor having no impact on the magnitude of this result. Further work might investigate whether this result holds for difficult versus easier questions or for questions posed at different cognitive levels (Bloom's taxonomy).

We also concluded that there are types of MC questions that do not offer any advantage to students because their CR counterparts have possible answers that are so obvious that these questions are no different from the MC version. And last, on the basis of some seemingly anomalous results, we speculated that the distracters associated with MC
questions could make those MC questions more difficult by causing
students to worry about erroneous factors that they otherwise would not
have taken into consideration. Given that the relationship between MC
and CR questions depends on the type of question being asked, these
results raise some concerns about earlier studies investigating the
issue of whether MC and CR questions are highly correlated--the
particular nature of MC questions employed in these studies may have had
an undue influence on the results.
Table 1
Difference Between CR Scores and Adjusted MC Scores for the Expected-Difference Questions

MC         Average    Adjusted Average   CR         Average            Difference
Question   Score      Score              Question   Score              $\theta^*_{MC}$ -                 p-Value
           ($\pi^*$)  ($\theta^*_{MC}$)             ($\theta^*_{CR}$)  $\theta^*_{CR}$     t-Statistic   (One-Sided)
MCA1       0.87       0.83               CRB1       0.47                0.36                5.14         0.00 (a)
MCA3       0.68       0.57               CRB3       0.62               -0.05               -0.57         0.72
MCA5       0.66       0.55               CRB5       0.61               -0.06               -0.77         0.78
MCA6       0.90       0.87               CRB6       0.63                0.24                3.65         0.00 (a)
MCA7       0.69       0.59               CRB7       0.37                0.22                2.68         0.00 (a)
MCA8       0.97       0.96               CRB8       0.76                0.20                3.99         0.00 (a)
MCA11      0.66       0.55               CRB11      0.24                0.31                3.89         0.00 (a)
MCA12      0.47       0.29               CRB12      0.22                0.07                0.91         0.18
MCB1       0.89       0.85               CRA1       0.44                0.41                6.17         0.00 (a)
MCB3       0.80       0.73               CRA3       0.43                0.30                4.03         0.00 (a)
MCB5       0.78       0.71               CRA5       0.68                0.03                0.36         0.36
MCB6       0.80       0.73               CRA6       0.36                0.37                5.03         0.00 (a)
MCB7       0.80       0.73               CRA7       0.65                0.08                1.13         0.13
MCB11      0.72       0.63               CRA11      0.35                0.28                3.49         0.00 (a)
MCB12      0.81       0.75               CRA12      0.65                0.10                1.32         0.09

(a) Significant at the 5% level.
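
As a rough check on how the entries in Table 1 fit together, the following Python sketch recomputes the adjusted MC score, the score difference, and the t-statistic for one row from the formulas given in the text. The group sizes (94 students for exam A, 93 for exam B) are an assumption inferred from the truncated gender counts reported in Section 5, the two groups are assumed independent, and the function names are illustrative only.

```python
import math

GUESS_PROB = 0.25  # chance of guessing right, as stated in the text


def adjusted_mc_score(pi_star: float) -> float:
    """Guessing-corrected MC score: (pi* - 0.25) / 0.75."""
    return (pi_star - GUESS_PROB) / (1.0 - GUESS_PROB)


def question_t_statistic(pi_star: float, n_mc: int,
                         theta_cr: float, n_cr: int) -> float:
    """t-statistic for zero difference between adjusted MC and CR scores.

    Assumes the MC and CR groups are independent samples, so the
    variance of the difference is the sum of the two variances.
    """
    theta_mc = adjusted_mc_score(pi_star)
    # Variance of a proportion, scaled by (1/0.75)^2 for the MC correction.
    var_mc = pi_star * (1.0 - pi_star) / n_mc / (1.0 - GUESS_PROB) ** 2
    var_cr = theta_cr * (1.0 - theta_cr) / n_cr
    return (theta_mc - theta_cr) / math.sqrt(var_mc + var_cr)


# MCA1/CRB1 row of Table 1; group sizes 94 and 93 are ASSUMED, not reported.
t_mca1 = question_t_statistic(pi_star=0.87, n_mc=94, theta_cr=0.47, n_cr=93)
print(round(adjusted_mc_score(0.87), 2), round(t_mca1, 2))
```

Under these assumptions the MCA1/CRB1 row gives an adjusted score of about 0.83 and a t-statistic of about 5.14, in line with the tabulated values. Small discrepancies in other rows would be expected because the inputs are rounded to two decimals.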
Table 2
Difference Between CR Scores and Unadjusted MC Scores for the No-Expected-Difference Questions

MC         Average    CR         Average            Difference
Question   Score      Question   Score              $\pi^*$ - $\theta^*_{CR}$   t-Statistic   p-Value
           ($\pi^*$)             ($\theta^*_{CR}$)
MCA2       0.48       CRB2       0.39                0.09                        1.25         0.21
MCA4       0.35       CRB4       0.38               -0.03                       -0.43         0.67
MCA9       0.52       CRB9       0.53               -0.01                       -0.14         0.89
MCA10      0.20       CRB10      0.34               -0.14                       -2.18         0.03 (a)
MCB2       0.81       CRA2       0.73                0.08                        1.31         0.19
MCB4       0.56       CRA4       0.41                0.15                        2.07         0.04 (a)
MCB9       0.75       CRA9       0.71                0.04                        0.62         0.54
MCB10      0.76       CRA10      0.79               -0.03                       -0.47         0.64

(a) Significant at the 5% level.
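
The overall statistics quoted in the text for these questions can be sketched in Python from the t-values in Table 2. The sketch treats each t-value as a draw with variance of approximately one, so the mean of n draws has variance 1/n, and refers the result to a standard normal distribution. Using a two-sided p-value is an assumption of this sketch (it is what appears to match the reported figures), and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def overall_statistic(t_values):
    """Mean of the t-values divided by its standard error sqrt(1/n)."""
    n = len(t_values)
    return (sum(t_values) / n) / math.sqrt(1.0 / n)


def two_sided_p(z):
    """Two-sided p-value under a standard normal (assumed convention)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# t-values from Table 2: exam A's four MC questions, then exam B's four.
group_a = [1.25, -0.43, -0.14, -2.18]
group_b = [1.31, 2.07, 0.62, -0.47]

z_all = overall_statistic(group_a + group_b)
z_a = overall_statistic(group_a)
z_b = overall_statistic(group_b)

print(f"{z_all:.2f} {two_sided_p(z_all):.2f}")  # 0.72 0.47
print(f"{z_a:.2f} {two_sided_p(z_a):.2f}")      # -0.75 0.45
```

The same function applied to the second group gives a value close to the 1.77 reported in the text. For the expected-difference questions the text states that a one-sided test is used instead, so `two_sided_p` would not apply there.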
Table 3
Difference Between CR Scores and Adjusted MC Scores for the Expected-Difference Questions, Male Students

MC         Average    Adjusted Average   CR         Average            Difference
Question   Score      Score              Question   Score              $\theta^*_{MC}$ -                 p-Value
           ($\pi^*$)  ($\theta^*_{MC}$)             ($\theta^*_{CR}$)  $\theta^*_{CR}$     t-Statistic   (One-Sided)
MCA1       0.85       0.80               CRB1       0.49                0.31                3.34         0.00 (a)
MCA3       0.65       0.53               CRB3       0.69               -0.16               -1.47         0.93
MCA5       0.69       0.59               CRB5       0.68               -0.09               -0.89         0.81
MCA6       0.88       0.84               CRB6       0.64                0.20                2.31         0.01 (a)
MCA7       0.63       0.51               CRB7       0.42                0.09                0.79         0.21
MCA8       0.96       0.95               CRB8       0.75                0.20                2.94         0.00 (a)
MCA11      0.75       0.67               CRB11      0.25                0.42                4.26         0.00 (a)
MCA12      0.48       0.31               CRB12      0.19                0.12                1.11         0.13
MCB1       0.88       0.84               CRA1       0.40                0.44                4.98         0.00 (a)
MCB3       0.78       0.71               CRA3       0.50                0.21                2.07         0.02 (a)
MCB5       0.80       0.73               CRA5       0.65                0.08                0.87         0.19
MCB6       0.78       0.71               CRA6       0.33                0.38                3.88         0.00 (a)
MCB7       0.80       0.73               CRA7       0.62                0.11                1.17         0.12
MCB11      0.73       0.64               CRA11      0.29                0.35                3.52         0.00 (a)
MCB12      0.78       0.71               CRA12      0.56                0.15                1.47         0.07

(a) Significant at the 5% level.
Table 4
Difference Between CR Scores and Adjusted MC Scores for the Expected-Difference Questions, Female Students

MC         Average    Adjusted Average   CR         Average            Difference
Question   Score      Score              Question   Score              $\theta^*_{MC}$ -                 p-Value
           ($\pi^*$)  ($\theta^*_{MC}$)             ($\theta^*_{CR}$)  $\theta^*_{CR}$     t-Statistic   (One-Sided)
MCA1       0.90       0.87               CRB1       0.44                0.43                4.06         0.00 (a)
MCA3       0.71       0.61               CRB3       0.50                0.11                0.89         0.19
MCA5       0.62       0.49               CRB5       0.50               -0.01               -0.05         0.52
MCA6       0.93       0.91               CRB6       0.62                0.29                2.91         0.00 (a)
MCA7       0.76       0.68               CRB7       0.26                0.42                3.63         0.00 (a)
MCA8       0.98       0.97               CRB8       0.79                0.18                2.43         0.00 (a)
MCA11      0.55       0.40               CRB11      0.21                0.19                1.53         0.06
MCA12      0.45       0.27               CRB12      0.26                0.01                0.05         0.48
MCB1       0.91       0.88               CRA1       0.48                0.40                3.96         0.00 (a)
MCB3       0.82       0.76               CRA3       0.33                0.43                3.77         0.00 (a)
MCB5       0.76       0.68               CRA5       0.71               -0.03               -0.25         0.60
MCB6       0.82       0.76               CRA6       0.40                0.36                3.11         0.00 (a)
MCB7       0.79       0.72               CRA7       0.69                0.03                0.26         0.39
MCB11      0.71       0.61               CRA11      0.43                0.18                1.42         0.08
MCB12      0.85       0.80               CRA12      0.76                0.04                0.38         0.35

(a) Significant at the 5% level.
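
The difference-of-differences test described in Section 5 can be illustrated with the MCA1/CRB1 rows of the male and female tables. The Python sketch below is only one plausible reading of that description: it assumes the male and female samples are independent, so the variance of the difference of differences is the sum of the four underlying variances, and it uses the truncated group sizes reported in the text (52 and 42 for exam A, 59 and 34 for exam B). The helper name is hypothetical, and the paper does not report per-question values for this test.

```python
import math


def adjusted_diff_and_var(pi_star, n_mc, theta_cr, n_cr, guess=0.25):
    """Return (adjusted MC score - CR score) and the variance of that gap."""
    theta_mc = (pi_star - guess) / (1.0 - guess)
    var_mc = pi_star * (1.0 - pi_star) / n_mc / (1.0 - guess) ** 2
    var_cr = theta_cr * (1.0 - theta_cr) / n_cr
    return theta_mc - theta_cr, var_mc + var_cr


# MCA1/CRB1: males (52 wrote exam A, 59 wrote exam B after truncation).
male_diff, male_var = adjusted_diff_and_var(0.85, 52, 0.49, 59)
# MCA1/CRB1: females (42 wrote exam A, 34 wrote exam B after truncation).
female_diff, female_var = adjusted_diff_and_var(0.90, 42, 0.44, 34)

# Male gap minus female gap, divided by its standard error
# (independence of the two samples is an ASSUMPTION of this sketch).
t_dd = (male_diff - female_diff) / math.sqrt(male_var + female_var)
print(round(t_dd, 2))
```

A value well inside the usual critical bounds for a single question is consistent with the paper's overall finding of no gender difference, although the overall statistic reported in the text combines all 15 such values.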
Table 5
Difference Between CR Scores and Unadjusted MC Scores for the No-Expected-Difference Questions, Male Students

MC         Average    CR         Average            Difference
Question   Score      Question   Score              $\pi^*$ - $\theta^*_{CR}$   t-Statistic   p-Value
           ($\pi^*$)             ($\theta^*_{CR}$)
MCA2       0.48       CRB2       0.47                0.01                        0.11         0.92
MCA4       0.27       CRB4       0.39               -0.12                       -1.36         0.12
MCA9       0.56       CRB9       0.56                0.00                        0.00         1.00
MCA10      0.21       CRB10      0.36               -0.15                       -1.78         0.07
MCB2       0.78       CRA2       0.69                0.09                        1.07         0.28
MCB4       0.56       CRA4       0.40                0.16                        1.71         0.09
MCB9       0.81       CRA9       0.69                0.12                        1.46         0.14
MCB10      0.80       CRA10      0.83               -0.03                       -0.41         0.68
Table 6
Difference Between CR Scores and Unadjusted MC Scores for the No-Expected-Difference Questions, Female Students

MC         Average    CR         Average            Difference
Question   Score      Question   Score              $\pi^*$ - $\theta^*_{CR}$   t-Statistic   p-Value
           ($\pi^*$)             ($\theta^*_{CR}$)
MCA2       0.48       CRB2       0.24                0.24                        2.26         0.01 (a)
MCA4       0.45       CRB4       0.35                0.10                        0.89         0.37
MCA9       0.48       CRB9       0.47                0.01                        0.09         0.94
MCA10      0.19       CRB10      0.32               -0.13                       -1.30         0.20
MCB2       0.85       CRA2       0.79                0.06                        0.68         0.50
MCB4       0.56       CRA4       0.43                0.13                        1.14         0.26
MCB9       0.65       CRA9       0.74               -0.09                       -0.85         0.40
MCB10      0.71       CRA10      0.74               -0.03                       -0.29         0.77

(a) Significant at the 5% level.
Table 7
Summary of Overall t-Statistic Results

Hypothesis Test Category                   t-Value   p-Value
All data
  15 expected-difference questions          10.44     0.00
  8 no-expected-difference questions         0.72     0.47
Male data
  15 expected-difference questions           7.78     0.00
  8 no-expected-difference questions         0.28     0.78
Female data
  15 expected-difference questions           7.26     0.00
  8 no-expected-difference questions         0.93     0.35
Good-student data
  15 expected-difference questions           7.35     0.00
  8 no-expected-difference questions         0.53     0.71
Poor-student data
  15 expected-difference questions           7.30     0.00
  8 no-expected-difference questions         0.85     0.42
(1.) An interesting dilemma stems from this phenomenon. It could be argued that a student who is able to eliminate one or more of the distracters must know something about the concept being questioned, and so the change in guessing odds is an indirect way of providing such a student with partial credit. In this view, it could be claimed that, in our study, the MC score is a more accurate reflection of a student's understanding than is the CR score because, in the latter case, no partial credit is given.

(2.) Several different ways of analyzing our data using specific functional forms are possible. A three-parameter logistic model based on item response theory (see Hambleton and Swaminathan 1985) could be used to estimate item characteristic curves for each question. This approach is common in the existing literature; Walstad and Robson (1997) is an application in economic education. Swaminathan and Rogers (1990) suggest using a logistic analysis, which would be more intuitively appealing to economists.

References

Anderson, Gordon, Dwayne Benjamin, and Melvyn A. Fuss. 1994. The determinants of success in university introductory economics courses. Journal of Economic Education 25:99-119.

Becker, William E., and Carol Johnston. 1999. The relationship between multiple choice and essay response questions in assessing economics understanding. Economic Record 75:348-57.

Becker, William E., and Michael Watts. 2001. Teaching economics at the start of the 21st century: Still chalk and talk. American Economic Review, Papers and Proceedings 91:446-51.

Bridgeman, Brent. 1992. A comparison of quantitative questions in open-ended and multiple-choice formats. Journal of Educational Measurement 29:253-71.

Hambleton, Ronald K., and Hariharan Swaminathan. 1985. Item response theory: Principles and application. Boston, MA: Kluwer.

Hirschfeld, Mary, Robert Moore, and Eleanor Brown. 1995. Exploring the gender gap on the GRE subject test in economics. Journal of Economic Education 26:3-16.

Katz, Irvin R., Randy E. Bennett, and Aliza E. Berger. 2000. Effects of response format on difficulty of SAT-mathematics items: It's not the strategy. Journal of Educational Measurement 37:39-57.

Kennedy, Peter E., and William B. Walstad. 1997. Combining multiple-choice and constructed-response test scores: An economist's view. Applied Measurement in Education 10:359-75.

Lumsden, Keith G., and Alex Scott. 1987. The economics student reexamined: Male-female differences in comprehension. Journal of Economic Education 18:365-75.

Messick, Samuel. 1993. Trait equivalence as construct validity of score interpretation across multiple methods of measurement. In Construction versus choice in cognitive measurement, edited by Randy E. Bennett and William C. Ward. Hillsdale, NJ: Lawrence Erlbaum, pp. 61-73.

Siegfried, John J., and Peter E. Kennedy. 1995. Does pedagogy vary with class size in introductory economics? American Economic Review, Papers and Proceedings 85:347-51.

Snow, Richard E. 1993. Construct validity and constructed-response tests. In Construction versus choice in cognitive measurement, edited by Randy E. Bennett and William C. Ward. Hillsdale, NJ: Lawrence Erlbaum, pp. 45-60.

Swaminathan, Hariharan, and H. Jane Rogers. 1990. Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement 27:361-70.

Sykes, Robert C., and Wendy M. Yen. 2000. The scaling of mixed-item format tests with one-parameter and two-parameter partial credit model. Journal of Educational Measurement 37:221-44.

Tate, Richard. 2000. Performance of a proposed method for the linking of mixed format tests with constructed response and multiple choice items. Journal of Educational Measurement 37:329-46.

Traub, Ross E. 1993. On the equivalence of the traits assessed by multiple-choice and constructed-response tests. In Construction versus choice in cognitive measurement, edited by Randy E. Bennett and William C. Ward. Hillsdale, NJ: Lawrence Erlbaum, pp. 29-44.

Wainer, Howard, and David Thissen. 1993. Combining multiple-choice and constructed-response test scores: Toward a Marxist theory of test construction. Applied Measurement in Education 6:103-18.

Walstad, William B. 1998. Multiple choice tests for the economics course. In Teaching undergraduate economics: A handbook for instructors, edited by William B. Walstad and Phillip Saunders. New York: McGraw-Hill, pp. 287-304.

Walstad, William B., and William E. Becker. 1994. Achievement differences on multiple-choice and essay tests in economics. American Economic Review, Papers and Proceedings 84:193-96.

Walstad, William B., and Denise Robson. 1997. Differential item functioning and male-female differences on multiple-choice tests in economics. Journal of Economic Education 28:155-71.

Welsh, Arthur L., and Phillip Saunders. 1998. Essay questions and tests. In Teaching undergraduate economics: A handbook for instructors, edited by William B. Walstad and Phillip Saunders. New York: McGraw-Hill, pp. 305-318.

Nixon Chan (*)
Peter E. Kennedy (+)

(*.) Economics Department, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.

(+.) Economics Department, Simon Fraser University, Burnaby, BC V5A 1S6, Canada; E-mail kennedy@sfu.ca; corresponding author.

We thank Bill Walstad for help with the literature search.