
Assessment of thinking levels in students' answers.


Authors' Note: This research was supported in part by a grant to J.J. Pear from the Social Sciences and Humanities Research Council of Canada. D.E. Crone-Todd was supported by a fellowship from the Social Sciences and Humanities Research Council of Canada. The authors gratefully acknowledge Ms. Sabrina Berry's assistance with this research project.

Abstract

Having first developed a method, based on Bloom's taxonomy (1956), for assessing the thinking levels required by study questions in computer-mediated courses (Crone-Todd, Pear & Read, 2000), we developed a method for assessing the levels at which students answer the questions. Reliability measures between two independent assessment groups were high (i.e., > 80%). The assessment procedure can serve diagnostic and research purposes in determining how to enable students to increase their thinking levels in post-secondary courses.

**********

Assessment of Thinking Levels in Students' Answers

One of the most important goals of post-secondary education is to promote the use of critical, or higher-order, thinking skills. To this end, educators must find ways to identify, teach, and encourage the use of these skills in their courses.

One of the largest hurdles in this process is developing a precise operational definition, or set of definitions, for what is meant by "higher-order thinking". There is, however, a lack of consensus concerning the definition of this construct. For example, higher-order thinking may be "reasoned argumentation" (Newman, 1991a, b), comparing elements in terms of sameness (Carnine, 1991), application of concepts or principles (Hohn, Gallagher, & Byrne, 1990; Semb & Spencer, 1976), making discipline-related judgments that are effective (Paul & Heaslip, 1995), or argumentation that is systematic and active (Mayer & Goodchild, 1990). It seemed to us that all of these definitions include various components of what is considered "higher-order thinking", or thinking that requires combining elements in different ways than those provided in a textbook or other course materials.

A set of definitions that appears to incorporate all of the definitions above is Bloom's (1956) taxonomy of objectives in the cognitive domain. The taxonomy, which incorporates behavioral definitions of cognitive processes, has been used in a variety of educational settings. Despite its popularity, however, those using the taxonomy for research purposes have encountered problems with its reliability (e.g., Calder, 1983; Gierl, 1997; Kottke & Schuster, 1990; Roberts, 1976; Seddon, 1978; Seddon, Chokotho, & Merritt, 1981). Recently, Crone-Todd, Pear, & Read (2000) used a modified version of Bloom's (1956) taxonomy in the cognitive domain to identify the thinking levels required by study questions in a computer-aided personalized system of instruction (CAPSI) course. The purpose of the study was to begin the development of a more reliable measure of higher-order thinking in CAPSI-taught courses using guided study questions (e.g., Pear & Crone-Todd, 1999; http://home.cc.umanitoba.ca/~capsi) than had been previously reported in the literature. Following the taxonomy, the thinking levels were: (a) Level 1 - Knowledge, (b) Level 2 - Comprehension, (c) Level 3 - Application, (d) Level 4 - Analysis, (e) Level 5 - Synthesis, and (f) Level 6 - Evaluation. Briefly, in the modified taxonomy, Level 1 corresponds to rote learning, Level 2 involves the ability to state an answer in one's own words, Level 3 is the ability to apply what one has learned to new problems or situations, Level 4 is the ability to break down concepts into smaller components, Level 5 is the ability to combine concepts to create new knowledge, and Level 6 is the ability to rationally argue or discuss a position with regard to a given topic. Levels 1 and 2 may be considered lower-order thinking (because they do not involve generation of new concepts or knowledge), while levels 3 through 6 may be considered higher-order thinking (see Crone-Todd et al., 2000, Table 1). Hence, the higher-order levels involve definitions that are similar to the ones explored by researchers other than Bloom, discussed above.

Following Williams' (1999) exhortation to operationally define constructs in education, Crone-Todd et al. (2000) developed a precise, step-by-step procedure for determining the level of any given question. They applied the procedure to the study questions in two CAPSI-taught courses and showed that high agreement on thinking levels could be obtained by independent groups of three raters. Independent groups, rather than individuals, were used because discussion among raters appeared to increase the reliability of the assessments. The present study undertook the next step, which is to develop a reliable method for assessing the thinking levels at which the questions are answered. Note that this assessment is different from the assessment of question levels in several important respects. First, in order to determine the level of an answer, one must determine whether the answer is correct both with respect to terminology and content. Second, because we wish to give the student the benefit of the doubt with regard to the adequacy of his or her answer, the process of determining the answer level proceeds in roughly the reverse order of the process of determining the question level. Thus, while the question is assessed at the lowest possible level at which the student can answer it, the answer is assessed at its highest possible level.
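The asymmetry between the two assessments can be illustrated with a minimal sketch. It is not the authors' flowchart procedure (Figure 1 is not reproduced here); the function names and the idea of passing in a list of candidate levels judged by a rater are hypothetical, introduced only for illustration. Level 0, used in the Results for an answer with inappropriate terminology or incorrect content, is assumed here as the value for any incorrect answer.

```python
from typing import Iterable


def question_level(levels_that_suffice: Iterable[int]) -> int:
    """Hypothetical helper: a question is assessed at the LOWEST level
    (1-6) at which a student could answer it."""
    return min(levels_that_suffice)


def answer_level(is_correct: bool, levels_demonstrated: Iterable[int]) -> int:
    """Hypothetical helper: an answer is assessed at the HIGHEST level
    (1-6) it demonstrates, giving the student the benefit of the doubt.
    An answer with inappropriate terminology or incorrect content is
    assigned level 0."""
    if not is_correct:
        return 0
    return max(levels_demonstrated)


# A question answerable at level 2 or 3 is rated level 2, while a correct
# answer showing level 2 and level 3 thinking is rated level 3.
q = question_level([2, 3])
a = answer_level(True, [2, 3])
```

In this sketch the rater's judgments (which levels suffice, which levels are demonstrated, whether the answer is correct) remain human decisions; the code only captures the min/max rule stated above.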

In the following we detail how we developed the answer-level assessment procedure and tested its reliability. This study parallels and extends Crone-Todd et al. (2000), which details the development of the question-level assessment procedure.

Method

The data for this study consisted of the archived answers provided by four randomly selected students on two midterm examinations and a final examination from an undergraduate computer-mediated Behavior Modification Principles course, taught from January to April 2000. Both midterm examinations consisted of three short-answer essay-type questions, and the final examination consisted of 10 such questions. The examination questions were sampled from study questions that had been assessed in the Crone-Todd et al. (2000) study. The assessment focused on components of questions rather than individual questions, since a given question may have more than one component and the components may be at different levels. For example, a question that asks, "With examples, distinguish between rule-governed and contingency-shaped behavior" is broken into the following three components for assessment purposes: (1) an example of rule-governed behavior; (2) an example of contingency-shaped behavior; and (3) a clear distinction made between the two types of behavior. The analyses included 64 examination questions, which yielded 168 such components. The number of components assessed for each level is as follows: (a) Level 1: 49; (b) Level 2: 92; (c) Level 3: 11; (d) Level 4: 14; (e) Level 5: 1; and (f) Level 6: 0. The fact that levels 5 and 6 were not represented on the examinations is not unusual; we have found that typically few students do well on these types of questions and most are frustrated by them. Thus, in essence, levels 5 and 6 were not assessed in this study.

As a guide to assessing the answers, the assessors used a flowchart (see Figure 1). The flowchart, along with the assessment instructions, was revised by raters in the present study, as had been done for the flowchart in the Crone-Todd et al. (2000) study, to increase the precision of the assessment, as determined by the level of agreement obtained. There were two independent groups of raters, each consisting of two research assistants. One group comprised a doctoral candidate (the second author) and a third-year undergraduate student (the third author), while the other group comprised a third-year undergraduate student and an individual with a B.A. in Psychology (the fourth author). See <http://rapidintellect.com/AEQweb/win01.htm>.

Answers were assessed for each component of a question. As seen in Figure 1, answer components were assessed only if they used appropriate terminology and were correct. Raters were given the questions for the exams and the student's answers, but not the student identification, question levels, feedback, or grade on the exams. After each rater had independently assessed the exams, each group met separately to discuss and come to an agreement on the answer levels. The two groups then met and compared (without changing) their assessments for the purpose of obtaining an estimate of the intergroup reliability. The groups then discussed disagreements and came to a final agreement on the levels of the answers. Raters had to agree on a given student's answers before assessing the next student's answers. This process was repeated for the assessment of all four students' examinations.

Results

For the purpose of calculating reliability, levels 1 and 2 (lower level thinking) and levels 4, 5, and 6 were combined to increase the instances in each category and to reduce the fineness of the discriminations required to obtain agreement. (Note that level 0 corresponds to an answer that used inappropriate terminology or that was otherwise incorrect.) Table 1 shows the total number of agreements and disagreements in each category, the point-to-point agreement or percent of agreements (i.e., agreements divided by agreements plus disagreements and multiplied by 100) between groups for each level, the value of the between-groups Kappa statistic for each level, and the interpretation of the Kappa values obtained (Landis & Koch, 1977). The main advantage of the Kappa statistic is that it takes chance agreements into account (Kazdin, 1982). Note that the point-to-point agreements are all above 80% and that the Kappa values are all in the substantial range.
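The reliability calculations described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the function names and the sample ratings are hypothetical, and the paper does not specify how a "disagreement" was counted for a given level or how the per-level Kappa was computed. Here a disagreement for a category is assumed to be any component that exactly one group placed in that category, and Kappa is the standard Cohen's Kappa over the collapsed categories. The interpretation bands follow Landis and Koch (1977).

```python
from collections import Counter
from typing import Sequence


def collapse(level: int) -> str:
    """Combine levels as in the Results: 0, 1-2, 3, and 4-6."""
    if level == 0:
        return "0"
    if level in (1, 2):
        return "1-2"
    if level == 3:
        return "3"
    if level in (4, 5, 6):
        return "4-6"
    raise ValueError(f"unknown level: {level}")


def category_agreement(a: Sequence[str], b: Sequence[str], cat: str) -> float:
    """Point-to-point agreement for one category:
    agreements / (agreements + disagreements) * 100.
    Assumption: a disagreement is a component that exactly one group
    assigned to `cat`."""
    agree = sum(x == cat and y == cat for x, y in zip(a, b))
    disagree = sum((x == cat) != (y == cat) for x, y in zip(a, b))
    return 100.0 * agree / (agree + disagree)


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e), where p_o is observed
    agreement and p_e is the agreement expected by chance."""
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:  # both groups used a single identical category
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


def interpret_kappa(kappa: float) -> str:
    """Strength-of-agreement labels from Landis and Koch (1977)."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical answer levels from two rater groups for eight components.
group_a = [collapse(v) for v in [1, 2, 3, 4, 0, 2, 1, 3]]
group_b = [collapse(v) for v in [2, 2, 3, 4, 0, 1, 1, 2]]
kappa = cohen_kappa(group_a, group_b)
```

Because Kappa subtracts the agreement expected by chance, it is lower than the raw percent agreement whenever one category dominates, which is why the two measures are reported side by side.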

See issue's website <http://rapidintellect.com/AEQweb/win01.htm>.

Discussion

The results show that the reliability of the answer assessment method described in this paper is high. While the Crone-Todd et al. (2000) study provided a reliable method for assessing questions, the present study extends this research by advancing the methodology. Reliable methods now exist for distinguishing both questions and answers at levels 1 and 2 from levels 3 and 4. While more work is needed on reliably assessing finer differences between answer levels, the present study in combination with the Crone-Todd et al. (2000) study represents an advance in higher-level assessment. This advance in methodology has advantages for both researchers and teachers.

As researchers, we are in a better position to study how answer levels, and therefore thinking levels, may be increased. For instance, it has been suggested (Solman & Rosen, 1986) that having too many higher level questions in a course may lead to higher drop out rates and lower average scores. If this is the case, then thinking level may be adversely affected by too many higher-level questions. On the other hand, it may be possible to systematically incorporate higher- and lower-level questions in a way that facilitates higher-level thinking. Hence, one empirical issue to address is the optimal ratio of higher- to lower-level questions in a given course as the course progresses. It may be, for example, that as this ratio increases, or if it increases too rapidly as a course progresses, student performance decreases.

There are other research questions that can be answered with this assessment methodology used as a base. For example, the methodology will permit us to identify early in a course which individuals are having difficulty answering the higher level questions, even though they may be mastering the lower level questions (and thus showing appropriate motivation). Once such individuals are identified, we would then be able to study what types of remedial procedures would be required to help raise these individuals' thinking levels to the point at which they could answer more of the higher-level questions.

The methodology incorporated in this study would also permit us to study how the addition of various supports (e.g., class discussions, specific teaching of the levels, feedback on the levels) might generate higher-level thinking in students' answers to the questions. This might be extended to facilitating students generating answers above the level of the question (e.g., Crone-Todd, 2001). The two assessment procedures go hand-in-hand because we need both in order to determine whether a student has answered a given question at, above, or below the level of the question.
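As a minimal sketch of how the two procedures combine, the comparison of an answer level with its question level might look like the following. The function name and the returned labels are hypothetical; level 0 is treated as an incorrect answer, as in the Results.

```python
def compare_levels(question_level: int, answer_level: int) -> str:
    """Hypothetical helper: classify an answer relative to the level
    of the question it responds to."""
    if answer_level == 0:
        # Inappropriate terminology or otherwise incorrect content.
        return "incorrect"
    if answer_level > question_level:
        return "above"
    if answer_level < question_level:
        return "below"
    return "at"


# A level 3 answer to a level 2 question is classified as "above".
result = compare_levels(question_level=2, answer_level=3)
```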

The importance of this study for teachers is grounded in the research. If higher education involves learning to think at the higher levels identified in the modified Bloom's taxonomy, then the assessment methodology described here provides a direction for helping institutions of higher education fulfill their purpose. Learning systems, such as CAPSI, that are designed with links to theory and empirically validated research (i.e., "grounded designs"; Hannafin, Hannafin, Land & Oliver, 1997), provide one approach for achieving the aims of educators who wish to help their students develop higher-order thinking skills.

References

Bloom, B.S. (1956). Taxonomy of educational objectives: Cognitive and affective domains. New York: David McKay.

Calder, J.R. (1983). In the cells of the "Bloom Taxonomy". Journal of Curriculum Studies, 15, 291-302.

Carnine, D. (1991). Curricular interventions for teaching higher order thinking to all students: Introduction to the special series. Journal of Learning Disabilities, 24 (5), 261-269.

Crone-Todd, D.E. (in progress). Increasing the levels at which undergraduate students answer questions in a computer-aided personalized system of instruction course. Unpublished doctoral dissertation, University of Manitoba, Winnipeg, Manitoba, Canada.

Crone-Todd, D.E., Pear, J.J., & Read, C.N. (2000). Operational definitions for higher-order thinking objectives at the post-secondary level. Academic Exchange Quarterly, 4, 99 - 106. [There is an error in Figure 1 of this article; for the correct figure, see http://home.cc.umanitoba.ca/~capsi]

Gierl, M. J. (1997). Comparing cognitive representations of test developers and students on a mathematics test with Bloom's taxonomy. The Journal of Educational Research, 91, 26-32.

Hannafin, M.J., Hannafin, K.M., Land, S.M., & Oliver, K. (1997). Grounded practice and the design of constructivist learning environments. Educational Technology Research and Development, 45, 101-117.

Hohn, R.L., Gallagher, T., & Byrne, M. (1990). Instructor-supplied notes and higher-order thinking. Journal of Instructional Psychology, 17 (2), 71-74.

Kazdin, A.E. (1982). Single-case research designs: Methods for clinical and applied settings. New York: Oxford University Press.

Kottke, J. L., & Schuster, D. H. (1990). Developing tests for measuring Bloom's learning outcomes. Psychological Reports, 66, 27-32.

Landis, J., & Koch, G.G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.

Mayer, R., & Goodchild, F. (1990). The critical thinker. Santa Barbara, CA: Wm. C. Brown Publishers.

Newman, F.M. (1991a). Promoting higher order thinking in social studies: Overview of a study of 16 high school departments. Theory and Research in Social Education, 19, 324-340.

Newman, F.M. (1991b). Classroom thoughtfulness and students' higher order thinking: Common indicators and diverse social studies courses. Theory and Research in Social Education, 19, 410-433.

Paul, R.W. (1985). Bloom's taxonomy and critical thinking instruction. Educational Leadership, 42(8), 36-39.

Paul, R.W., & Heaslip, P. (1995). Critical thinking and intuitive nursing practice. Journal of Advanced Nursing, 22, 40-47.

Pear, J. J., & Crone-Todd, D. E. (1999). Personalized system of instruction in cyberspace. Journal of Applied Behavior Analysis, 32, 205-209.

Roberts, N. (1976). Further verification of Bloom's taxonomy. Journal of Experimental Education, 45, 16-19.

Seddon, G. (1978). The properties of Bloom's taxonomy of educational objectives for the cognitive domain. Review of Educational Research, 48, 303-323.

Seddon, G. M., Chokotho, N. C., & Merritt, R. (1981). The identification of radex properties in objective test items. Journal of Educational Measurement, 18, 155-170.

Semb, G., & Spencer, R. (1976). Beyond the level of recall: An analysis of complex educational tasks in college and university instruction. In L.E. Fraley & E. A. Vargas (Eds.), Behavior Research and Technology in College and University Instruction (pp. 115-126). Gainesville, FL: Department of Psychology, University of Florida.

Solman, R., & Rosen, G. (1986). Bloom's six cognitive levels represent two levels of performance. Educational Psychology, 6, 243-263.

Williams, R. L. (1999). Operational definitions and assessment of higher-order cognitive constructs. Educational Psychology Review, 411-427.

J.J. Pear is a Professor of Psychology, conducting research in basic and applied behavior analysis. He has co-authored a popular textbook on behavior modification, authored a textbook on learning, and has been awarded the Fred S. Keller Behavioral Education Award by the American Psychological Association (2002) for distinguished contributions to education. D.E. Crone-Todd is a doctoral student in psychology, completing her dissertation on higher-order thinking. In addition, she has published on the use of computer-aided personalized system of instruction. K.M. Wirth is a 4th year student in psychology, and has presented reviews of personalized system of instruction at conferences. H.D. Simister recently graduated from the psychology program, and will be pursuing a graduate program in psychology.
COPYRIGHT 2001 Rapid Intellect Group, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2001, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Author:Simister, Heather D.
Publication:Academic Exchange Quarterly
Date:Dec 22, 2001