
Achievement calibration and causal attributions.


The way students explain success or failure on a test affects motivation to achieve on future tests. The purpose of the present study was to investigate whether these explanations are related to accuracy of self-assessments of preparedness prior to a test as well as self-assessments for performance immediately following a test. Attributions for good and poor calibrators were compared to identify possible distinctions between these groups. In general, good calibrators reported attributions that would likely enhance future motivation more than poor calibrators.


People are often inaccurate in their self-assessments, whether in judging their level of attractiveness (Gabriel, Critelli, & Ee, 1994), soundness of their reasoning, quality of their humor, or correctness of grammar in written work (Kruger & Dunning, 1999). Interestingly, inaccurate self-assessments are not necessarily a bad thing, especially when the error is on the side of self-enhancement. In fact, some researchers have found positive outcomes linked with optimistic beliefs, such as a propensity to persist in the face of failure (Bandura, 1977; Pajares, 1996), willingness to take risks to gain greater rewards (Seligman, 1990), and tendency to cope more effectively with physiological stress (Bandura, 1995; Seligman, 1990). Therefore, in many situations, the value of accuracy in one's self-assessments is debatable. However, in the classroom, educators posit that inaccurate self-assessments may contribute to poor learning outcomes because over-optimism may lead to underestimating the demands of academic tasks, failing to study sufficiently for exams, engaging in superficial reading of complicated material, and overlooking important information in complex problems (Hacker, Bol, Horgan, & Rakow, 2000). Likewise, overly negative self-assessment in relation to a task may result in discouragement, lack of motivation, and failure to approach the task at all (Pajares & Schunk, 2002). Thus, for academic tasks, accurate self-assessment may be best.

Bandura (1977) recommends a strategy for measuring the accuracy of self-assessment for a specific domain of study. The strategy involves asking participants to judge their capability to perform a task and then giving them the same or a similar task to perform. Achievement calibration accuracy is the extent to which judgments of capability (e.g., predictions for performance on an exam) match actual performance (e.g., actual score on the exam). Hacker, Bol, Horgan, and Rakow (2000) utilized this strategy and found that as many as one third of the undergraduate students who participated in their study markedly overestimated preparedness for the first course exam, earning scores 10 or more points lower than predicted. Similarly, Garavalia, Ray, Murdock, and Gredler (2004) reported that 37% of the students in their study expected to earn final course grades at least half a letter grade above actual final grades. In both of these studies, lower achieving students tended to be the most inaccurate in their self-assessments. Likewise, Kruger and Dunning (1999) found a significant negative correlation between self-assessment and objective test performances. In four different studies, participants in the bottom quartile substantially overestimated performance in a variety of domains and, more interestingly, believed their scores were above average.

One purpose of the present study was to explore disparities between expected and actual achievement by investigating how students explain their performance when they do and do not perform in line with expectations. We believe these causal explanations are important to explore because prior research indicates that the causes to which students attribute their performance affect future performance (Weiner, 1986, 2000). In addition, prior research indicates that poor calibrators frequently fail to adjust self-assessments, even in the face of negative feedback from evaluative sources (e.g. receiving a failing grade from a professor) (Hacker et al., 2000). Failure to integrate the feedback and modify expectations might be understood in relation to how the student explains what happened. For example, the student might explain a failing test grade with an external-locus statement, such as "I failed the test because the teacher asked really hard questions." This attribution removes responsibility from the student for the performance so s/he is not likely to modify behavior or expectations.

Failure is a particularly important outcome to investigate because failure-generated attributions tend to be more ambiguous than success-generated attributions (Kruger & Dunning, 1999). If one fails, the sources of blame are just about endless, whereas, when one succeeds, the credit usually goes to one's ability or effort. Because of the potential disparity between success and failure attributions, we believe that miscalibrators will provide different reasons for performance than more accurate peers, especially low-achieving miscalibrators.

Present Study

This study is an initial exploration of the relationship between calibration accuracy and causal attributions. Students' calibration accuracy was computed and attributions for performance were collected for two exams in a graduate-level Educational Psychology Research Methods course. Weiner (1986) suggests that three aspects of attributions are important to consider: locus of causality, stability, and controllability. Some attributions are more supportive of future motivation than others, and we believed that graduate students would generally report attributions that would enhance achievement motivation, given their demonstrated persistence in educational pursuits. Next, we examined the calibration accuracy of the graduate students. We were interested in the consistency of calibration accuracy across achievement groups and expected that poor calibration would be associated with lower achievement. Lastly, we examined qualitative distinctions in the manner in which good and poor calibrating students explained their performance. The three research questions addressed in this study are as follows:

1. To what types of causes do graduate students most frequently attribute their test performance?

2. How accurate are graduate students in their achievement calibration?

3. Are there qualitative differences in the manner in which good and poor calibrating graduate students explain test performance?



Fifty-eight graduate students enrolled in two sections of an Educational Psychology Research Methods course at a Midwestern university were asked to participate in the study. Ten students were deleted from the final analyses due to incomplete data, leaving a sample size of 48. Seventy-nine percent of the sample was female and the majority of students were Caucasian (81%). The remainder of the sample was 15% African-American, 2% Hispanic, and 2% reported other for race. The average age was 33 (sd=10). All of the students were seeking graduate degrees in the School of Education with the largest number of students pursuing master's degrees in Counseling.


Predictions. Prior to each of two exams, students were given a take-home survey asking them to predict their exam score and describe their reasons for expecting that level of performance. Students returned the survey at the beginning of the test period.

Postdictions. The final question on each exam asked students to postdict their exam score.

Exam scores. Three exams were administered during the course. Only the first two exam scores were used in our analyses, because the third was the final exam, and we were unable to administer a follow-up attribution survey. Exams consisted of multiple choice items, ranging from the knowledge to analysis levels of complexity with an emphasis on application items. A table of specifications was created and provided to students for each test to make explicit the connection between instructional objectives and exam items. KR20 coefficients for the tests were .80 and .97, respectively. Scores were determined by computing the percentage of items answered correctly.
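As an illustration of the reliability coefficient reported above, the following is a minimal sketch of the standard KR-20 formula for dichotomously scored items. The function name `kr20` and the response matrix are hypothetical and are not taken from the study; the sketch also assumes the population variance of total scores, whereas some texts use the sample variance.

```python
from statistics import pvariance

def kr20(responses):
    """KR-20 for a list of examinee rows, each a list of 0/1 item scores."""
    n_examinees = len(responses)
    n_items = len(responses[0])
    # Variance of total scores (population variance is an assumption here).
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)
    # Sum over items of p * q, where p is the proportion answering correctly.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_examinees
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)

# Hypothetical data: five examinees, four items.
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(example)
```

Values closer to 1 indicate greater internal consistency among the items.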

Causal Attributions. The Revised Causal Dimension Scale (CDSII; McAuley, Duncan, & Russell, 1992) consists of an open-ended question ("Why do you believe you earned the grade that you earned on this exam?") followed by a 12-item scale with which students classify the reason. A semantic differential ranging from 1 to 9 is used to rate each item. The stem for each item is "Is the cause(s) something:" and a sample item is "That reflects an aspect of yourself" vs. "reflects an aspect of the situation." Four 3-item subscales provide measures of locus of causality, external control, stability, and personal control. McAuley, Duncan, and Russell (1992) provide extensive evidence of the reliability and validity of the scale. In the present study, alpha coefficients for exam 1 and exam 2 were .69 and .85 for locus of causality, .78 and .85 for external control, .66 and .77 for stability, and .89 and .82 for personal control, respectively.

Achievement calibration. Calibration is operationalized as students' accuracy in predicting and postdicting grades on the two exams and is the absolute difference between the actual grade and the predicted or postdicted grade.
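The operational definition above can be sketched in a few lines. The function name and the example grades are hypothetical; the only thing taken from the text is that calibration is the absolute difference between the actual grade and the predicted or postdicted grade, so smaller values mean more accurate self-assessment.

```python
def calibration_score(actual, estimate):
    """Absolute difference, in percentage points, between actual and estimated grade."""
    return abs(actual - estimate)

# Hypothetical student who predicted 85, postdicted 80, and earned 78.
prediction_calibration = calibration_score(78, 85)
postdiction_calibration = calibration_score(78, 80)
```

Because the difference is absolute, an overestimate and an underestimate of the same size receive the same score.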


Students completed the prediction survey and turned in the completed survey immediately prior to the administration of each exam. As a final question at the end of each exam, students reported their postdicted grade. Immediately following the return of the graded exams, students completed the causal attribution survey. Four calibration scores were derived for each student by comparing the predictions and postdictions to students' actual scores for exams 1 and 2 (e.g. Predicted exam 1 grade vs. Actual exam 1 grade). A mean calibration score, along with the standard deviation, was computed for prediction calibration (M=8.2, sd=4.6) and postdiction calibration (M=7.7, sd=3.9) for each student. Accuracy was determined on a relative basis with students within one standard deviation of the mean classified as average. Students in the upper tail of the distribution were classified as good and students in the lower tail were classified as poor calibrators. In order to identify students at the extremes of calibration accuracy, only students who were consistently poor and consistently good across prediction and postdiction accuracy were included in the comparative analysis of attributions for good versus poor calibrators.
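One way the relative classification described above might be carried out is sketched below. This is an assumption-laden illustration rather than the authors' procedure: it assumes that a smaller absolute difference means better calibration, so "good" is taken to be more than one standard deviation below the group mean error and "poor" more than one standard deviation above it. The function names and the error scores are hypothetical.

```python
from statistics import mean, stdev

def classify(errors):
    """Label each calibration error relative to the group mean and SD."""
    m, s = mean(errors), stdev(errors)
    labels = []
    for e in errors:
        if e < m - s:
            labels.append("good")      # assumed: small error = accurate
        elif e > m + s:
            labels.append("poor")      # assumed: large error = inaccurate
        else:
            labels.append("average")
    return labels

def consistent(pred_labels, post_labels, label):
    """Indices of students given the same label on both measures."""
    return [i for i, (a, b) in enumerate(zip(pred_labels, post_labels))
            if a == b == label]

# Hypothetical mean calibration errors for eight students.
pred_errors = [1, 8, 9, 7, 8, 20, 9, 8]
post_errors = [2, 7, 8, 8, 7, 18, 8, 7]

pred_labels = classify(pred_errors)
post_labels = classify(post_errors)
good_students = consistent(pred_labels, post_labels, "good")
poor_students = consistent(pred_labels, post_labels, "poor")
```

Requiring the same label on both prediction and postdiction mirrors the restriction to consistently good and consistently poor calibrators.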

Results and Discussion

To answer the first question, we computed means for the four subscales of the Revised CDSII. The descriptive statistics for each subscale are presented in Table 1. In general, students rated their attributions for performance to a greater degree as something over which they had control as opposed to something over which they had little control (locus of causality). The average student ratings for stability were 4.9 and 4.4, near the midpoint of the 1 to 9 scale, which is indicative of a dispersion of student ratings across the scale. Students were more polarized towards personal control, with mean ratings of 7.0 and 6.3. The lower mean for both exams for the external control subscale indicated polarization towards attributions associated with internal control. Therefore, the causal attributions for exam performance tended to be controllable by the student, with the locus of causality being within the student, while the degree to which students believed the cause to be a stable factor varied across students.

In response to our second question, students' average predicted grades differed from actual grades by 8.2 percentage points (sd=4.6) and postdicted grades by 7.7 percentage points (sd=3.9). The maximum difference from actual grades was 23 percentage points for predictions and 22 percentage points for postdictions. A paired-samples t-test indicated no difference between prediction and postdiction calibration accuracy, t(47)=.800, p=.428. Therefore, students predicted and postdicted their test grades equally well and, on average, these graduate students were approximately one half of a letter grade inaccurate in their predictions and postdictions.
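For readers unfamiliar with the paired-samples comparison, the test statistic can be sketched as follows. The data are hypothetical and do not reproduce the reported t(47)=.800; the function name is also an assumption. Only the statistic and degrees of freedom are computed, since a p-value would require a t distribution that the Python standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical prediction and postdiction calibration errors for four students.
prediction_errors = [8, 10, 6, 12]
postdiction_errors = [7, 8, 5, 8]
t_value, df = paired_t(prediction_errors, postdiction_errors)
```

Each student contributes one difference score, which is why the degrees of freedom equal the number of students minus one (47 for the 48 students in the study).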

Pearson product-moment correlations between exam grades and predictions and postdictions indicated improvement in calibration accuracy from exam 1 to exam 2. Correlations were stronger between postdictions and actual grades than between predictions and actual grades. Correlations between predicted and actual grades for exams 1 and 2 were .07 (p = .63) and .36 (p = .012), respectively. Correlations between postdicted and actual grades for exams 1 and 2 were .44 (p = .002) and .51 (p < .001), respectively. Thus, the relationship was significant for all but the exam 1 prediction.

To determine if calibration accuracy differed from prediction to postdiction for high-, average-, and low-achieving students, we divided students into high, average, and low achievement groups by classifying students 1 standard deviation or more above the mean as high achievers, students within 1 standard deviation of the mean as average achievers, and students more than 1 standard deviation below the mean as low achievers. The results of an ANOVA indicated differences among the three groups in calibration accuracy for all predictions and postdictions (see Table 2). Tukey post hoc analyses revealed that the differences were between low-achieving students and the other two groups for all four measures of calibration accuracy (see Table 3). See issue website.
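The group comparison above rests on a one-way ANOVA, whose F statistic can be sketched as below. The function name and the three groups of calibration errors are hypothetical and are not the study's data; the Tukey post hoc step is not shown.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: group size times squared mean deviation.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: squared deviations from each group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical calibration errors for high, average, and low achievers.
high, average, low = [1, 2, 3], [2, 3, 4], [5, 6, 7]
f_value, df_between, df_within = one_way_f([high, average, low])
```

A large F indicates that the group means differ by more than would be expected from the variation within groups.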

For our third question, we examined the attributions of students who were consistently poor (n=5) and consistently good (n=5) in their calibration accuracy. Students who were good calibrators tended to attribute their performance to studying (n=4) and their own behaviors. In addition, their attributions remained stable from exam 1 to exam 2. In contrast, poor calibrators attributed their performance to external factors and took no personal accountability for their performance. One person reported "luck" as the cause of their score while 3 students stated that they didn't know why they performed in the manner in which they did. None of the poor calibrators mentioned efforts that could be undertaken to enhance performance, even though 3 of the students scored in the C and D range on both tests. The average grade for the good calibrators was 89 for exam 1 and 90 for exam 2, while the average grade for poor calibrators was 78 and 72, respectively. Consistent with findings reported by Forsterling and Morgenstern (2002), more accurate calibrators achieved at higher levels than less accurate peers.


The present study found that students' causal attributions for exam performance tended to be controllable by the student, with locus of causality being within the student, while the degree to which the student believed the cause to be stable varied across students. In addition, calibration accuracy was significantly lower for lower-achieving students than for average- and higher-achieving students. Finally, good calibrators were more likely to take personal responsibility for their academic performance, while poor calibrators tended to attribute academic performance to external factors.

The findings of this study have a few important implications for education. First, determining students' calibration accuracy is relatively simple and may be easily obtained by classroom teachers. Identifying students who are poor self-assessors (miscalibrators) may help teachers identify students who hold maladaptive achievement beliefs. Second, students who are inaccurate in their grade expectations may also attribute their performance to causes that degrade future achievement motivation. Lastly, attributional retraining programs may be enhanced by helping students develop more accurate self-evaluations. However, any interventional research should consider the possible costs and benefits associated with modifying students' self-assessments, including methods to reduce deleterious effects on students' sense of self-worth or esteem. A possible solution is to avoid attempts to correct students' general ability to accurately self-assess, focusing instead on self-assessment for specific tasks and activities. In addition, well-defined, concrete tasks that are challenging, yet doable are most likely the best choices to use in efforts to improve accuracy.


References

Bandura, A. (1977). Self-efficacy: Toward a unifying theory of behavioral change. Psychological Review, 84, 191-215.

Bandura, A. (1995). Exercise of personal and collective efficacy in changing societies. In A. Bandura (Ed.), Self-efficacy in changing societies (pp. 202-231). Cambridge: Cambridge University Press.

Forsterling, F., & Morgenstern, M. (2002). Accuracy of self-assessment and task performance: Does it pay to know the truth? Journal of Educational Psychology, 94, 576-585.

Gabriel, M. T., Critelli, J. W., & Ee, J. S. (1994). Narcissistic illusions in self-evaluations of intelligence and attractiveness. Journal of Personality, 62, 143-155.

Garavalia, L., Ray, M., Murdock, T., & Gredler, M. (2004). A comparative analysis of achievement calibration accuracy in developmental and nondevelopmental college students. Research and Teaching in Developmental Education.

Hacker, D., Bol, L., Horgan, D., & Rakow, E. A. (2000). Test prediction and performance in a classroom context. Journal of Educational Psychology, 92, 160-170.

Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in recognizing one's incompetence lead to inflated self-assessments. Journal of Personality and Social Psychology, 77, 1121-1134.

McAuley, E., Duncan, T. E., & Russell, D. W. (1992). Measuring causal attributions: The Revised Causal Dimension Scale (CDSII). Personality and Social Psychology Bulletin, 18, 566-573.

Pajares, F. (1996). Self-efficacy beliefs in academic settings. Review of Educational Research, 66, 543-578.

Pajares, F., & Schunk, D. H. (2002). Self and self-belief in psychology and education: An historical perspective. In J. Aronson & D. Cordova (Eds.), Psychology of education: Personal and interpersonal forces (pp. 5-25). New York: Academic Press.

Schunk, D., & Pajares, F. (2004). Self-efficacy in education revisited. In D. M. McInerney & S. Van Etten (Eds.), Big theories revisited (pp. 115-138). Greenwich, CT: Information Age Publishing.

Seligman, M. (1990). Learned Optimism. New York: Pocket Books.

Weiner, B. (1986). An Attributional Theory of Motivation and Emotion. New York: Springer-Verlag.

Weiner, B. (2000). Intrapersonal and interpersonal theories of motivation from an attributional perspective. Educational Psychology Review, 12, 1-14.

L. Garavalia, University of Missouri--Kansas City

E. Olson, University of Missouri--Kansas City

S. Comeau, University of Missouri--Kansas City

Garavalia, Ph.D. is Associate Professor of Psychology in the College of Arts and Sciences; Olson and Comeau are doctoral students in the Counseling Psychology Ph.D. program in the School of Education.
COPYRIGHT 2005 Rapid Intellect Group, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2005, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Author:Comeau, S.
Publication:Academic Exchange Quarterly
Date:Jun 22, 2005