Verbal ability and teacher effectiveness.

In June 2002, U.S. Secretary of Education Rod Paige issued the annual report on teacher quality, Meeting the Highly Qualified Teacher Challenge (U.S. Department of Education, 2002). In this report, the secretary echoed and amplified the sentiments of several of the most vocal critics of traditional teacher preparation (Haycock, 1998; Finn, 2001; Walsh, 2001). Secretary Paige's message regarding teacher selection was that "the only measurable teacher attributes that relate directly to improved student achievement are high verbal ability and solid content knowledge" (U.S. Department of Education, 2002, p. 39). Secretary Paige goes on to suggest that "mandated education courses, unpaid student teaching, and the hoops and hurdles of the state certification bureaucracy" (p. 40) discourage the best potential teachers.

The secretary's report reflects one side of a battle among (a) the traditional gatekeepers of the teaching profession (colleges and the states); (b) the education reformists who would more closely regulate teacher education, most notably, those responsible for the reports of the National Commission on Teaching and America's Future; and (c) the deregulationists who wish to strip states, the National Council for the Accreditation of Teacher Education, and traditional teacher preparers of their monopoly on teacher preparation. The latter group, to which Secretary Paige pays special heed, sees little use for traditional teacher preparation programs, backs policies to deregulate teacher education and to offer school-based alternative routes to licensure, and supports a definition of a highly qualified teacher as one with strong verbal ability and good subject-matter background.

Secretary Paige's message could not have been clearer. He portrays traditional teacher education programs as obstacles to the goal of more highly qualified teachers. He seems to assume that only verbal ability and content knowledge matter. What most would assume are necessary ingredients of good teaching, he portrays as sufficient and seeks to exclude formal teacher preparation as unnecessary. Several educators have responded with alarm to these conclusions. Perhaps the most thorough rebuttal of Secretary Paige's claims and those made earlier by Walsh (2001) came in the December 2002 issue of Educational Researcher in an article by Darling-Hammond and Youngs. These authors take on all of the claims made regarding teacher preparation and teacher effectiveness. A barrage of claims and counterclaims has followed. Much of the contention has surrounded a few research studies (Ferguson, 1991; Ferguson & Ladd, 1996; Greenwald, Hedges, & Laine, 1996; Hanushek, 1971). Each side has used these studies to support its own position--sometimes, the same study is used to support contrary claims. This pattern seems characteristic of many battles in education--battles in which sides are taken on the basis of philosophy and ideology and in which "scientific" evidence is chosen because it props up one's predetermined viewpoint.

Our goals in this article are to discuss the relationship of verbal ability and teacher effectiveness, to review previous research, and to explore implications of choosing teachers on the basis of verbal ability. To make the case, we include a commonsense discussion of the relationship of verbal ability and teacher effectiveness and describe a research study that used a rigorous measure of teachers' verbal ability (the verbal test of the Graduate Record Examination).


Defining Verbal Ability

What is verbal ability? In general usage, verbal ability refers to a person's facility at putting ideas into words, both oral and written. This facility involves possessing not only a strong working vocabulary but also the ability to choose the right words to convey nuances of meaning to a chosen audience. Verbal ability also includes the ability to organize words in coherent ways. Verbal ability is a part of the traditional construct of intelligence, with most conventional intelligence tests measuring verbal ability, quantitative reasoning, and logical thinking. Verbal ability is usually demonstrated as the ability to write and speak well.

Teachers are, among other things, explainers. Explaining is usually a verbal process. Good teachers are also role models of appropriate speech and good writing. They must also be adept at understanding the verbal communication of their students and be able to help students improve their verbal skills. Teachers are generally expected to be able to do the following:

* clearly and cogently present information,

* give clear explanations,

* help students put their ideas into words,

* help students improve their communication skills,

* help students understand the meaning of written language,

* provide apt analogies to assist learning,

* communicate well with parents both in speech (be "well spoken") and in writing, and

* communicate effectively with administrators.

It requires no lengthy argument to convince one that verbal ability is related to good teaching.

There is a commonsense connection between verbal ability and teaching ability; therefore, it is not surprising to find some who advocate measures of verbal ability as important criteria for choosing prospective teachers. It would be silly to argue that verbal ability should not be considered in selecting teachers. Verbal ability is at once a proxy for general academic ability as well as the basis for communication, which is an essential part of teaching.


Interest in research on teachers' verbal ability is sparked by the debate between teacher educators who strongly support pedagogical training and those who claim that teacher characteristics--especially verbal ability--and subject-matter background are more important than pedagogical training.

Research on teacher characteristics, including verbal ability, was popular in the 1940s and 1950s. The Handbook of Research on Teaching includes a chapter on "The Teacher's Personality and Characteristics" (Getzels & Jackson, 1963). A bibliography by Domas and Tiedeman (1950) contained more than 1,000 titles on the subject. More recent work is surfacing under the heading of "dispositions," and important new research may be forthcoming. We focus our review on the most frequently cited work.

The review of research by Wilson, Floden, and Ferrini-Mundy (2001) focuses on subject-matter preparation, pedagogical preparation, clinical training, policy, and alternative certification programs. As in much recent research on teacher education, the characteristics of successful teachers are not included in their review.

The recent advocates of using verbal ability and content knowledge to choose teachers (i.e., Paige, Haycock, Walsh, Finn) do not give very specific guidelines for assessing verbal ability in potential teachers. Rather than discuss how to assess verbal ability in prospective teachers, they instead refer to research that allegedly supports a connection between verbal ability and teacher effectiveness. The following review examines this research.

One of the more recent studies of verbal ability and teaching effectiveness was similar to the one that will be described herein. This study was completed at the University of New Hampshire in 1996 (Andrew et al., 1996). This study looked at predictors of success in a full-year teaching internship. From a sample of 374 interns who had completed a full year of teaching, supervisors were asked to identify the most outstanding and the weakest interns. Fifty-eight (16%) were classified by supervisors as outstanding, and 42 (14%) were classified as weak. The intern population was nearly equally split into elementary and secondary teachers and covered a wide range of secondary subjects. Supervisors were asked to describe the factors that they thought made the interns either outstanding or weak; t tests were computed on interval data, comparing weak and outstanding interns on Graduate Record Examination (GRE) scores. As the authors point out,
   GRE verbal scores showed no significant predictive
   value. Of the ten students with GRE verbal scores below
   the 400 normal program cutoff, seven of the ten
   were judged to be outstanding interns (in the top
   13% of all interns). (Andrew et al., 1996, p. 276)

This study was limited by reliance on only one measure of teacher effectiveness (supervisor ratings) and by the limited scope of the GRE in assessing teachers' verbal ability.

Ehrenberg and Brewer (1994) argue that more selective colleges enroll students with higher verbal test scores. They link the selectivity of teachers' colleges to subsequent student dropout rates and infer a link between teachers' verbal ability and lower dropout rates among their students.

Secretary Paige and other advocates of the "verbal-ability argument" refer to a core sample of studies that draw primarily upon data gathered in the late 1960s through the 1980s and, most often, refer to two research studies, the famous Coleman et al. (1966) report and Ferguson's (1991) research using data collected in Texas and Alabama. The verbal-ability advocates argue that teachers' basic verbal abilities are highly correlated with student success and other measures of teacher effectiveness. Secretary Paige (U.S. Department of Education, 2002) writes,
   Ever since the publication of the Coleman report,
   studies have consistently documented the important
   connection between a teacher's verbal and cognitive
   abilities and student achievement. Teachers'
   verbal ability appears to be especially important at
   the elementary level, perhaps because this is when
   children typically learn to read. (p. 7)

The Coleman et al. (1966) study analyzed approximately 645,000 students in 3,100 schools in Grades 3, 6, 9, and 12. Verbal ability was equated with teacher vocabulary as measured by teachers' self-administered vocabulary tests. Researchers compared vocabulary scores with student achievement test scores. Three problems are evident. First, the Coleman report makes claims for teachers' verbal ability and student achievement only for African American students, not for the combined student scores. Second, the relationship emerges only in Grades 3 and 6. Third, the limited focus of the instrument the researchers used (vocabulary) casts some doubt on the soundness of making claims about the more complex concept of verbal ability as it applies to teachers.

Verbal-ability advocates also draw heavily upon Ferguson's examination of Texas and Alabama teachers' verbal scores on two independent tests (Haycock, 1998; Walsh, 2001). Ferguson and Ladd (1996) conducted an analysis of education and its production functions, seeking to link student outcomes as measured by test scores (and their changes over time) with a variety of school and teacher inputs such as student-teacher ratios, teacher abilities, and expenditures per pupil.

Ferguson's (1991) study analyzed Texas teachers' verbal ability as measured by the Texas Examination of Current Administrators and Teachers (TECAT), a test given to then-current teachers and administrators seeking recertification. (1) The TECAT purported to test literacy skills using responses to a reading. The exam consisted of items that asked test takers to do the following:

* identify the main idea (10 items);

* identify specific details (5 items);

* identify sequential steps (5 items);

* distinguish fact from opinion (10 items);

* draw inferences (5 items);

* use and select reference sources (10 items); and

* comprehend job-related vocabulary (10 items).

Ferguson compared teachers' TECAT scores, number of students in teachers' classes, teachers' experience levels, and degrees earned with reading and math scores of students as measured by the Texas Educational Assessment of Minimum Skills (TEAMS). Exams were given to 1st, 3rd, 5th, 7th, 9th, and 11th graders in 1985 to 1986, 1987 to 1988, and 1989 to 1990. Ferguson reported that the average teacher's TECAT score explained between 19% and 25% of the variance of each district's average TEAMS reading exam scores in all grades except the first. These data suggest a relationship between some measure of teacher verbal skills and student performance as measured on standardized tests, but the link is weaker than reported in the sweeping statements made by Paige and others. Of the variance, 75% or more was explained by other factors.
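The arithmetic behind the "75% or more" remark can be sketched briefly. The sketch assumes, purely for illustration, that the reported share of variance behaves like the square of a simple correlation; Ferguson's figures may come from a multivariate model, in which case the implied correlation is only a rough guide. The function name is hypothetical.

```python
import math

def implied_correlation(variance_explained):
    """Size of a simple correlation whose square equals the given share.

    Assumption (not from the source): the share is treated as r squared
    from a bivariate relationship.
    """
    return math.sqrt(variance_explained)

# The two ends of the range reported for TECAT and TEAMS reading scores.
for share in (0.19, 0.25):
    r = implied_correlation(share)
    print(f"explained {share:.0%}: |r| about {r:.2f}, unexplained {1 - share:.0%}")
```

Under that assumption, the range corresponds to correlations of roughly .44 to .50, with 75% to 81% of the variance left to other factors.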

Ferguson and Ladd's (1996) examination of Alabama fourth-grade students and their teachers' ACT exam score results is often cited as strong evidence for the impact of teachers' verbal ability on student learning. The researchers conclude that
   the skills of teachers as measured by their test scores
   exert consistently strong and positive effects on student
   learning despite the fact that the data are limited
   and test scores are an imperfect measure of
   teachers' skills. (p. 288)

In this research, the ACT scores were available for only one quarter of the students' teachers. To create usable data, the researchers averaged the scores available for the teachers in particular schools. In addition, district ACT averages were available on just over half of the teachers who had fewer than 5 years of experience and "only for a much smaller proportion of more experienced teachers" (Ferguson & Ladd, 1996, p. 281). The researchers used a regression analysis to predict what the other teachers' ACT scores might have been had the data been available. Additional problems were encountered with student test data. The researchers had to construct testing data over time because they did not have a consistent test, one given in the third and again in the fourth grade, upon which to establish their value-added argument. In their second effort to establish the relationship between teacher verbal ability and gains in test scores between fourth- and ninth-grade students, the researchers compared two different sets of students.

   One can view these as the scores of the younger
   brothers and sisters of the current eighth-grade
   students and, therefore, as proxies for the scores
   that the current eighth and ninth graders would
   have earned when they were in the third and fourth
   grades. (Ferguson & Ladd, 1996, p. 281)

Two studies using the same data set from the 1970s further emphasize the complexity of proving a relationship between verbal ability and teacher effectiveness. Hanushek (1992) linked teachers' verbal ability and experience to reading test-score gains among mostly low-income African American students. Hanushek controlled for student background characteristics and, again, found that teachers' literacy scores (as measured by a short word test) positively influenced student reading scores. The same data set was used by Murnane and Phillips (1981) to reveal a negative relationship between teachers' verbal scores and students' scores on the Iowa Test of Basic Skills.

Hanushek (1971) conducted considerable earlier research to analyze the effects of various inputs (school, teacher, and community) on student learning. His examination of teacher characteristics indicated that verbal ability as measured by the Quick Word Test 2 explained some of the variation in second- and third-grade students' scores on Stanford Achievement Tests (SATs). Hanushek's conclusions are constrained by two problems. First, he attempts to link teachers' verbal ability to student achievement with one-shot glimpses of both teachers and students. Without baseline data with which to assess student gains in achievement over time, he is unable to make a neat link between the SAT results and the scores of the students' present teachers. Second, like other pen-and-paper tests, the Quick Word Test 2 measures only a limited aspect of verbal ability. It likely measures some aspect of literacy, but one cannot reasonably equate it with verbal ability.

Webster (1988) reported a strong relationship between teachers' verbal scores as measured by a self-created test, the Personnel Services Test (PST; based on the Wesman Personnel Classification Test), and a measure of instructional behavior created by the Dallas Independent School District. Although Webster reports a strong relationship between the test and teachers' instructional abilities, he also reported "no consistent relationship ... between quality of instruction as measured by CARCS [Class Average Residualized Composite Score] and standardized test achievement" (p. 249). (2) Webster admits that the PST "required minimal cognitive processing" (p. 246) and consisted primarily of analogies. The measurement of instructional behavior, the dependent variable in this study, was based on Gagne's theory of pedagogy, a pedagogical model that identifies quite specific teaching behaviors as contributing to student learning. The significant relationship reported relied upon participant numbers between 5 and 11.

Schacter (2000) of the Milken Family Foundation points to the meta-analysis conducted by Greenwald, Hedges, and Laine (1996). He claims that their work "demonstrated that teacher verbal ability exerts a positive and significant effect on student achievement in 30 of the 60 studies examined" (p. 4). Other proponents of the verbal-ability argument often cite this study as an indication of the strength of the research supporting their position (Walsh, 2001). Examining this meta-analysis of 60 "primary research studies" reveals that Greenwald et al. (1996) analyzed only one source on teachers' verbal abilities: "The teacher ability finding is based on a single publication, Ferguson, 1991" (p. 371). Greenwald et al. state that the data are thus too limited to support conclusions, and they therefore exclude the finding from their analysis. Coincidentally, they conclude that teacher education is one of three variables that does show a "very strong" relationship to student achievement (p. 17). This result is called into question in a critique by Ballou and Podgursky (2000), which seeks to discredit the teacher education reformers responsible for the National Commission on Teaching and America's Future reports.

How one interprets the research on teachers' verbal ability and teaching effectiveness is open to debate. Wayne and Youngs (2003) provide a comprehensive analysis of the research and summarize many of the studies reviewed herein. As we have noted, these studies use old data sets and limited measures of teacher verbal ability. Wayne and Youngs conclude that these studies do not lead to "clear conclusions" (p. 100) but that test scores of teachers more positively affect student learning if college ratings (where teachers were prepared) are not taken into account.

We find that the often-quoted studies reveal the complex nature and limitations of the research linking teachers' verbal abilities and student performance. We do not wish to argue that the previously reported data are without merit. Our point is that the national efforts to simplify good teaching as the product of verbal ability and subject-matter knowledge are predicated upon studies whose data do not support the sweeping claims often made. Our conclusion is that there is some positive relationship between measures of teachers' verbal ability and teacher effectiveness but that the relationship is overstated in both the literature and the policy discussions that rely on this body of literature. We are left to rely on our common sense, that teachers' verbal ability surely is a necessary ingredient of good teaching. Teachers spend the bulk of their daily interactions with students communicating in a variety of ways, some written, some through body language, and most often through telling, discussing, directing, listening, explaining, and questioning. Having a certain level of verbal ability is obviously a necessary part of a teacher's vital skill set. The study that follows, though suggesting a possible threshold of necessary verbal ability, challenges the conclusion that single measures of verbal ability are sufficient to predict teacher effectiveness.



Context of the Study

The University of New Hampshire's 5-year and postbaccalaureate teacher education programs for elementary and secondary teachers are designed both for undergraduates who choose to teach (at any point in their undergraduate years) and for graduates who may be career changers. Either route requires, at the outset, a subject-field major and a semester of classroom teaching to give a base of experience, a platform for thoughtful career choice, and evidence for possible admission to the remainder of the program. Undergraduates who pass this first experience with positive recommendations may begin professional core courses while they complete their subject-field major and other baccalaureate requirements. Students who enter as graduate students may take courses in a special summer program or during the academic year. Both groups must do a full-year internship. Both groups must complete a minimum of five professional core courses.

Graduates are followed by survey at the end of 1 year and 5 years. Approximately 90% enter teaching (for at least 1 year), and approximately 86% of the total number of graduates are still teaching in 5 years. Approximately 170 teachers graduate from these programs each year.


Our sample consisted of 116 interns enrolled in a 5-year or postbaccalaureate teacher education program at the University of New Hampshire. Admission to the program is based on multiple factors and is selective (see Table 1 for mean GRE subtest and undergraduate grade point average [GPA] scores). Participants represented a wide array of undergraduate majors (e.g., math, sciences, foreign language, English, family studies, history, kinesiology). Just over 40% of the participants were seeking elementary certification; the remainder were pursuing certification in a variety of secondary fields. All participants were in the final stage of their master's degree program, which requires the completion of a yearlong internship with an experienced teacher and a university supervisor. Nearly all were required to submit verbal, quantitative, and analytical scores on the GRE as a prerequisite to admission. (3) Table 1 shows verbal (GRE-V), quantitative (GRE-Q), and analytical (GRE-A) scores on the GRE and undergraduate GPA for the sample.

Measures and Data Collection

The study was designed to assess the value of verbal ability as a predictor of teaching ability. Descriptions of the variables follow.

Verbal ability. In this study, we measure verbal ability using the GRE-V. This measure was chosen in part because it is consistent with the rigorous, standardized types of measures that are being proposed to make decisions about teachers, students, and teacher education programs. The GRE is a far more rigorous measure than the Praxis I or other general assessments of teacher candidates' literacy that are commonly used. It is more rigorous than the high-school-level ACT and far more comprehensive than the self-administered vocabulary tests used by Coleman et al. (1966) or the TECAT used by Ferguson (1991). As many have pointed out, basic literacy tests for teachers are often aimed at high school levels of proficiency. The GRE-V is a specific test of verbal ability of college graduates (baccalaureate), and it is used as a predictor of academic success in graduate-level education. The Educational Testing Service (ETS), which produces the GRE, reports that the verbal section "tests the ability to analyze and evaluate written material and synthesize information obtained from it, to analyze relationships among component parts of sentences, and to recognize relationships between words and concepts" (ETS, 2003, p. 4). Scores from the analytical and quantitative sections of the GRE were also collected.

Teacher performance. Supervisors use a common set of performance goals and expected outcomes in evaluating their interns' performance over the course of the internship. More specifically, the seven major goals of the teacher education program serve as the criteria for assessment. Table 2 illustrates the outcomes and performance indicators for one of the seven goals. Through the use of the Mid-Semester Assessment Worksheet, supervisors have a common and thorough understanding of key performance characteristics for educators. Biweekly meetings of supervisors focus on developing a shared understanding of these criterion measures.

At the end of the school year, intern supervisors were sent a letter asking them to rate the performance of their interns as either acceptable, good, very good, or outstanding. To clarify the rating categories and avoid the Lake Wobegon effect (where everyone is considered "above average"), the letter suggested that the performance level of interns is likely to be normally distributed. For instance, one sixth of all interns probably fall in the acceptable category, one third in good, one third in very good, and one sixth are deserving of an outstanding rating. (Unacceptable or failing interns [1%-2%] had already been removed from the program at the time of this study.) We consider the professional judgments of teacher educators to be one of the more reliable and valid methods of assessing teacher performance.
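The suggested shares of one sixth, one third, one third, and one sixth are close to what a normal distribution gives when the four categories are separated at one standard deviation below the mean, at the mean, and at one standard deviation above it. The following sketch checks that arithmetic; the cut points are our assumption for illustration and were not specified in the letter to supervisors.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Assumed category boundaries in standard deviation units (hypothetical).
cut_points = [-1.0, 0.0, 1.0]
labels = ["acceptable", "good", "very good", "outstanding"]

# Cumulative probability at each boundary, padded with 0 and 1 at the ends.
edges = [0.0] + [standard_normal.cdf(c) for c in cut_points] + [1.0]

# Share of the distribution that falls between consecutive boundaries.
shares = [upper - lower for lower, upper in zip(edges, edges[1:])]

for label, share in zip(labels, shares):
    print(f"{label}: {share:.1%}")
```

With these cut points the outer categories each hold about 15.9% of cases and the inner categories about 34.1%, which is near the one-sixth and one-third guidance.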

Two thirds (68%) of the 116 interns received a rating. Table 3 presents the distribution of scores, which is imbalanced toward the positive end but, nevertheless, bears some resemblance to a bell shape. The modal rating was very good (44%), and the least occurring rating was acceptable (9%).

Analysis and Results

Our main task was to explore the relationship between verbal ability (as measured by GRE-V scores) and teaching ability (as measured by supervisor ratings of teacher interns). Several modes of data analysis were employed, including graphical and mean score comparisons. We begin with the correlational analysis.

The Pearson product-moment correlation between intern performance ratings and GRE-V scores was small and statistically nonsignificant (r = .134; p = .247; n = 76). We also computed correlations between the other GRE subtests and our intern rating measure. The GRE-A subtest exhibited a statistically significant correlation (r = .30; p = .009; n = 75). The correlation between the GRE-Q and intern rating was small and did not reach the (.05) standard level of significance (r = .20; p = .084; n = 76). It is worth noting that among all three GRE subtests, the GRE-V demonstrated the weakest relationship with teacher performance.
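For readers who wish to see how a correlation and its sample size translate into a significance level, the following is a minimal sketch using the standard t transformation of r with n - 2 degrees of freedom. It is not the software used in the study. The function names are hypothetical, and the tail area is obtained by simple numerical integration so that only the Python standard library is needed.

```python
import math

def correlation_t(r, n):
    """t statistic for testing a Pearson r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of Student's t distribution, computed on the log scale."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson's rule on the interval [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # area between 0 and |t|
    return max(0.0, 1.0 - 2.0 * central_area)

# Values reported above for the analytical and quantitative subtests.
for label, r, n in [("GRE-A", 0.30, 75), ("GRE-Q", 0.20, 76)]:
    t = correlation_t(r, n)
    p = two_tailed_p(t, n - 2)
    print(f"{label}: t = {t:.2f}, df = {n - 2}, p = {p:.3f}")
```

Run on the rounded values of r, the sketch gives p values close to the reported .009 and .084; small differences are expected because r is reported to only two decimal places.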

Mean GRE-V scores for each rating category are shown in Table 4. On average, participants in the upper three rating categories scored substantially higher on the GRE-V than those in the acceptable group. However, a one-way analysis of variance revealed no statistically significant differences among groups (F(3, 69) = 1.958; p = .128). The lack of statistical significance is likely due to the small number of participants rated in the lowest category of acceptable; cells of such small size weaken statistical power.
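The F ratio in a one-way analysis of variance compares variation among group means with variation within groups. Its degrees of freedom are the number of groups minus 1 and the number of scores minus the number of groups, so the reported 3 and 69 correspond to four rating categories and 73 scores. The sketch below shows the computation. The scores in it are invented for illustration and are not the study's data, and the function name is hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    scores = [x for group in groups for x in group]
    grand_mean = sum(scores) / len(scores)
    means = [sum(group) / len(group) for group in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Invented GRE-V scores, for illustration only (NOT the study's data).
hypothetical_scores = {
    "acceptable": [400, 430, 470],
    "good": [380, 450, 520, 590, 640],
    "very good": [340, 480, 530, 560, 600, 650],
    "outstanding": [390, 500, 540, 610],
}

f_ratio, df_b, df_w = one_way_anova(list(hypothetical_scores.values()))
print(f"F({df_b}, {df_w}) = {f_ratio:.3f}")
```

A group with very few members, such as the invented acceptable group here, contributes little to the between-group sum of squares, which is one way to see why small cells weaken the test.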

GRE-V scores for each rating category are presented graphically by way of box and whisker plots in Figure 1. Side-by-side box plots afford the direct comparison of score distributions. Each box represents the interquartile range, which contains the middle 50% of scores in the distribution. The whiskers portray the range of scores toward the tails of the distributions. There is relatively more variability in GRE-V scores for those interns rated as good. Each group, however, exhibits a wide range of variation in verbal scores, which casts doubt on any robust relationship between verbal acuity and teacher ability (as measured in this analysis). Particularly noteworthy are the low GRE-V scores evidenced by interns in all four groups.


A secondary question that emerged during the study was whether subject-area discipline makes a difference in the verbal-teaching ability relationship, or more precisely, if it mediates the relationship. We explored this question by identifying and separating those among the sample who majored in math or science-related areas and those who majored in English or humanities fields. We were able to identify 20 math/science interns and 31 English/humanities interns. Identification was rather straightforward. For example, entomology, math, and biology majors clearly fell in the math/science group, whereas English and history majors clearly fell in the English/humanities group. An independent-samples t test revealed no statistically significant differences in mean GRE-V scores between the math/science and the English/humanities groups (t = 1.20; df = 49; p = .235). A significant difference was found between groups when GRE-Q was the dependent variable (math/science M = 638.00; English/humanities M = 499.35; t = 5.63; df = 49; p < .001).

Methodological Limitations

This study is confined to a population of teachers at one institution. Students selected for teaching at this state university are above the norm of teachers nationally with regard to verbal ability, and the sample size is limited. We would like to see similar studies conducted with a larger, more representative sample of teachers in the United States.

One may question the validity of the GRE-V as a measure of teachers' verbal ability. The wide range of GRE-V scores evidenced in every performance category in this study may not necessarily confirm the lack of relationship between verbal acuity and teacher ability. It may, instead, be indicative of an unreliable method of measuring teachers' verbal ability. There is some disagreement in the literature regarding the predictive validity of the GRE subtests. For instance, Williams and Sternberg (1997) argue that the GREs are poor predictors. In contrast, Kuncel, Hezlett, and Ones (2001) completed a meta-analysis of several studies and found strong support for GREs as predictors of graduate-school performance. Most studies, however, address the predictive validity of the GRE for graduate-school performance, not its validity for assessing acuity in verbal, quantitative, and analytic domains. Other quantitative measures of verbal ability might include the SAT-V, or the ACT, but these have specific predictive purposes and do not necessarily measure verbal ability as employed by teachers.

The argument put forth by critics of teacher education with regard to the predictive power of verbal ability (predicting good teachers) doubtless relies on standardized exams such as the GRE-V and SAT-V. For this reason, we wished to examine a standardized test for this study. The GRE-V should satisfy the demands for high standards in teacher selection tests and seems to be a more comprehensive and more rigorous measure of verbal ability than any of the measures used in other studies and far better than the self-administered vocabulary tests used by Coleman or the now-defunct TECAT used by Ferguson (1991) in the Texas study.

Our research is also based on the assumption that ratings of intern performance are a reasonable proxy for performance of teachers in a regular job setting. Our measure does not explicitly tell us how effective teachers are with their (future) pupils; admittedly, measures of subsequent pupil performance are absent from this design.

That all said, the interns in this study spend a full year in a classroom with considerable teaching responsibilities--cooperating teachers turn their entire classrooms over to an intern for several weeks of the year and work with the interns on a daily basis. Here, we have used the judgments of professional experts--trained teacher supervisors--to measure teacher effectiveness. These specialists have spent a year observing their interns and dialoguing with cooperating teachers (i.e., intern mentors) and the interns themselves to make determinations about performance along several predetermined categories. We believe that the considered opinion of professional experts, at least within the structure of this particular teacher education program, is a valid and reliable measure of teaching performance. We recognize there are other problems associated with an outcome measure such as faculty judgments of intern performance. Teacher education faculty may maintain strong, particular ideological beliefs about teaching and learning, and they may be predisposed to look only for qualities consistent with these beliefs during the intern-rating process. This is problematic only to the extent that such beliefs are not also shared by interns who are, in turn, penalized for it (e.g., ratings are negatively biased against those who exhibit techniques that are presumably anathema to constructivist dogma) but are, in actuality, effective teachers.

Many would prefer to see measures of student work included in a study of this sort. Still others might consider only standardized tests of student achievement as measures of teacher ability. Presumably, a truer measure of teacher effectiveness would be predicated on the effects on pupils.

One must recognize the significant challenges posed when attempting to link teacher practices to pupil performance. If one subscribes to the admittedly simplistic "inputs-process-outputs" causal model, any attempt to link teachers' verbal ability (an input) directly to pupil performance (an output) omits the causal processes presumably influencing pupil outcomes. (Consider, for instance, the Tennessee Value-Added Assessment System, which basically subscribes to an "inputs-outputs" framework.) Such models rely heavily on the epistemological underpinnings of positivistic causal models.


Our study provides no conclusive evidence of a relationship between verbal ability and teaching ability. However, had more interns been rated in the lowest category (acceptable), it is likely that a relationship would have been found for that category, showing that very low verbal scores are correlated with low performance.

One must also consider that the sample of teachers in this study (full-year interns) represents a select group. These teachers were, for the most part, undergraduates from a fairly selective state university where screening for general academic ability, including verbal ability, has already taken place. (4) They are also teachers who have been successful in admission to graduate school, a process that selects above-average baccalaureate students. However, the data demonstrate an important point: There is a wide range of verbal ability, as measured by a single test, within the group of good to outstanding teachers. The distribution of GRE-V scores among good, very good, and outstanding teachers is very wide with many teachers showing scores below 400. In fact, the lowest individual scores among all 116 teachers were found in the very good category of teachers. This group also had the highest median GRE-V score. It is also interesting to note that the second lowest median GRE-V score was found for the most outstanding teachers. Two important points emerge: (a) Many good, very good, and outstanding teachers would be screened out of teaching if a single cut score were used (in this case, the GRE-V = 400), and (b) there must be other criteria to consider in predicting acceptable, good, very good, and outstanding teachers.

Surprisingly, there was no significant difference in verbal scores between math/science and English/humanities teachers. There was a significant difference in GRE-Q scores (p < .001), which leads one to caution that general tests used to predict teaching ability across a wide range of subject areas may be suspect because they may favor teachers in certain subject areas. Nonetheless, the GRE-A test proved to be correlated with teaching ability across all subjects and grade levels. This test measures analytical and logical thinking, the ability to sort relevant from irrelevant details in a problem situation, and the ability to make reasonable choices given a wide range of inputs (Duran, Powers, & Swinton, 1987). These abilities may be helpful in predicting teaching effectiveness because they relate to the good judgment that all successful teachers must use when operating in the complexity of the modern public school classroom. (Ironically, the analytical portion of the GRE has now been scrapped by ETS in favor of a writing test.)

The relatively low GRE verbal scores and the narrow distribution of verbal scores among the lowest performing category of teachers (acceptable) do lend support to the conclusion that verbal ability must be carefully considered in selecting prospective teachers. As stated earlier, a larger sample of acceptable teachers would likely have produced a significant correlation: The weakest teachers, on average, have the lowest verbal scores.


Good teaching is a complex interaction of a wide range of teacher characteristics, abilities, dispositions, knowledge of subject fields, experience, and pedagogical knowledge. These factors interact with particular school cultures, particular sets of educational goals, and particular children to produce effective teaching. Amid all of this complexity, it is possible to mount a logical argument for the importance of teachers' verbal ability for teacher effectiveness. On one hand, verbal ability is a proxy for general intelligence and academic skills. It is also a critical ingredient in the complex act of teaching. On the other hand, verbal ability, by itself or in conjunction with subject knowledge, is not sufficient for predicting teacher effectiveness.

A sound program for selecting prospective teachers should take verbal ability into account. Quantitative measures may be of some help in estimating verbal ability, but due to the many limitations of available standardized measures, the variation in ability of individuals to perform on standardized measures, and the clear existence of many other variables in determining good teaching, it is foolhardy to use single measures of verbal ability to predict teacher effectiveness. It is indefensible to use cut scores on standardized measures of verbal ability to include or exclude candidates for teaching. No simple measures are available to predict teacher effectiveness. A system of multiple measures is the only appropriate strategy for selecting prospective teachers.

The research reported here informs the decision-making process for teacher-candidate admissions at the University of New Hampshire. We hope it may expand the discussion of appropriate procedures or measures for selecting teachers for America's schools. The research supports the flexible use of multiple measures to predict teaching success. Although verbal ability is one important variable, it takes a range of measures of verbal ability to get a reasonable fix on verbal abilities relevant to teaching. Verbal abilities are necessary to good teaching, but they are not sufficient.


Michael D. Andrew

University of New Hampshire

Casey D. Cobb

University of Connecticut

Peter J. Giampietro

University of New Hampshire


(1.) The Texas Examination of Current Administrators and Teachers was subsequently abandoned by the state of Texas as a recertification exam. The 1986 data set is the only one available for analysis.

(2.) Class Average Residualized Composite Score is the data produced by the Dallas Independent School District's self-created measurement of instructional behaviors.

(3.) Applicants having completed advanced degrees are exempt from this requirement.

(4.) Peterson's Guide lists the University of New Hampshire as "moderately challenging."

Michael D. Andrew is a professor of education and director of teacher education at the University of New Hampshire.

Casey D. Cobb is an assistant professor of education policy and director of the Center for Education Policy Analysis at the University of Connecticut. He currently serves as president of the New England Educational Research Organization.

Peter J. Giampietro is a graduate student in the College of Education at the University of New Hampshire.

TABLE 1 Scores on GRE Subtests and UGPA, 2002-2003 Cohort

          n    Minimum   Maximum      M        SD

GRE-V   114     310       740      448.42    81.21
GRE-Q   114     260       770      525.04   109.57
GRE-A   113     360       800      563.45   103.45
UGPA    116       1.88      3.91     3.23     0.38

NOTE: GRE = Graduate Record Examination; GRE-V = Graduate
Record Examination-Verbal; GRE-Q = Graduate Record
Examination-Quantitative; GRE-A = Graduate Record
Examination-Analytical; UGPA = undergraduate grade point
average.

TABLE 2 Goal 1 of the Seven Major Goals and Outcomes for Teacher
Education Interns at the University of New Hampshire

Goal 1: Our graduates will be knowledgeable in the subjects they teach
and how to teach those subjects to students.

Outcomes: Our graduates are able to do the following:

   (a) demonstrate depth of knowledge and recognize how knowledge in
       their subjects is created, organized, and linked to other
       disciplines;

   (b) demonstrate specialized knowledge of how to teach subject
       matter to their students; and

   (c) use multiple approaches to facilitate student learning.

Ways in which the interns might address Goal 1 are as follows:

   (a) make connections between their academic course work and the
       curricula in place at their internship schools;

   (b) demonstrate the depth and breadth of their content knowledge in
       preparing and developing educational experiences for their
       students;

   (c) collaborate with their cooperating teachers to develop a range
       of strategies for planning and teaching;

   (d) observe and converse with master teachers in a variety of

   (e) use current technology as part of their research and planning
       processes; and

   (f) create and analyze a videotape of their teaching.

TABLE 3 Supervisors' Performance Ratings of Their Interns (n = 79)

Rating        Frequency   Percentage

Acceptable        7           8.9
Good             19          24.0
Very good        35          44.3
Outstanding      18          22.8
Total            79         100

TABLE 4 Graduate Record Examination-Verbal
Scores by Performance Rating

Rating        n      M       SD      SE

Acceptable     6   383.33   39.83   16.26
Good          17   465.99   98.43   23.87
Very good     32   449.38   64.40   11.39
Outstanding   18   456.67   69.20   16.31
Total         73   449.59   74.97    8.77
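
The summary statistics in Table 4 are enough to re-derive two quantities a reader may want to check. The sketch below, which is our illustration and not the authors' reported analysis, recomputes each standard error as SD divided by the square root of n and then runs a one-way analysis of variance from the group summaries. The choice of a one-way ANOVA is an assumption; this section does not say which test the authors used. The names `TABLE_4`, `standard_error`, and `anova_from_summary` are hypothetical.

```python
import math

# Summary statistics copied from Table 4: (rating, n, mean GRE-V, SD).
TABLE_4 = [
    ("Acceptable", 6, 383.33, 39.83),
    ("Good", 17, 465.99, 98.43),
    ("Very good", 32, 449.38, 64.40),
    ("Outstanding", 18, 456.67, 69.20),
]


def standard_error(sd, n):
    """Standard error of a group mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)


def anova_from_summary(groups):
    """One-way ANOVA F statistic computed from group n, mean, and SD.

    Assumption: a one-way ANOVA is a reasonable way to compare the four
    rating groups; the article does not name the test it used.
    """
    total_n = sum(n for _, n, _, _ in groups)
    grand_mean = sum(n * mean for _, n, mean, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for _, n, mean, _ in groups)
    # Within-group sum of squares, recovered from each group's SD.
    ss_within = sum((n - 1) * sd ** 2 for _, n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    for rating, n, mean, sd in TABLE_4:
        print(f"{rating:<12} SE = {standard_error(sd, n):.2f}")
    f_stat, df1, df2 = anova_from_summary(TABLE_4)
    print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

With the table's values, F comes out near 2 on (3, 69) degrees of freedom, short of the roughly 2.7 that the conventional .05 level would require. That is consistent with the finding of no conclusive overall relationship, even though the acceptable group's mean is visibly lower.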
