Collective biographies: how many cases are enough? A dispatch from the far side of 11,700 biographies of nineteenth-century teachers.

This essay grapples with a largely unacknowledged methodological problem in the corner of biographical writing known among some as "collective biography." Collective biography draws upon multiple biographies to reveal aspects of historical eras or movements that would remain invisible without that approach. The methodological problem addressed here is determining, at least loosely, what constitutes an adequate sample, a sufficient number of biographical cases, to warrant historical claims. My effort here will not provide a definitive number, by any means, but it may at least have the virtue of clarifying the issues and offering some cautions along the way. To get to that modest end, however, I must tell a story.

I recently published a history of the teachers who worked with the freed slaves, Schooling the Freed People: Teaching, Learning, and the Struggle for Black Freedom, 1861-1876. It was a follow-on to a study I published a number of years earlier about the freedmen's aid movement. One chapter of that earlier book dealt with those teachers who served in the schools for freed slaves. In that earlier book, I referred to the teachers, somewhat grandly, as "The Real Heroes of Their Age." (1)

When I went back to that earlier chapter a few years after publication, I was troubled by it. I was simultaneously troubled by three other books that came out at about the same time as my book that also dealt with aspects of African American education during the Civil War and Reconstruction. (2) All four of those books were sharply revisionist, rejecting virtually all the claims made by an earlier generation of southern white historians about the freedmen's education movement. Those earlier writers had been hostile to the movement and to the Yankee Schoolmarms who had, as one writer put it, "invaded" the South. They were certain that the teachers were meddlers, fanatics, and zealots who had intentionally and maliciously destroyed the strong, positive bonds that had existed during slavery between southern blacks and whites. Those earlier historians drew on W. E. B. Du Bois' portrayal, in The Souls of Black Folk, of the teachers as New England schoolmarms but, in contrast to Du Bois, imputed astonishingly negative characteristics to them. They characterized the freedmen's teachers as young (read: naive), privileged (read: haughty), single (read: homely spinster), schoolmarms (read: engaged in wage labor and hence never a lady) from New England (read: stiff-necked opponents of everything southern). They branded the teachers as abolitionists, about the worst name they could call anyone without violating the southern code of gentility. (3)

Our intrepid band of revisionists, completing our graduate degrees and pursuing our research through the heady years of the Civil Rights era and the student movement, understood the freedmen's teachers in ways diametrically opposed to the interpretations of those earlier historians. The teachers remained for us those same young single women teachers from New England, but we read them very differently. They were idealistic champions of the civil rights of former slaves, young proto-feminists, the forerunners of the Peace Corps volunteers of our generation. For us, abolitionism carried a very different freight from that assumed by southern white historians; for the generation of the 1960s and 1970s, abolitionism was one of the few authentically noble movements in a frequently sordid national narrative of slavery, racism, colonialism, adventurism, and nativism.

But when I revisited my 1980 story of the teachers, and considered the parallel narratives of the other revisionists, I was uneasy. For it did not take much thought about the sources upon which I and my revisionist comrades relied to realize that we used almost exactly the same historical sources that were consulted by those we intended to revise. And if two diametrically opposite interpretations could be wrung from the same sources, what did that say about either interpretive stance? Were the understandings we reached, the interpretations we urged, nothing more than the sentimentality of our two extraordinarily different generations? Our historiographic antagonists had grown up with academic and media portrayals of an antebellum South with contented black slaves, a refined white culture, a divinely ordained social order, and an antebellum North whose godless radicals and coarse culture opposed the South's peaceable kingdom, fomented a terrible war, and imposed a tyranny driven by carpetbaggers, scalawags, and black rule. We revisionists, on the other hand, had grown up in an unpeaceable kingdom of Cold War, McCarthyism, the Mississippi Freedom Summer, civil rights marchers, police dogs and fire hoses, and the youth movement; tyranny was epitomized by J. Edgar Hoover, the Pentagon, HUAC, George Wallace, neo-colonial military adventurism, and everyone over thirty. It was the world of Gone with the Wind versus the world of Mississippi Burning.

So my re-reading of the revisionists, myself included, began to trouble me. Are historical narratives inevitably just pale reflections of the hubris of succeeding generations? Or might another interpretation be possible with different methods? Would this particular historical narrative take on different dimensions if we sought a much broader range of sources than any of us had consulted to date? Would more data yield a different picture? And particularly, what would happen if we drilled down much deeper to discover who the teachers actually were?

That last question forms the crux of this paper. As I pondered the problem of the generational construction of interpretive positions, I began to wonder: do we even have the basic picture of the teachers right? Do we really know who they were as a group, or perhaps as distinctive groups, plural? What if our collective picture of them was fundamentally mistaken? To answer those questions, I set out to identify a large sample of the teachers and to discover as much about that sample as I could manage.

Thus was born the Freedmen's Teacher Project. (4) The project sought, at minimum, the teachers' names, the years they taught in the freed people's schools, and where they taught; an individual teacher was added to the database only if I had those three pieces of evidence. Beyond the minimum, I also sought information on the teachers' gender, race, birth year, marital status, their occupations before and after their time in the South, their parents' occupations, whether they taught with family members, the sources of their support while teaching, their educational level, evidence of abolitionism, home, military experience, if any, and religious affiliation. As new information was added, it became possible to track individual teachers across both geographical space and time. Simultaneously, I was seeking published and archival qualitative material on all of the teachers. I was, quite unintentionally, on my way to constructing a large collective biography of the teachers of freed people. (5)

After several years of work, I had identified close to 6,000 individual teachers. At that point, friends, colleagues, and particularly my wife told me I had all the information I needed. I had a good sample. Six thousand cases - six thousand individual biographies - is, after all, not an inconsiderable sample. Indeed, in one publication from those years I wrote, "The teachers thus far identified [i.e., 5,984 teachers] represent the majority of the total number of teachers [who taught in the southern black schools]." (6) In fact, I now know that I had found only roughly one-third of the total number. Still, how large a sample does one need?

In collective biography, when is enough enough? How many cases does it take to give us a strong sense that we know all we are likely to know? When does the law of diminishing returns make further work irrelevant to the findings one might generate? When, in the arcane language of the qualitative researchers, does a project reach saturation, the point at which further research is unlikely to change the conclusions one can reach? (7) Or, as my wife might have put it at the time, when does this just become obsessive-compulsive behavior instead of serious scholarship?

Now, I would like to be able to claim that when I began this collective biography of nineteenth-century teachers many years ago, I turned immediately to the literature on collective biography to be guided in my work and to know, from the outset, how large a sample I would need in order to draw reliable inferences and conclusions. I would like to be able to say that, but in fact I did not turn to that literature until relatively recently. Or, to be more precise (read: more honest), I turned to that literature when I thought it would be a good idea to talk with colleagues who do educational biography and get their sense of whether I have become obsessive-compulsive or have continued to be a rational scholar.

When I did turn to the literature on the methodology for collective biography, I was a bit relieved to read that, even as late as 2005, one methodologist was claiming that, despite a recent resurgence in collective biography work, "very little has been written to date about the method." (8) Well, that was a relief; at least no one could accuse me of ignorance of a hoary body of methodological knowledge. But as I read further, I became convinced, as most educational biographers may already know, that the field of collective biography is a methodological mess. Collective biography includes, at one end, those massive, one hundred-plus volumes of "National Biography" and "Who's Whos" and "Notable Women" that attempt to pose "arguments by example" but that, by the end, probably come closer to Shakespeare's "tale told by an idiot, full of sound and fury, signifying nothing." (9) It includes, perhaps not at another end of some spectrum but on an odd tangent from the spectrum, something that its devotees call "collective biography" but that should be more accurately considered collective autobiography, inasmuch as it entails an effort to get at interesting social phenomena through group analysis of the memories of members of the group. (10) The other end of the spectrum may be prosopography, the amassing of voluminous amounts of personal data on large numbers of individuals in order to characterize a group of historical actors. There is even a movement afoot by those doing large-scale quantitative analyses to arrogate the term "prosopography" to themselves, forbidding other historians and biographers from using the term. (11) In between those poles are all manner of narratives that entail more than one biography, though what makes most of them "collective" remains something of a mystery.

Worse yet, none of the methodological sources I found paid any attention to my question about sampling and saturation. So I turned from methodological treatises to books and articles that claimed to be collective biographies, but became even more baffled. One relied on a sample size of five individuals who, the author admits, "cannot be said to have formed a cohesive group" but who "did hold a number of qualities in common." (12) Another drew upon the collective characteristics of a dozen men, as found in their biographies, to identify "timeless principles" of how a society can "cultivate the types of leaders society desperately needs and craves." (13) In what sorts of historical-biographical work are five cases sufficient? When are twelve cases enough to warrant claims of "timeless principles"? What do we really know about how collective biography should sample? (14)

Finding nothing to help me, I decided that it might be heuristic, at least, to draw on my own work to see what difference sample size made, at least in the particular work I have been doing. When I first began reporting my findings, I had collected data on 5,350 teachers; two years later I had expanded the sample to just shy of 6,000; more than a decade later, I reported on a sample that had grown to 8,200. Perhaps I should have stopped at any of those points, but, dogged to the end, I pressed on. In 2010, I published the book I mentioned at the beginning of this paper that, among many other things, reported on findings regarding 11,672 individual teachers of the freed people. And still I did not stop. While I have only found 55 more individual teachers in the four years since the book was published, I have dug up further information on many of the individuals that I had already identified.

Did all that extra work matter, or, as many suspected, was it just a good excuse to spend hours at microfilm readers and the back tables of archives?

I will not burden this essay with the many tables I could develop showing the frequencies and percentages that I reported over the years. Let me, though, point to some salient findings that shed light on my question. As early as two decades ago, I was able to establish that African Americans made up a disproportionately large share of the teachers, despite the long assumption that the teachers were primarily young white women. I also claimed, incorrectly as it turns out, that black teachers remained in the southern schools about a half year longer than white teachers, on average. But the data continued to suggest that the corps of teachers was heavily northern and female. By 2003, with closer to 8,000 cases, I could report that nearly one-third of the teachers were black, a much higher proportion than I found earlier, but the northern complexion of the teaching force remained. I also noted in 2003, contrary to claims made as recently as 1979 by Jacqueline Jones, that women more often held positions as principals of southern black schools than did men, 128 to 99. (15)

What, then, changed with the increase in sample size to over 11,600, and the addition of several thousand more bits of data on all of the individual teachers? Perhaps most dramatic was finding that the teachers were not primarily northern at all. A majority were southerners, both black and white, and many of the southern white teachers in the freed people's schools were former Confederate soldiers, including not a few Confederate officers; just as ominous, many of those southern white teachers had been slaveholders before the war. Further, the number of black teachers had swollen remarkably. By 2010 I could report that over one-third of all of the freedmen's teachers who can be positively identified and who taught between 1861 and 1876 were African Americans. Further, the project had amassed evidence to establish firmly that thousands more African Americans were teaching during those years whose names may never be unearthed. Just as remarkable, the most recent sample revealed much more than earlier reports had about the average number of years teachers spent in the classroom over the fifteen years of the study. African Americans again came out on top, teaching on average twice as long as northern white teachers, and three times as long as southern white teachers. It also became apparent that the teaching force was not as overwhelmingly female as long thought. While white northern teachers were primarily women, by a ratio of two to one, the entire teaching force was almost exactly half men, half women. Meanwhile, the number of women identified as principals ballooned from 128 to nearly 200, versus only 130 men. (16)

So what have I learned about sample size in collective biographies? Most saliently, it seems clear that in this particular sort of research, no purposive sample would have revealed many of the most important conclusions the project has been able to reach. Even at 8,000 cases, two-thirds of the cases eventually located, the racial profile was obscure. It was clear that earlier historians, including the revisionists, had entirely missed the centrality of African Americans in their own intellectual emancipation. (17) However, the extraordinary dedication of those teachers, and their proportion of all teachers, fully one-third of those who can be identified and well more than half of all that can be surmised from current evidence, was still invisible. Even at 8,000 cases, the gender profile of the teachers favored women, when in fact the gender frequencies were nearly dead-even, with men outnumbering women by a narrow margin. At 8,000 cases, I was still reporting figures that were too low regarding the number of years spent in black classrooms by the various groups. Anything less than a study of as close to a one hundred percent sample as possible would have left us with an incomplete, misleading understanding of the first teachers to work with the freed people.

As noted earlier, since publication of Schooling the Freed People, research has continued, though the number of individual teachers has not changed substantially. So, here again is an opportunity to ask, when is enough enough? Has the additional research resulted in any significant findings?

In fact, while frequencies have shifted, in most cases percentages, means, and modes have shown little change. For example, between 2010 and 2014, I was able to determine race for 196 more individuals, 51 of whom were black, but with race known for more than 9,330, the percentages changed less than two percent. Further, the ages of over two hundred more teachers have been determined since the book was published, but the median ages for the whole sample have been affected only minutely.
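The arithmetic behind that stability can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a report of the project's data: it treats 9,330 as the exact number of teachers whose race is known after the 196 additions, and the prior black shares it tries (0.30, one-third, 0.36) are hypothetical stand-ins, since the essay gives no exact earlier figure. The function name `proportion_shift` is likewise invented for the example.

```python
def proportion_shift(n_after, k_added, p_before, q_added):
    """Change in a group's share of the sample after k_added new cases join.

    n_after  : cases with a known value after the additions
    k_added  : newly classified cases
    p_before : the group's share before the additions (assumed here)
    q_added  : the group's share among the new cases
    """
    n_before = n_after - k_added
    p_after = (n_before * p_before + k_added * q_added) / n_after
    return p_after - p_before


# Figures taken from the text: 196 newly classified teachers, 51 of them black.
# Assumption: race is known for exactly 9,330 teachers after the additions.
n_after = 9330
k_added = 196
q_added = 51 / k_added

# Assumption: illustrative prior shares, not the project's actual figures.
for p_before in (0.30, 1 / 3, 0.36):
    shift = proportion_shift(n_after, k_added, p_before, q_added)
    print(f"prior share {p_before:.3f}: shift of {shift * 100:+.2f} percentage points")

# Whatever the prior share, the shift cannot exceed k_added / n_after,
# because the two shares differ by at most 1.
max_shift = k_added / n_after
print(f"largest possible shift: {max_shift * 100:.2f} percentage points")
```

Algebraically the shift reduces to k(q - p)/n, where n is the total after the additions, so once the sample is large, a couple of hundred new cases can move a percentage only slightly unless the new cases differ wildly from the old. For the illustrative prior shares above, the shift is a fraction of a percentage point.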

On the other hand, the one variable that has changed in interesting ways with more research is relative wealth. To get at that issue, and thereby to get at social class indirectly, the project has gathered wealth data from the 1850, 1860, and 1870 censuses. The most recent research, adding census data to more than 200 more cases, did have an impact on findings and modifies the claims I made in the book. Between the data reported in 2010 and what I know now, the modal wealth of the families of northern white teachers did not change for 1860, but the median wealth for 1860 fell from $5,900 to $5,550 (n = 764); for 1870, the modal wealth of the families of northern white teachers rose modestly from $1,300 to $1,700, while the median wealth of northern white teachers in 1870 rose slightly from $5,810 to $5,990 (n = 932). While those numbers continue to put northern white teachers pretty solidly in the middle class, they do indicate that those teachers were moderately more privileged than I suggested in 2010. It remains significant, however, that fully one-third of the northern white teachers reported individual or family wealth of $200 or less, a number that has not changed between 2010 and 2014, confirming my sense that many came from circumstances that were less than privileged.

At the same time, the most recent data indicate that black teachers and southern white teachers were even poorer than I reported in 2010. The 1870 modal wealth for southern white teachers was a remarkably low $300, down from the $400 mode reported in 2010; the median wealth of southern white teachers in 1870 was $1,812 (n = 656). The modal wealth for black teachers in 1870 was zero, both as reported in the book and as found in the most recent data, but the median wealth reported in 2010 for black teachers, $896, was too high; the most recent data put the median wealth of the black teachers who served in the freed people's schools at $773 (n = 361). Thus, traditional accounts that assert that the northern schoolmarms in the South were from privileged homes may be marginally more accurate than I thought in 2010, but any characterization of the entire corps of teachers must confront the contrary reality: black teachers taught their freed brethren despite grinding poverty; southern white teachers taught their former bondsmen because of grinding poverty.

Not all collective biographies can hope to achieve a sample size of one hundred percent, of course. Those studying very large populations - all of the teachers in Oregon in 1930, say, or all of the secondary school principals in the Midwest from 1965 to 1985 - would be hard pressed to manage such a project. But this study does suggest strongly that the further away a collective biography is from including all possible cases, the more problematic the results will be. Any process of purposive sampling must be carefully developed and fully justified if the findings are to be taken seriously. Meanwhile, the Freedmen's Teacher Project will continue to find valuable evidence, particularly as it moves into its next iteration, though it may be approaching saturation. (18) On the other hand, it still may be the case that I am obsessive-compulsive.


Ronald E. Butchart

University of Georgia
