Free variation and other myths: interpreting historical English spelling.


The paper considers the interpretation of orthographic variation in Middle English texts, focussing on the question to what extent it is justifiable to use such variation as phonological evidence. It is suggested that all written variation, except when directly conditioned by orthographic context, is the result of clashes between two or more linguistic systems. This hypothesis is tried out on a text with notoriously variable spelling: the version of Lazamon's Brut found in London, British Library Cotton Caligula A ix.

1. The problem: Making the dead speak

This paper is an attempt to combine two pursuits that have so far largely been kept separate: the reconstruction of early English phonology and the study of Middle English texts in the tradition of the Linguistic Atlas of Late Mediaeval English (McIntosh, Samuels and Benskin 1986, henceforth LALME). The LALME methodology is based on studying the written language in its own right, without reference to the spoken mode; this restriction is essential for the initial analysis and classification of Middle English scribal texts. However, if we wish to produce an account of the Middle English language, including phonology, that takes advantage of the enormous advances in the field made possible by LALME, it becomes necessary to combine the two approaches. It is such a situation, relating to work on the Middle English Grammar Project, (1) that underlies the concerns of the present paper.

Grammars of Middle English, especially comprehensive ones meant for scholarly reference, have not appeared in large numbers. One reason is, it may be assumed, the linguistic variability of Middle English. While variation between texts makes the period an interesting one for linguistic study, it also makes writing a grammar a complex task. The variation within texts has been seen as an even more fundamental problem. As most Middle English texts were copied and recopied by scribes, it has been common to assume a kind of Chinese Whispers effect, every copyist contributing to an even more complex mixture. (2)

To produce a Grammar of Middle English at the present time is a task very different from that faced by earlier grammarians. While modern database technology does not speed up the collection of data, it makes possible much more powerful ways of searching and analysing them. In addition, there are numerous giants on whose shoulders we can stand, and important methodological advances have made possible a more efficient use of the available data. These advances are, above all, connected with LALME and its daughter project, the Linguistic Atlas of Early Middle English (henceforth LAEME). (3)

The methodology developed in connection with LALME has revolutionised Middle English studies in at least two respects. First of all, it has radically changed ideas about scribal contamination. It was shown by Angus McIntosh (1963 [1989], 1973 [1989]: 61) and, later, in a seminal account by Benskin and Laing (1981) that Middle English scribes were capable both of copying quite faithfully and of translating into their own dialect. At the same time, the development of variationist linguistics has changed ideas about variability, making it a natural and analysable aspect of language.

Secondly, the LALME methodology was based on the direct study of written language in its own right. This was found to show regionally significant variation, independent of speech, and accessible without the conjectures required in historical phonology. (4) On this basis, it was possible to build up a typological framework that allowed the localisation of more than a thousand Late Middle English texts, (5) each shown to be dialectally consistent.

It might seem to be a straightforward matter to combine these resources with database technology, and to produce a full description of Middle English based on electronic searches. As far as describing Middle English orthography is concerned, the procedure is indeed relatively simple, assuming that enough time and manpower are available for entering the data. The challenges mainly relate to practical analytical procedures, such as using a sensible classification system and designing a database that makes possible an efficient analysis of the spellings.

Matters become more complicated as soon as we go beyond the purely descriptive level. In order to explain or interpret spelling patterns, reference to the spoken mode becomes necessary, and once we wish to consider how the spelling maps on to Middle English phonology, there are considerable theoretical problems involved. How these are to be solved depends largely on our view of the relationship between writing and speech.

There are two main theoretical traditions as regards writing and speech: one that sees writing essentially as a way of encoding speech and another that sees writing and speech as two parallel, largely autonomous systems. These two approaches have sometimes been called "relational" and "autonomistic" respectively (Sgall 1987: 2-3). While the LALME methodology is based on the second one, the first still appears to have considerable currency, especially among theoretical linguists. As the question is of fundamental importance for the interpretation of historical written data, it should be considered from the outset.

2. Man schrieb wie man sprach?

The use of spelling as evidence for the reconstruction of spoken language belongs to the traditional ways of studying historical stages of languages. In a classic article on the methods of historical phonology, Herbert Penzl (1957) considered the following kinds of evidence:

-- Orthographic evidence

-- Orthoepic evidence

-- Metrical evidence

-- Comparative evidence

-- Contact evidence

A similar list is given by Roger Lass (1992: 27-28); reflecting an essential difference in theoretical standpoint, it lacks contact evidence but adds general linguistic theory.

For the Middle English period, spelling is usually considered the most important type of evidence; the others are either by their nature secondary (comparative evidence), very problematic (metrical evidence) or scarce (contact and orthoepic evidence). However, the use of Middle English spelling as phonological evidence is highly problematic in itself; in part, this is due to reasons different from those relating to spelling evidence from other periods.

For most of the history of English, and the history of many languages, there has been a general model or standard of spelling. For such periods, the conventional nature of the spelling system is obvious, and spellings are not assumed to reflect in detail the speech of the individual writer. The study of spellings then concentrates on changes and aberrations: the introduction of new contrasts, back spellings, occasional or "naive" spellings, and so on.

During the Middle English period, for historical reasons, there was no single model for spelling; as a consequence, there are no occasional spellings, or all spellings might be considered occasional. In such a situation, it may be difficult to distinguish between a spelling that reflects an aspect of pronunciation and one that simply shows varying spelling conventions.

The traditional view has been that Middle English spelling, on the whole, reflects pronunciation. Luick (1921-40: [section] 27) formulated this in a classic statement: (6)

As early as the first half of the twelfth century records emerge which show the break with the old tradition. In the end this tendency prevailed, and one wrote as one spoke. From that time forth, till late in the fourteenth century, all English written records proceed in local dialects.

The assumption in most earlier work on Middle English appears to be that authorial texts as a rule reproduce the "real" spoken dialect of the authors, while the messiness of the majority of surviving texts reflects scribal contamination. The latter view, as noted above, has been modified by the LALME work. The former, on the other hand, has remained largely unproblematized.

A general statement such as "one wrote as one spoke" allows for a wide range of interpretations. Clearly, it matters which level of language we consider: authorial or scribal choices of vocabulary, or of syntactic structure, will depend on factors very different from those involved in spelling (cf. Benskin and Laing 1981: 93-96). The present discussion will focus on the levels of spelling and phonology.

Luick's statement may simply be taken to mean that the characteristics of written language produced in a particular area have a connection with the characteristics of the spoken language in the same area. That such regional connections existed in Middle English can hardly be denied. For example, it is not unfair to assume that the general predilection for spelling words like man or ram with an <o> in the West Midland area was related to the common pronunciation of those words in this area, especially as this is supported by twentieth-century dialectal evidence (see e.g. Wakelin 1982).

However, it is one thing to say that written records proceed in local dialects or reflect local dialects, and quite another to treat them as if they were a faithful recording of speech, or as S. R. T. O. d'Ardenne put it, "a phonetic transcript from the mouths of peasants" (d'Ardenne 1936: 178). It is not unusual to find examples of what looks like the latter approach in the standard textbooks. Jordan (1968: 51), for example, has the following observation under the treatment of OE ae:

Auch Lay[amon] A. hat etwas helleren Laut als a. [Lay[amon] A., too, has a somewhat brighter sound than a.]

The statement is problematic for various reasons; most obviously, a text cannot have a sound. However, even if hat is taken as shorthand for a more complex relationship, it is still unclear what this relationship might be, and how we should interpret the helleren Laut. As Jordan wrote before phoneme theory, he clearly could not specify whether he meant an allophonic realisation of a or a different phoneme. However, it is also unclear whether "Layamon A" refers to the manuscript itself, the scribe of the manuscript, or even to the author, and how the information about pronunciation is derived from the written data.

The problem with statements such as the above is that they all too often are taken at face value, without problematising the evidence. A vague idea that Middle English writing habits reflect regional speech is not sufficient for the interpretation of specific spellings; any attempts at such interpretation must proceed according to clearly defined principles, and should be firmly based on a theory of the relationship between writing and speech. It is the aim of the present paper to suggest, in a preliminary way, such a framework for interpretation. Before proceeding so far, however, it may be of interest to look at the question of OE ae in Lazamon A, which will provide a useful example case for the further discussion.

3. OE ae in Lazamon A: A case of free variation?

The Historia Brutonum, or Brut, by Lazamon survives in two manuscripts, BL Cotton Caligula A ix and BL Cotton Otho C xiii. The manuscripts are now thought to be roughly contemporary, and they have been dated to the end of the thirteenth century. The Caligula version, often referred to as Lazamon A, is generally considered to be closer to the authorial version of the text, while the Otho version (Lazamon B) represents a considerable reworking. Of the two versions, A is longer, and it is notable for its conservative language, including a virtually native vocabulary with hardly any loanwords. The spelling is variable and has been thought to include archaistic elements (Stanley 1969: 23-28). Margaret Laing (1993: 70) notes that "[t]he orthography, especially of the vowels, is very variable as though the system had not yet settled into a coherent form". For LAEME, the Caligula text has been localized in Northwestern Worcestershire, in the area connected with the author, Lazamon (Areley Kings) (Laing 1993: 70 and personal communication); the language of the manuscript is, accordingly, thought to represent the same dialect as that of the original.

Whether the Caligula text, or Lazamon A, is the work of one or two scribes does not appear to be a resolved question. In the Catalogue of sources for LAEME, Margaret Laing (1993: 70) presented it as her own view, as well as that of Angus McIntosh, that the text was the work of a single scribe. (7) Later work by Laing, as well as by Frances McSparran, (8) assumes two scribes; however, the evidence is not conclusive (Laing, personal communication). The main difference between the orthographies is the greater number of variants found in the stretches ascribed to Scribe B. This need not be significant in itself, considering the much greater length of these stretches compared to those ascribed to Scribe A.

For a full consideration of the question of OE short ae in Lazamon A, the data should ideally be placed in the context of all vowel spellings in both A and B stretches. Such a study is beyond the scope of the present paper, but will, it is hoped, be the subject of a separate one. The following discussion will restrict itself to the subsystem of short unrounded low and mid vowels in the portions ascribed to Scribe A. The data are summarized in Table 1, and include all the spellings for OE a, ae, ea and e. (9)

The words containing the reflex of OE short ae in the Scribe A portion show variation between three main spellings, <a>, <ae> and <e>, as well as occasional occurrences of <ea> and one <eo>. The variation between <a> and <ae> looks like what is traditionally known as "free variation", with no obvious contextual constraint. Of the <e> spellings, on the other hand, the vast majority occur in low-stress words (wes, hefde), with only a handful of exceptions (creft, gersume).

The reflexes of Old English vowels that are regularly diphthongised in Middle English, such as daeg 'day', are not included, as the spelling of diphthongs follows quite a different pattern. Environments causing breaking, retraction, back mutation or smoothing show various developments in the Old English dialects. The present data show a clear difference between WS ea before l + consonant groups and in other contexts; while all the latter show a variable pattern similar to that of the reflex of OE ae, the former goes with OE short a, showing regular <a> spellings. Accordingly, the reflex of Germanic a before l-groups may be considered to reflect OE a rather than ea (the result of retraction rather than breaking), and is classified separately.

The organisation of the data in Table 1 shows that the orthography of Lazamon A (Hand A) is by no means chaotic, at least with regard to this vowel set. The usage is certainly variable, in that the reflexes of OE ae and ea appear with four different spellings, two of which are also regularly used for short vowels in other contexts. However, with the exception of very few minor variants, most of which can be explained individually (see Table 1), the usage in full stress environment is regular enough to be described as follows:

1) The reflex of OE short a always appears as <a>.

2) The reflex of OE short e always appears as <e>.

3) The reflex of OE short ae may appear as <a> or <ae>, and very occasionally as <e> or <ea>.

4) The reflex of OE short ea is relatively rare, and appears as <a>, <ae>, <e> or <ea>.

It may be noted that the single completely aberrant spelling variant, <eo> for OE short ae, only appears in a word that normally occurs in low-stress position: <weos> for was.
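The kind of tabulation that lies behind such a summary can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure actually used for Table 1: the function name tabulate and the token format are assumptions, and apart from the four forms cited in the text (wes, hefde, weos, creft) the tokens are invented placeholders rather than attested data.

```python
from collections import Counter, defaultdict

# Each token: (written form, OE source vowel, stress, spelling of the vowel).
# The first four forms are cited in the text; the PLACEHOLDER tokens are
# invented solely to make the sketch runnable and carry no evidential value.
TOKENS = [
    ("wes", "ae", "low", "e"),
    ("hefde", "ae", "low", "e"),
    ("weos", "ae", "low", "eo"),
    ("creft", "ae", "full", "e"),
    ("PLACEHOLDER-1", "ae", "full", "a"),
    ("PLACEHOLDER-2", "ae", "full", "ae"),
    ("PLACEHOLDER-3", "a", "full", "a"),
]


def tabulate(tokens):
    """Count vowel spellings for each (OE vowel, stress) cell."""
    table = defaultdict(Counter)
    for _form, oe_vowel, stress, spelling in tokens:
        table[(oe_vowel, stress)][spelling] += 1
    return table


table = tabulate(TOKENS)
for (oe_vowel, stress), counts in sorted(table.items()):
    row = ", ".join(f"<{s}> x{n}" for s, n in sorted(counts.items()))
    print(f"OE {oe_vowel:<2} {stress:<4} {row}")
```

Keeping stress as a separate key is what allows the low-stress <e> and <eo> forms to be set apart from the full-stress pattern described in 1) to 4).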

The question, then, is how the variation in 3) and 4) might be explained or interpreted, and whether it might have any connection with pronunciation, such as the "helleren Laut" suggested by Jordan. A consideration of this should begin by reviewing the theoretical basis for the interpretation of spelling.

4. Writing and speech: The relational and autonomistic approaches

There are various models of the relationship between writing and speech in twentieth-century scholarship. The main division of opinion concerns the question how far writing forms an autonomous system. Many scholars have held that writing essentially mirrors speech. A very clear statement to that effect was given by Roger Lass in Historical linguistics and language change:

[W]e don't seem to have much if anything of a theory of orthographic ... change. My guess is that there can't be any such theory, because spelling as a language-faculty 'module' ... is probably not nearly as autonomous (if it is at all) as phonology. The safest bet would seem to be taking orthographic variation as (roughly) mirroring phonological; the picture represented by Vices and virtues is rather like a set of field-recordings of utterances. That is, the graphic shape that surfaces represents the phonological shape that would have surfaced if the utterance had been in another medium (Lass 1997: 65).

Accordingly, Lass has tended to use spelling data quite directly as though they provided a phonemic, even phonetic transcript (e.g., Lass 1987).

This kind of view, termed "relational" (Sgall 1987: 2-3), has sometimes been combined with a variationist approach; historical spelling variations may then be treated as quantifiable phonological data. The following statement by James Milroy allows for such an interpretation, even though it does not presuppose it:

For written Middle English ... our access to variation is direct, and it is this primary source that we use to reconstruct the diversity of spoken Middle English (Milroy 1992: 156).

If we combine Milroy's statement with the idea that writing reproduces speech, we can assume a direct relationship between the kinds of variation present in the two modes. For example, a speaker who habitually varies between /o/ and /a/ in his/her pronunciation of man would produce the equivalent variation (<mon> and <man>) in his/her spelling. The relationship might be seen either in terms of direct correspondence (the spelling <o> always stands for spoken /o/) or covariation (the proportions of the written variables correspond to the proportions of the spoken ones). The assumption of such a direct relationship lies behind much of the work on Old English dialects conducted by Thomas Toon (e.g. Toon 1976, 1983 and 1987).

Along such lines of reasoning, one might try to interpret the spelling variation in Lazamon A (Hand A) as corresponding to spoken variation. The first question would, then, be what the spellings <a>, <ae> and <e> might reflect. According to Roger Lass (1992: 30-31), we can usually make educated guesses about the kind of sounds the letters in historical texts stand for. It would be natural to assume that both <ae> and <a> relate to open unrounded vowels, and that <ae> probably relates to a non-back vowel; if <e> relates to a different vowel, this would probably be still fronter and closer.

It could, then, be assumed that the writer (10) who produced the text varied in his speech between two vowel qualities, a front and a back open vowel, and showed this variation directly in his writing. If the vowels were phonemically distinct, the back variants may have belonged to the phoneme regularly realised as <a>, that is, the reflex of OE a, and the variation could be taken to reflect an incomplete merger of /ae/ and /a/. This would not, however, explain the <e> and <ea> spellings, nor the lack of variation in the a set.

A more serious problem with such an explanation is that it depends on a view of writing that is scarcely tenable. If we assumed a direct reproduction of spoken variation, whether phonemic or subphonemic, we would need to envisage the activity of writing as a kind of continuous transcription process. The writer would begin with an actual pronunciation, either aloud or in his/her mind, or perhaps in connection with dictation, and transcribe it, so that each actual graph on the page is a realisation of a spoken phone. It appears that Roger Lass pictures this kind of scenario in early English when he writes that "the picture represented by Vices and virtues is rather like a set of field-recordings of utterances".

If we could assume that this were the case, historical phonology would be easy indeed. However, writing simply does not seem to work that way. The function of writing is not to record speech -- it is to communicate linguistic utterances in a different mode from speech. This is a point made perhaps most eloquently by Josef Vachek:

It is often overlooked ... that speech utterances are of two different kinds, i.e. spoken and written utterances. The latter cannot be simply regarded as optical projections of the former. To difference of material ... is added another difference ... that is to say, a difference of functions. ... Writing is a system in its own right, adapted to fulfil its own specific functions, which are quite different from the functions proper to a phonetic transcription.

(Vachek 1945-49 [1976]: 128, 132)

Most twentieth-century scholars that have taken an active interest in the written language have, in fact, tended to stress its relative autonomy: such scholars include Vachek, Pulgram, McIntosh, Samuels and Haas. The main features of this "autonomistic approach" have been usefully outlined in an article by Petr Sgall (1987); the article summarizes, in approximately the following terms, what could be called the basic common points relating to this approach: (11)

1) All alphabetic spelling systems are based on a relationship between the letter and the phoneme

2) Spelling and phonology form related but autonomous systems

When an alphabetic spelling system is first developed, there is normally a close letter-to-phoneme correspondence, even though it may not be absolute. For example, the letter <n> always refers to the phoneme /n/, and the phoneme /n/ can only be referred to by <n>. Petr Sgall calls the 1:1 relationship the prototypical one.

There will not normally be distinct spellings for allophones: such spellings are by definition not necessary for the purposes of intelligibility, and would instead impede communication in writing. We cannot normally expect to find direct information about subphonemic variation.

The moment a writing system is put into use by a community it assumes an autonomous existence. When a person writes, if he has any proficiency at all, he does not recite the words and then transcribe them; instead, he has a ready formula for spelling that he recalls and reproduces. For example, in order to write knee, a literate English speaker does not need to go via the pronunciation and then apply some kind of rule stating that <kn> corresponds to /n/; rather, the letter sequence <knee> is stored in the speaker's memory as a spelling for the word, just as the phonemic sequence /ni:/ is stored as the pronunciation.

Writing works with its own set of conventions, whether there is a single standard written norm or not. Consequently, on the individual level, a person can change his pronunciation in some aspect without making the equivalent change in spelling. The long-term effect is, as is well known, that letter-to-phoneme correspondences tend to become more complex with time. To take a simple example, the correspondence of <n> to /n/ appears to have been a 1:1 relationship in Old English; however, in Present Day English, it has become considerably more complex (/n/ can correspond to at least <n>, <nn>, <kn> and <gn>, while <n> in certain contexts stands for /ŋ/). Synchronically, in the absence of an overall standard, there will be different written conventions that do not necessarily reflect spoken differences.
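The contrast between a prototypical 1:1 correspondence and the more complex later state can be made concrete with a small Python sketch. It uses only the <n> example given above; the function names are hypothetical, and "ng" is an ASCII stand-in for the velar nasal, which is this sketch's reading of what <n> stands for "in certain contexts".

```python
from collections import defaultdict

# (spelling, phoneme) pairs taken from the <n> example in the text.
# "ng" is an ASCII stand-in for the velar nasal (an assumption of this sketch).
OLD_ENGLISH = [("n", "n")]
PRESENT_DAY = [("n", "n"), ("nn", "n"), ("kn", "n"), ("gn", "n"), ("n", "ng")]


def correspondences(pairs):
    """Return two maps: spelling -> phonemes and phoneme -> spellings."""
    to_phonemes = defaultdict(set)
    to_spellings = defaultdict(set)
    for spelling, phoneme in pairs:
        to_phonemes[spelling].add(phoneme)
        to_spellings[phoneme].add(spelling)
    return to_phonemes, to_spellings


def is_one_to_one(pairs):
    """True if each spelling has one phoneme and each phoneme one spelling."""
    to_phonemes, to_spellings = correspondences(pairs)
    return (all(len(v) == 1 for v in to_phonemes.values())
            and all(len(v) == 1 for v in to_spellings.values()))


print(is_one_to_one(OLD_ENGLISH))  # True
print(is_one_to_one(PRESENT_DAY))  # False
```

The two maps are kept separate because the relationship can fail in either direction: one phoneme with several spellings, or one spelling with several phonemes.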

5. Writing and speech: The Middle English situation

It is possible to overstate the autonomy of written language. McIntosh (1956 [1989]: 11) was perhaps close to doing so when he claimed that written language could be studied as though its users were deaf and dumb; however, he hardly meant that to be taken literally. Writing can be, and is, influenced by speech, both on the level of the community and the individual. Samuels (1972: 5) notes that "there is a separate continuity of change and development within each medium, but ... the two must keep in step, for, if the level of correspondences between them drops, the written language becomes an uneconomic medium".

This may well have happened towards the end of the Old English period. Certainly the Early Middle English period is a time of much adjustment and remodelling. John Anderson and Derek Britton, in an important article, have suggested that the writers of English in the twelfth and thirteenth centuries often were "lesser Orms", taken to spelling reform by necessity:

It must have been the case, wherever and whenever scribes had a need to write in contemporary English, that new systems had to be devised; and by their very newness and their local character those several new spelling systems came to reflect the phonology of the period and place of their devising far more adequately than the old national standard would have done.

(Anderson and Britton 1999: 303)

The writers may somewhat overstate the case in the first part of the sentence: not all writers of English worked in isolation even during this period, and it is surely unreasonable to think that every attempt to write English would have required the devising of a new system. However, on the whole English spelling did go through a phase of intensive modification, emerging as a far more flexible and variable medium, with a closer relationship to the spoken mode than had been the case in the preceding period.

New systems were, however, not invented from scratch, as Anderson and Britton (1999: 303) note: "these new orthographies were not complete reinventions of English spelling, but modifications which plainly imply some knowledge of the features of Old English spelling and letter-shape". Apart from Old English conventions, available from the frequent copying of Old English texts, and, at least in part, from handed-down tradition, the scribes shared a common knowledge of the orthographic systems of Latin and, to a lesser extent, Old French.

There is also no particular reason to assume that Early Middle English writers, as a rule, actively set out to create or reform spelling systems in order to make them correspond to pronunciation. The great majority of people setting out to write in English, even in this period, had presumably been exposed to written English before. Writing is, almost by definition, a social activity, and the case of a person literate in Latin and/or French setting out to create an English spelling system from scratch would have been exceptional indeed. Spelling reform based on phonological principles is hardly feasible as a general development, bearing in mind that the normal function of writing, that of communicating utterances, relies on familiar, not phonological, spellings.

Middle English scribes were not writing in a vacuum untouched by convention: the people who could write, including the scribe of Lazamon A, had learnt conventions for writing. The "new spelling systems" would be modifications of these conventions, made by selecting those elements of existing materials that were felt to be most suitable for their purpose: not as a transcript of speech, but as an economic and appropriate medium of communication within a given community.

6. The interpretation of spelling

The variation in Lazamon A does not represent recorded field utterances, and Middle English texts cannot be used as though they were such. The question is what kind of phonological evidence it is possible to derive from a Middle English text. The conclusions so far have been mainly negative; however, it would be unreasonable to assume that such plentiful and varying spelling data could not tell us anything specific about speech. Before the question can be addressed, however, two preliminary notes are necessary.

Firstly, it is important to note that not all spelling variation has any potential equivalent in speech. In the LALME terminology, introduced by Angus McIntosh (1974 [1989]: 46), variants with a potential counterpart in speech are called S-features, while purely graphic variables are W-features. As an example, the man/mon variation referred to above represents an S-feature, while the variation of <sh> and <sch> spellings in forms like shal/schal represents a W-feature. To distinguish between the two is not always straightforward; as a case in point, the spelling <x> as in xal 'shall', typical of Norfolk texts, has been interpreted both as a graphic variant corresponding to <sh> and as a spelling for a rare consonant phoneme or cluster, such as /C/ or /ks/ (Smith 2000).

In the case of OE ae in Lazamon A (Hand A), at least part of the variation does involve S-features: the variants have potential spoken counterparts, viz. the vowels mentioned in the previous section. At the same time, it may be noted that <ae> and <e> were W-level variants in medieval Latin, and this use may to some extent have been transferred into Early Middle English orthographies; moreover, the merger of a and ae in Early Middle English, whatever their phonemic status in Old English, would have made possible a similar, purely graphic, variation between the letters <a> and <ae>.

As far as S-features are concerned, another preliminary note is needed. It is important to distinguish between general information about dialect areas and the evidence of individual texts. In Middle English, the written conventions of different areas would naturally reflect characteristics of the spoken dialect, at least in the not too distant past. The LALME maps can thus provide very suggestive information about the dialectal spread of various phonological features, such as the reflex of OE /y(:)/. For example, if one area in the fourteenth century abounds in spellings such as <busy> and <fur> for busy and fire, we may reasonably assume that front close rounded vowels had survived particularly long in this area, unless, of course, there is reason to think that the written tradition is unusually conservative. On an individual level, as we have seen, the evidence is less direct; a spelling <busy> does not necessarily reflect a rounded vowel in the dialect of the scribe, any more than it does today.

What, then, can be deduced from the spellings of an individual text? While written variation does not reflect speech in a direct way, various well-known types of written variation provide evidence for phonemic change. Most obviously, back spellings, or hypercorrections, are generally considered reliable evidence for mergers. If a scribe writes <e> for the reflex of OE eo we cannot be certain that the spelling is not just part of the spelling conventions he has learned; however, if he also uses a spelling such as <eo> or <oe> for e, there is good reason to think that the two vowels have merged in his spoken system.
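The reasoning about back spellings can be sketched as a simple decision procedure in Python. This is a deliberately simplified illustration under stated assumptions: the function name assess_merger, the three outcome labels and the three scribal profiles are hypothetical, and the sketch treats any back spelling as pointing to a merger, which is cruder than the weighing of evidence a real analysis would require.

```python
# Spellings historically associated with each etymological class, following
# the example in the text (<eo>, or its variant <oe>, for OE eo; <e> for OE e).
TRADITIONAL = {"e": {"e"}, "eo": {"eo", "oe"}}


def assess_merger(usage, changed="eo", target="e"):
    """Classify a scribe's usage as evidence for a merger of two classes.

    usage maps an etymological class to the set of spellings the scribe
    uses for it. The labels returned are this sketch's own.
    """
    # Is the changed class ever written with the target class's spelling?
    new_spelling = bool(usage.get(changed, set()) & TRADITIONAL[target])
    # Is the target class ever written with the changed class's spelling?
    back_spelling = bool(usage.get(target, set()) & TRADITIONAL[changed])
    if back_spelling:
        return "merger likely"
    if new_spelling:
        # May simply be a learned spelling convention.
        return "inconclusive"
    return "no evidence"


# Hypothetical scribal profiles, invented for illustration only.
scribe_1 = {"eo": {"eo", "e"}, "e": {"e"}}
scribe_2 = {"eo": {"eo", "e"}, "e": {"e", "eo"}}
scribe_3 = {"eo": {"eo"}, "e": {"e"}}

for name, usage in [("scribe_1", scribe_1), ("scribe_2", scribe_2),
                    ("scribe_3", scribe_3)]:
    print(name, assess_merger(usage))
```

The asymmetry between the two checks is the point: <e> for OE eo alone may be a convention the scribe has learned, whereas <eo> for OE e is hard to explain unless the two vowels were no longer distinct for him.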

Whether or not combined with hypercorrection, orthographic variation involving a specific feature, standing out in an otherwise fairly regular text, is generally a sign of a problem area of some kind. This might indicate some instability of letter-phoneme correspondences, as a result of a change or clash of systems. Occasionally, it is justifiable to use negative evidence to draw the opposite conclusion: if a scribe who otherwise shows variable usage distinguishes very regularly in spelling between two phonemes, we may conclude that it is at least very unlikely that he would not make the equivalent distinction in speech. Scribal misreadings, finally, are a small but sometimes illuminating category of evidence about the spoken mode.

Spelling evidence may also be used in conjunction with various other kinds of evidence, the other evidence providing a point of reference according to which the spelling data can be mapped. The evidence of rhymes and alliteration may thus provide a useful point of comparison to spelling evidence. Comparison with orthographic and phonological evidence from other periods may, similarly, be illuminating. Further, comparison with other contemporary systems, including those of other languages in contact with English, is an important source of evidence, even if the extent of direct influences and correspondences is sometimes difficult to assess.

Most surviving Middle English texts are, of course, scribal rather than authorial. Scribal behaviour has been much studied over the last few decades, and new methods have been devised for the analysis of scribal texts. It is often possible to define a scribe's behaviour fairly precisely, either as translation, literatim copying, or constrained selection, the last admitting of varying degrees of tolerance. (12) In many cases, it is then possible to separate scribal layers, and to define which written forms belong to the scribe's own active repertoire, and which are acceptable to him/her. This information, seen against the linguistic and textual context, may then be used in connection with phonological interpretation. Such techniques work best with texts surviving in several copies, where much is known about the linguistic background and context of the text, but they may yield useful information even in the case of a single copy.

In all, there are numerous methods available for the interpretation of Middle English spelling. It may be noted, however, that assumptions about the actual quality of sounds can on the whole be made only on the regional level; as regards individual texts/writers, the information is mainly systemic and structural.

The problem is how to combine and apply these methods to make sense of an extremely complex field. The interpretation must take account at least of the following aspects: the specific problems pertaining to authors and scribes, including different scribal strategies and layers; external factors of linguistic variation, including factors relevant to the spread of orthographic conventions; diachronic change; textual context; and the interaction between the written and spoken levels. A further complication is that Early Middle English scribes were generally used to copying texts in Latin, and sometimes in French; thus interference from other orthographies may need to be considered as well. Finally, towards the end of the period, varying degrees of standardisation will also play a role. The interpretation of Middle English spelling should ideally be able to take all this into account.

What is needed is the theory of which Roger Lass notes the absence: that is, a theory of specifically orthographic variation and change. Such a theory needs to be flexible enough to cater for all the different factors outlined above; it should be general enough to be transferable to other (non-standard) languages and periods, and at the same time specific enough to have explanatory force. It will need to be tried out in practice, over time, on as many kinds of material as possible. Accordingly, this is not the place to present a ready theory, but rather a tentative suggestion towards one.

7. Towards a theory of orthographic variation

Weinreich, Labov and Herzog (1968: 166-167) defined the traditional linguistic code or system as "a complex of interrelated rules or categories which cannot be mixed randomly with the rules or categories of another code or system". This definition had led to some unnecessary complexities being imposed upon linguistic data, as any instance of mixing categories had to be defined as code-switching. The solution to this, presented by Weinreich et al., was to introduce variable rules.

The introduction of a systematic study of variation has had wide-reaching effects on modern linguistics. Most importantly, it has turned the focus on linguistic performance, and showed that variability in language is both functional and orderly. We no longer expect variation to be "free", even though it may appear random when we lack the necessary information; one of the main objectives of modern variationist and sociolinguistic studies has been to uncover the external factors that govern linguistic variation.

In the last few decades, numerous attempts have been made to apply the variationist framework to historical periods, with many interesting and groundbreaking results. The approach has, however, one major weakness when applied to historical periods: while modern sociolinguistic studies are able to isolate specific non-linguistic factors, such as age, sex or social class, and can study their covariation with linguistic variants in a controlled way, such information is generally unavailable for medieval texts. The modern variationist methods have, accordingly, been modified in various ways for the purposes of historical study.

Even so, it seems that the variationist paradigm in itself is not sufficient for the analysis of medieval orthography. In particular, the concept of variable rules is not very helpful with regard to the type of evidence medieval linguists are dealing with: the individual written text with its unique history.

For example, it is possible to describe the reflex of OE ae as a "variable ae", which in Lazamon A (Scribe A) may be realised as <a>, <ae>, <e> and occasionally <ea>. Following standard variationist or sociolinguistic methodology, one can go on to study the occurrence of these different variants in a sample of texts of the same period, and attempt to isolate the factors relevant for their use. The main problem here is that the information we have about external factors is very limited indeed. Some factors may, of course, be retrieved: extralinguistic factors like text genre, manuscript connections, script type and so on, as well as date and localisation, may be defined and used for quantifying the evidence. The Middle English Grammar database, in fact, will contain information about such factors, and is designed to allow for this type of analysis. (13)

However, while such an approach should prove useful for detecting general patterns, these patterns will always need to be related to the usage of the individual texts. Historical texts are not usually altogether comparable, nor can they easily be made to form a basis for valid statistical conclusions. The problem is that the approach does not in itself take into account the complex, and unique, nature of the individual Middle English text, whether authorial or scribal.

A text language, the language of an individual text, can be described as a Weinreichian system consisting of variable rules. However, it seems to the present writer that the explanatory power of this idea is insufficient. Of course, the text language as it stands is usually all we have -- we seldom have access to the immediate exemplar of a copy, for example. However, the variants present in a text are not necessarily most sensibly analysed as organic parts of a single system, covarying with external factors. Much written variation simply falls outside such a framework of explanation. Most importantly, the interaction between the language of the exemplar and that of the scribe sometimes produces an output that cannot, in all its detail, sensibly be taken to represent a system pertaining to one particular time or place. It makes, of course, good methodological sense to postulate as few layers as possible, and to include as many forms as possible within a single scribal repertoire. At the same time, the analysis will need to take into account all kinds of variation, whether or not they fit into such a repertoire. Furthermore, the kind of explanations relevant for written variation, often pertaining to textual aspects, presuppose the simultaneous consideration of more than one linguistic system.

It would thus seem more promising to choose the other possible solution to the problem of the inflexible system, mentioned at the start of this section. Rather than a single system consisting of variable rules, a text language could be seen as the orderly and analysable outcome of the dynamic interaction of different systems.

The systems naturally include the scribal layers studied in the LALME tradition. However, in order to cope with the complexity of the material, they need to be flexibly defined, to include all types of written and spoken systems that may come into contact during the history of a text. Needless to say, we can never hope to recover such information in detail, nor would that be a particularly sensible aim. However, the concept of interacting, or layered, systems would seem to be a promising one as an explanatory framework for the study of orthographic variation. On this basis, we could formulate the following hypothesis:

All written variation that is not conditioned by orthographic context is the result of a clash between two or more systems.

There is an obvious parallel to conditioned phonetic variation, even though the two are not directly connected. An example of conditioned orthographic variation is the use of <i> and <y> by most Middle English writers; the latter tends to occur mainly in minim environments. Such conditioned variation may be extended to other environments, cf. the use of <y> where no minims are involved, e.g. in words such as yt, hys 'it, his' or in scripts where minim confusion is not an issue.

Where no orthographic conditioning is present, it is suggested that the variation is the result of a clash, or, to put it in less violent terms, a contact situation between different systems. A system may, then, be defined in a flexible, non-committal way, rather like the sociolinguistic term "variety". Systems may be spoken or written; they can represent regional varieties or standardised norms, and they can belong to different languages. They may interact on a synchronic level: a writer may be competent in more than one written system, or there may be discrepancies between the writer's spoken and written systems. There may also be a written standard or similar model that influences the writer's choices. The systems may also interact on a diachronic level, as in the case of a scribe copying from an exemplar containing an earlier form of language. Repeated copying may lead to an accumulation of interacting systems, although, as Benskin and Laing (1981: 79-81) have demonstrated, so-called Mischsprachen (14) seldom reach a huge complexity.

These different kinds of possible systemic clashes may be summarised as follows:

written system 1 / written system 2

written system / standard or similar model (when not definable as written system 2)

written system / spoken system

written system / spoken system 1 / spoken system 2

scribe's written system / exemplar

scribe's written system / exemplar system 1 / exemplar system 2

The resulting patterns of variation are conditioned by factors such as function and context, and, in the case of scribal texts, by the copying habits of the scribe. A text produced by a translating scribe would mainly reproduce the written system of that scribe, with some modification caused by interaction with the system of the exemplar. This modification would be considerably more thoroughgoing in the case of a constrained scribe, while a literatim scribe would essentially reproduce the system of the exemplar, with some variation in areas where his own written usage differs widely from that of the exemplar.
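
The three copying strategies can be sketched in idealised form as follows. This is a simplification under stated assumptions, not a description of any actual scribe: the function and parameter names are hypothetical, the "tolerated" set stands for forms outside the scribe's active usage that are nevertheless acceptable to him, and the residual modification and slippage mentioned above are left out.

```python
def copy_form(exemplar_form, own_form, strategy, tolerated=frozenset()):
    """Return the form an idealised scribe would write.

    strategy:
      "translation" - always substitute the scribe's own form
      "literatim"   - always reproduce the exemplar form
      "constrained" - reproduce the exemplar form only if it is the
                      scribe's own form or one he tolerates; otherwise
                      substitute his own (an assumed reading of
                      constrained selection)
    """
    if strategy == "translation":
        return own_form
    if strategy == "literatim":
        return exemplar_form
    if strategy == "constrained":
        if exemplar_form == own_form or exemplar_form in tolerated:
            return exemplar_form
        return own_form
    raise ValueError(f"unknown strategy: {strategy!r}")


# Illustration with two forms attested in the text under discussion.
print(copy_form("haefde", "hafde", "translation"))
print(copy_form("haefde", "hafde", "literatim"))
print(copy_form("haefde", "hafde", "constrained", tolerated={"haefde"}))
print(copy_form("haefde", "hafde", "constrained"))
```

Widening or narrowing the tolerated set is one way of representing the varying degrees of tolerance that constrained selection admits of.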

8. An interpretation of the variation in La3amon A (Hand A)

The above suggestion for an interpretative framework should, finally, be tested using the example of spelling variation in La3amon A (Hand A). The first question is what systems may be involved in the text language.

First of all, systems must not be postulated freely. As the concept is flexible, and our knowledge of historical language states is imperfect, systems must be defined according to strict guidelines if they are not to be applied as a meaningless cover-all explanation. Systems must, therefore, as far as possible be based only on recorded facts or, at the very least, respectable hypotheses; they should never be invented in order to fit the data.

In the case of La3amon A, the natural system to begin with might be the late Old English spelling system. In an influential article, Eric Stanley (1969) drew attention to the archaic, and possibly archaistic, spellings of La3amon, which survive in the Caligula version. He connected these with the conscious use of the late Old English Schriftsprache as a model, and noted La3amon's closeness in time and place to a known thirteenth-century student of Old English, the Tremulous Hand of Worcester. (15)

Most varieties of Old English appear to have had a short vowel system with the following unrounded open and mid vowels: a, ae, e and ea. Distributional differences concern mainly the environments for breaking, back mutation and smoothing; however, if the so-called "second fronting" also involved a real sound change, the varieties affected by it would have had a radically different distribution of these vowels.

In Late Old English, we learn in the handbooks, ea merges with ae:

Towards the end of the OE period the diphthongs /eo, eo/ and /aea, aea/ became monophthongs in almost all dialects ... the monophthongization was straightforwardly to /ae(:)/ by the loss of the second element.

(Hogg 1992: 215)

Later, the resulting short vowel merges with a. According to Hogg (1992: 217) and Campbell (1959: 136), this merger is attested in spellings (<a> for ae) from 1100 onwards, but, as Campbell notes, the change did not seem to happen simultaneously in all dialects: the "AB-language" in the first half of the thirteenth century still differentiates between the two vowels in spelling. On the whole, the evidence relating to this later merger is difficult to interpret in terms of orthographic and phonological development.

It could be assumed that the La3amon A text at an earlier textual stage was spelt using a system that reproduced the Old English spelling conventions for the vowel set here considered. The writer who produced this had, however, naturally merged /ae/ and /ea/ in his spoken system. As a result, there would almost certainly be some interchange between <ae> and <ea> spellings in words of both historical sets. As the distinction between /ae/ and /a/ appears to have been retained to the mid-thirteenth century at least in parts of the Southwest Midland area, the writer may not have merged these vowels. In that case, he would distinguish regularly between them, spelling them <ae> and <a> respectively. Alternatively, if the two vowels were merged, some interchange between the spellings might be expected. This writer may well have been La3amon himself, although in principle it could also be a translating scribe imposing his own system on the text. On the whole, however, it is probably most likely that all the archaic, Old English-based, characteristics of the language go back to the author.

We may then consider what happens when such a post-Old English system is copied by an ordinary, mainstream English scribe of the late thirteenth century. Whatever the details of his written and spoken systems otherwise, it is fairly certain that the merger of a and ae has taken place in his spoken system, and that his normal writing system uses <a> for the resulting phoneme /a/, and <e> for the mid-close phoneme /e/. Any other system at this date would be exceptional.

Precisely what happens when he confronts the archaic system of his exemplar depends on his copying habits. If he had a preference for translation, the resulting system would only show <a> and <e>, possibly with some irregularity in the case of words he would not understand; this would be the most likely result in the late Middle English period. It would not, however, lead to the large-scale variation between <a> and <ae> in Lazamon A (Hand A).

However, in the thirteenth century most scribes were still trained in copying Latin, a practice that presupposed literatim copying. A scribe used to copying Latin would often extend the literatim habit to English copying as well. However, if the scribe was not familiar with the system of the text he copied, he would be likely to slip towards his own usage when copying. The resulting copying behaviour would essentially be constrained selection, with a tendency towards literatim copying. It would, moreover, be expected that the literatim tendency would grow stronger as the scribe became more used to the usage of the exemplar. Such a development seems, in fact, to take place in the Caligula text (cf. Stanley 1969: 23, 25-26).

The late thirteenth-century scribe would, then, work as follows. Faced with a form that agrees with his own practice, he copies it as it is. Faced with an exotic form, he varies between translating it and copying it as it stands. Of the set of vowels here considered, the two exotic forms in the Old-English-based system of the exemplar would be <ae> and <ea>; in the scribe's own written system, the words containing these would have <a>. However, the scribe would be used to the letter <ae> in a different system, in Latin, where it was equivalent to <e>, not <a>.

The correspondences between the different systems are shown in Table 2. The Latin correspondence is shown with a stippled line: it is a different kind of relationship from those shown with full lines. Full diagonal lines indicate mergers.

When faced with an <a> spelling in the exemplar, the scribe naturally writes <a>. Faced with <ae>, he would either translate to <a>, his own spelling of those words, or simply copy <ae>; his usage here might vary considerably. Occasionally, the Latin practice would leak in and he would substitute <e>.

When faced with <ea>, much the same would happen, except that there would be no leak of <e>. However, we have postulated some interchange of <ae> and <ea> in the exemplar, as the author's, or earlier scribe's, spoken system would here have clashed with that of the Old English spelling system. This interchange is carried over to the present output, so that <ae> and <e> also appear for OE ea, and <ea> very occasionally appears for OE ae.

Finally, for <e>, the scribe would virtually always write <e>; however, the correspondence between <ae> and <e>, activated by the <ae> spellings in the text, might here leak in an occasional <ae>.

We thus assume an interaction between four systems; it may be noted that all the systems postulated are quite uncontroversial. The main clash takes place between the scribe's own system and that of the exemplar; in addition, there is a secondary, interacting clash between the system of the exemplar and Latin orthography. When we add what must be considered the most likely scribal behaviour, given the time and context, the predicted output corresponds in detail to the variation in Lazamon A (Hand A).
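
The prediction can be checked mechanically with a small sketch. The encoding below is an assumption made for illustration only: the dictionary names are hypothetical, the vowel labels follow the plain transliteration used in this text, and the model yields only the set of spellings expected to occur for each Old English category, not their relative frequencies.

```python
# Spellings assumed for the exemplar, by Old English category.
# <ae> and <ea> interchange because the earlier writer had merged them.
EXEMPLAR_SPELLINGS = {
    "a": {"a"},
    "ae": {"ae", "ea"},
    "ea": {"ea", "ae"},
    "e": {"e"},
}

# The scribe's own written system: a, ae and ea all spelt <a>.
SCRIBE_OWN = {"a": "a", "ae": "a", "ea": "a", "e": "e"}


def outputs_for_spelling(exemplar_spelling, oe_category):
    """Spellings the scribe may write for one exemplar spelling."""
    # He either copies the form or translates it to his own usage.
    out = {exemplar_spelling, SCRIBE_OWN[oe_category]}
    # Latin habit: <ae> is equivalent to <e>, and the correspondence
    # may occasionally leak in either direction.
    if exemplar_spelling == "ae":
        out.add("e")
    elif exemplar_spelling == "e":
        out.add("ae")
    return out


def predicted_outputs(oe_category):
    """Union of possible outputs over all exemplar spellings."""
    result = set()
    for spelling in EXEMPLAR_SPELLINGS[oe_category]:
        result |= outputs_for_spelling(spelling, oe_category)
    return result


for category in ("a", "ae", "ea", "e"):
    print(category, sorted(predicted_outputs(category)))
```

Under these assumptions the sets for OE a, ae and e agree with the output column of Table 2; how often each variant is chosen depends on the scribe's copying habits, which the sketch does not attempt to model.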

This approach is clearly very different from that which looks at the text as a transcription of speech. It has, it is felt, the advantage of being able to cope both with complex explanations and the interaction between writing and speech. However, the present example is far too limited for other than tentative conclusions, and much further work, both on Lazamon and on other texts, will be needed in order to refine the methodology.

The final question is how far the above explanation of the variation in Lazamon A (Hand A) can shed light on the phonology. Firstly, the regularity of the pattern suggests strongly that the writer responsible for the earlier layer, perhaps Lazamon himself, had retained a full three-way distinction between /a/, /ae/ and /e/. If the first two vowels had merged in his spoken system, it would be difficult to imagine that he could have retained the Old English spelling distinctions as regularly as he did. The key here is the regularity of the a set, including a from retraction before l-groups: if the <ae> spellings were simply archaising "ye olde" signs, as Stanley (1969: 28) would seem to imply, we would certainly expect them to turn up in these categories as well. This conclusion does not seem unreasonable, given that Lazamon must have been roughly contemporary with the scribe of Ancrene wisse, who also kept the categories apart. (16)

In addition, it may be concluded that there is no reason to assume that Scribe A had anything but a stable short vowel system where OE a, ae and ea were merged, or that the variable orthography is anything but a product of copying. Finally, the textually underlying written system, perhaps Lazamon's own, does not appear to have shown any sign of the so-called second fronting; if it had, the resulting pattern would have looked quite different. Of course, precisely how the scribe, or La3amon himself, pronounced these vowels, we shall never know.

Table 1

Spellings for vowels corresponding to West Saxon short a (except before nasals), ae, ea and e in La3amon A (Hand A). Analysis based on data from LAEME database.

Main forms

Reference vowel (West Saxon)    Spelling   Types   Tokens
WS a (except before nasals)     <a>           36      131
WS ea + lC                      <a>           24      206
WS ae                           <a>           41      283 (a)
                                <ae>          24       62 (b)
                                <e>           19      354 (c)
WS ea                           <a>           13       16
                                <ae>           7       10
                                <a>            9       12
                                <e>            8       13
WS e                            <e>           81      262

Residual forms

Reference vowel (West Saxon)    Residual forms
WS a (except before nasals)     1 × staeoeli (OE
WS ea + lC                      1 × ael (WS eall); 1 × tolde, tolden (WS
WS ae                           1 × bear, eafter, peal, neafde (OE baer, aefter, paell,
                                1 × weos (OE weas)
WS ea
WS e                            1 × maeste-cun (OE mete-cynn ?maete-); 3 × aeft (OE eft, associated with

(a) Types (inflectional forms not listed separately): abbe, aoel-, after, bar, craft(-), fader, faste, garsum(-), habbe, hafo, hafde, hauede, nafo, pal, vaste, vastliche, wal-kempen, wan, was (64), wat, water(-), pat

(b) Types (as in a): aeol(-), aefter, -faest, gaersume, haefde(-), paelles, waeles, wael-, waes (8), waetere, paet

(c) Types (as in a): creft(-), gersume, hefde(-), hefede, heuede, iber, wet, whet, pet

Table 2

The interaction of spoken and written systems behind the spelling variation in the unrounded low and mid vowel set in La3amon A (Hand A).

System 1        System 2           System 3           System 4
OE spelling     Author/exemplar    Author/exemplar    Scribe's
(presumed       spoken system      written system     own system

a               /a/                a                  a
ae              /ae/               ae(ea)
e               /e/                e                  e

System 1        Output
OE spelling     MS

a               a
ae              a, (ae), ((e)), ((ea))
e               e, ((ae))

(1.) The Middle English Grammar Project is a collaborative undertaking ongoing in Glasgow and Stavanger. Its immediate goal is to produce a descriptive account of the orthography, phonology and morphology of Middle English, based partly on the materials collected by LALME, partly on data collected especially for the purpose and entered onto a database. The first instalments of the Grammar will, it is hoped, be published within the next two or three years, and will include a Catalogue of Sources and an electronic Corpus of the text samples used in the database.

(2.) For a classic statement of this standpoint, see Tolkien (1929: 104-105).

(3.) The Linguistic Atlas of Early Middle English (LAEME, ongoing work) forms, together with the Linguistic Atlas of Older Scots, the Edinburgh Medieval Atlas Projects, ongoing at the Institute for Historical Dialectology, University of Edinburgh.

(4.) McIntosh (1956 [1989], 1963 [1989]).

(5.) The period covered by LALME is 1325-1425 for the South and 1350-1450 for the North; however, a few earlier and later texts are included.

(6.) The English translation is from Stanley (1988: 311). The original reads: "Andererseits treten schon in der ersten Hälfte des zwölften Jahrhunderts Aufzeichnungen hervor, die mit der alten Tradition brechen. Diese Richtung drang schließlich durch und man schrieb wie man sprach: von da an bis tief ins vierzehnte Jahrhundert bewegen sich alle englischen Schriftwerke in lokalen Mundarten".

(10.) The word "writer" is henceforth used as a cover term for either author or scribe, when specifying one or the other is not feasible or desirable.

(11.) The concise wording given here is based on my understanding of the introductory discussion by Sgall (1987: 1-6).

(12.) For a full account of the various scribal strategies, see Benskin and Laing (1981).

(13.) See note 1.

(14.) If linguistic variation is generally supposed to be orderly, a Mischsprache cannot be expected to consist of an entirely random mixture of forms. However, the practical definition of a Mischsprache will be a language that contains variation of such a complexity that it, for working purposes, might as well be random.

(15.) Stanley (1969: 30). For a study of the Tremulous Hand of Worcester, see Franzen (1991).

(16.) In an earlier paper, I have suggested that the scribe of Ancrene wisse, surviving in Cambridge, Corpus Christi College 402, may have been the only person who, in all detail, wrote the so-called "AB-language"; see Black (1999).


