# Verbs and Numbers: A Study of the Frequencies of the Hebrew Verbal Tense Forms in the Books of Samuel, Kings, and Chronicles.

This monograph presents a statistical study of the frequencies of
verbal tense forms (infinitives absolute and construct, participle,
imperative, suffix conjugation, and prefix conjugation) in Samuel,
Kings, and Chronicles, with the objective of characterizing Late
Biblical Hebrew (LBH). The study uses the Massoretic Text, in spite of
acknowledged textual complexities, to avoid the subjectivities inherent
in an eclectic base. While at other points the work is overly
conservative, this decision should be applauded. The variants most
critical to this study are those that affect the tense forms, and it is
difficult to see what criteria could be used to prefer one variant over
another except those of the sort being tested by the study. Thus, any
departure from an externally defined text (such as MT) runs a serious
risk of circular reasoning.

Many factors other than diachronic change might condition the frequency of verbal forms. The study considers some of these: narrative vs. discourse; lists (1 Chron. 1-9, 23-27) vs. non-list material; synoptic vs. unique passages. Other important distinctions (such as poetry vs. prose, or different linguistic registers) are unfortunately not noted. Furthermore, in spite of the rich repertoire of functions of the Hebrew verb, only morphosyntactic features detectable by a computer program have been taken into account, in an attempt to avoid the subjectivity of human analysis.

While subjective classifications are to be avoided, it is a mistake to identify "objective" with "computer-recognizable." After forty years of research by the Artificial Intelligence community into computer processing of natural language, there is still no program capable of general language understanding, and many experts are concluding that the processing of human language intrinsically requires human intervention. Many day-to-day verbal transactions that we would consider perfectly clear and objective remain inaccessible to computer analysis. The lesson of this experience for computer-assisted biblical studies is that we should restrict the computer to its areas of strength (including tabulating, calculating, and displaying), and not permit a commendable desire for objectivity to drive us out of our own particular competencies (recognizing patterns in natural language text).

The study reaches some straightforward conclusions. Samuel and Kings resemble one another and differ from Chronicles in using more verbs overall and fewer infinitives construct and participles. The frequency of imperatives is relatively constant across the three books. The frequencies of the prefix and suffix conjugations differ among all three. In general, the differences among books are most accentuated in narrative text, while discursive passages show similar usages in all three books. Overall, Verheij concludes that the language of Chronicles is indeed different from that of Samuel and Kings, but that the relation of Samuel and Kings to one another is unclear, and the homogeneity of Early Biblical Hebrew is open to question.

The kinds of statistical analysis used in the study are of great importance for philological investigations, but biblical scholars are not generally trained in their use, so it is important to ask how well Verheij uses these tools.

The study avoids some common errors. For example, the proportion of verbs of a given form in a text is computed with respect to the total number of verbs, not the total number of words, a precaution that recognizes that the books under study differ in the overall percentage of verbs that they contain.
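The effect of this precaution can be seen in a minimal sketch. The counts below are hypothetical and chosen only for illustration; they are not taken from the study, and the names `books` and `rates` are likewise invented here. Two books use participles equally often among their verbs, but one book has half the verb density of the other.

```python
# Hypothetical counts, for illustration only (not figures from the study).
books = {
    "Book A": {"words": 10_000, "verbs": 2_000, "participles": 200},
    "Book B": {"words": 10_000, "verbs": 1_000, "participles": 100},
}


def rates(counts):
    """Return (participles per word, participles per verb) for one book."""
    per_word = counts["participles"] / counts["words"]
    per_verb = counts["participles"] / counts["verbs"]
    return per_word, per_verb


for name, counts in books.items():
    per_word, per_verb = rates(counts)
    print(f"{name}: {per_word:.1%} of words, {per_verb:.1%} of verbs")
```

Measured against all words, Book A appears to use participles twice as often as Book B. Measured against verbs, the two are identical, so the apparent difference reflects only the difference in verb density.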

The main statistical technique used is chi-squared analysis, which permits the analyst to determine the significance of the deviations of a set of numbers from their expected values. For example, 15.8% of the words in Samuel, Kings, and Chronicles, taken together, are verbs. Even if all three books represented the same dialect of Hebrew, one would not expect the percentage for each of them to be exactly 15.8%. Some variability is to be expected, but how much should be allowed before we become suspicious? In fact, the percentages for Samuel, Kings, and Chronicles are 18.2%, 16.9%, and 12.3%, respectively. The chi-squared test shows that this degree of variation is unlikely to be the result of chance, and we should indeed conclude that Chronicles is significantly sparser in verbs than Kings or Samuel.
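The kind of calculation involved can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of Verheij's computation: the word totals per book are hypothetical (the review does not give them), and only the three percentages come from the figures quoted above. The helper `chi_squared` is written for this example. It computes the Pearson statistic for a table of verbs versus non-verbs by book, with expected values estimated from the pooled data.

```python
import math

# ASSUMPTION: hypothetical word totals per book, not figures from the study.
totals = {"Samuel": 24_000, "Kings": 25_000, "Chronicles": 24_000}
# Percentages of verbs per book, as quoted in the text.
verb_share = {"Samuel": 0.182, "Kings": 0.169, "Chronicles": 0.123}


def chi_squared(table):
    """Return the Pearson chi-squared statistic and degrees of freedom."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    grand = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if book and verb/non-verb were unrelated.
            expected = row_sums[i] * col_sums[j] / grand
            stat += (observed - expected) ** 2 / expected
    # Expected values are estimated from the data, so dof = (rows-1)*(cols-1).
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# One row per book: [verbs, non-verbs].
table = []
for book, n in totals.items():
    verbs = round(n * verb_share[book])
    table.append([verbs, n - verbs])

stat, dof = chi_squared(table)
# For exactly 2 degrees of freedom the tail probability is exp(-x/2).
p_value = math.exp(-stat / 2)
print(f"chi-squared = {stat:.1f}, dof = {dof}, p = {p_value:.3g}")
```

With totals of this order, the statistic lies far above 5.99, the conventional 5% threshold for two degrees of freedom. The exact value depends on the assumed totals, which is why real counts rather than percentages are needed to reproduce any published test.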

Like any tool, a statistical test can be applied with great skill and with sensitivity to its capabilities and limitations, or in a cookbook fashion that treats it as a black box and assumes that if one drops in a set of numbers, the results will be meaningful and reliable. Some aspects of Verheij's use of chi-squared suggest the latter approach.

A fundamental assumption of the chi-squared test is that the cases one is studying are independent of one another. Independence might be violated by the Chronicler's quoting from Samuel and Kings, leading to unusually high similarities between the books (or to unusually high differences, if the editor is following a stylistic canon of variability to avoid wooden repetition). A more subtle problem of independence arises in certain configurations of data, where increasing the size of one cell tends to decrease the size of another. The study could profit from more discussion of the independence assumption throughout the analysis.

Closely related to the notion of independence is that of "degrees of freedom." If the variables are independent of one another and are not used in determining the expected values from which their variation is being measured, the number of degrees of freedom of the system is just the number of variables. However, if the data themselves are used to estimate the expected values, the degrees of freedom will be fewer than the number of variables. The estimate one gives for the significance of a set of variations depends not only on the magnitude of those variations, but also on the degrees of freedom of the problem, and for this reason each chi-squared analysis should specify the degrees of freedom used. Verheij does not even mention degrees of freedom, much less give the appropriate values for the numerous analyses offered in the study.
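The dependence of significance on degrees of freedom can be illustrated with a short sketch. The statistic used here is arbitrary and is not a value from the study. The function `chi2_tail_even_dof` is a helper written for this example. It relies on a closed form for the chi-squared tail probability that holds only for an even number of degrees of freedom, so a general analysis would need a statistical library instead.

```python
import math


def chi2_tail_even_dof(stat, dof):
    """Tail probability P(X >= stat) for chi-squared with even dof.

    Uses exp(-x/2) * sum_{k < dof/2} (x/2)^k / k!, which is valid only
    when dof is a positive even integer.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this sketch handles positive even dof only")
    half = stat / 2
    terms = (half ** k / math.factorial(k) for k in range(dof // 2))
    return math.exp(-half) * sum(terms)


stat = 9.0  # arbitrary illustrative statistic, not a value from the study
for dof in (2, 4, 6):
    print(f"dof = {dof}: p = {chi2_tail_even_dof(stat, dof):.3f}")
```

The same statistic is significant at the 5% level with 2 degrees of freedom but not with 4 or 6, so a chi-squared value reported without its degrees of freedom cannot be evaluated by the reader.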

A frequent side effect of applying a technique mechanically, without consideration for its abilities and limitations, is a corresponding underutilization of other techniques. Chi-squared is a valuable test, but by no means the only technique applicable to Verheij's data. The study would be more helpful if it exposed readers to a variety of analyses, particularly some that take advantage of the mind's ability to recognize patterns and deviations in graphical displays.

In sum, Verheij has provided a useful analysis of an important aspect of the historical grammar of biblical Hebrew, but it is one that should be extended by a sensitive application of a wider variety of statistical techniques.

