Genericity from a cross-linguistic perspective (1).

Abstract

In this article, I will investigate genericity from a cross-linguistic perspective. After discussing some methodological and theoretical problems involved in this task, I will present a multidimensional approach, arguing that it is necessary to factor apart different types of information involved in the notion of genericity. In the empirical part, generic marking and interpretation in the nominal domain will be analyzed in terms of this approach for five languages: English, German, French, Hungarian, and Greek. As a result, I will claim a fundamental typological difference between QUALITY-marking languages (English) and DISCOURSE REFERENT-marking languages (French, Hungarian, Greek), with German representing a mixed type between the two.

1. Introduction

Generic statements express generalizations about kinds. A classic generic sentence contains a kind-referring noun phrase as its topic and a characterizing predicate which expresses a time-stable and prototypical (but not necessarily essential) property of the topic. This may be illustrated with the English sentences in (1).

(1) a. The boa constrictor is very dangerous.

b. Boa constrictors are very dangerous.

c. A boa constrictor is very dangerous.

However, different languages employ different grammatical devices for expressing generic meaning and they also make use of different grammatical and semantic or pragmatic cues which may contribute to the interpretation of a certain sentence as generic rather than episodic. Nevertheless, there are a number of properties which seem to hold cross-linguistically in the domain of genericity. Let us begin with these properties.

First, natural languages typically exhibit more than one noun phrase construction in order to make a generic statement. In particular, we usually find several phrasal types which may be interpreted as referring to kinds. (2) In English, for instance, it is possible to construct a kind-referring phrase (in the case of count nouns) as a definite singular phrase (the boa constrictor), as an indefinite singular phrase (a boa constrictor), or as a bare plural phrase (boa constrictors). When languages allow different (formal) types of generic noun phrases, these types are typically partially synonymous, as in English. That is to say, it is never the case that all different generic types permitted in a language would be intersubstitutable in all possible (generic) contexts. However, we always find some contexts which tolerate the intersubstitution of two or more different generic constructions without significant change of meaning. Strong complementarity thus seems to be the exception. Moreover, in generic discourse, it is possible in many languages to switch between different generic constructions and still refer to the same kind, namely the topic of the generic text in question (cf. Link 1995; Chur 1993).

Second, it is probably very rarely found in the languages of the world that, in noun phrases, generic meaning is encoded in a unique and unambiguous way by the use of exclusively generic forms (e.g. by a generic article). (3) Rather, generic interpretation typically results from the interaction of a number of variable factors such as the lexical semantics of the constituting elements, pragmatic knowledge and discourse situation, grammatical marking of determination and quantification on the noun phrases, grammatical marking of tense, aspect, and mood on the predicates, syntactic position of the noun phrases, and so on. Here, the relevant grammatical elements themselves (e.g. determiners, quantifiers, number and aspect markers) are, as a rule, systematically ambivalent with respect to generic and nongeneric interpretations. Consequently, we normally observe constructional ambiguity on the phrasal level. (4) In English, for instance, each of the above-mentioned three types of generic noun phrases is ambiguous in the sense that it may be associated with a generic interpretation and one or more nongeneric interpretations: in addition to the generic reading, the definite singular phrase is associated with a specific/definite reading, the indefinite singular phrase with a specific/indefinite and a nonspecific/indefinite ("attributive" or "narrow-scope") reading, and the bare plural phrase may likewise carry a specific/indefinite and a nonspecific/indefinite interpretation (cf. Section 2). What is more, it seems to be typical that constructional ambiguity on the phrasal level is not necessarily eliminated during sentence composition but is retained on the sentence level. Therefore, it happens in many languages that particular sentences, when looked at in isolation, out of context, admit both a generic and a nongeneric interpretation. This is clearly the case with sentences containing a definite singular phrase in English (such as [1a]), whereas the corresponding sentences containing an indefinite singular or a bare plural phrase ([1c] and [1b]) are normally felt to be confined to a generic interpretation.

A third cross-linguistically common property of genericity follows from the fact that alternative generic constructions are often partly synonymous and the fact that there are usually no grammatical markers exclusively used to mark generic meaning. The semantic values of those grammatical elements which are involved in marking genericity considerably differ when they are used in generic and in nongeneric sentences. This is particularly obvious with determiners and number. Consider, for instance, English. Whatever the semantic difference between a definite singular phrase (e.g. the boa constrictor) and an indefinite singular phrase (a boa constrictor) or between a singular and a plural phrase (e.g. a boa constrictor vs. boa constrictors) in generic sentences is, it is clearly not identical with the corresponding semantic differences found in nongeneric ("particular") sentences. (5) Accordingly, it is only in generic sentences that it is possible to vary the determiner or number in some contexts without drastically altering the meaning or to switch between definite and indefinite or between singular and plural forms in a text while continuing to talk about the same discourse referent. (6)

We will now turn to the cross-linguistic differences in encoding and decoding generic meaning.

First, languages differ in their constructional potential for marking genericity. This is trivial inasmuch as different marking devices may immediately follow from more general grammatical patterns of a language. Of course, in an articleless language we will only find "bare phrases" in the set of possible generic phrases. When, however, an articleless language morphologically conflates referential marking and role marking, the morphological case of a phrase will probably be a relevant feature in generic marking. In Finnish, for instance, first (i.e. "proto-agentive") arguments are regularly marked as nominative (rather than as partitive) under generic interpretation, and second (i.e. "proto-patientive") arguments as partitive (rather than as accusative). Some East-Asian languages such as Korean or Tagalog employ topic-marking elements with generic phrases, while in Vietnamese, for instance, some types of generic phrases contain a classifier as opposed to classifierless generics (for genericity in Finnish, Tagalog, and Vietnamese, cf. Behrens 2000).

Second, languages may differ considerably with respect to the status of their generic constructions, even in cases where they exhibit a very similar constructional potential. In particular, the observable differences concern the relative frequency of the corresponding types of noun phrases and the degree of contextual restrictions they are subject to when used generically. In connection with this, significant differences may also be found in the degree of markedness associated with the generic interpretation of corresponding types of noun phrases in different languages. In several European languages such as French or (Modern) Greek, the definite plural phrase is by far the most frequently used type of generic phrase. In contrast to this, it is much less frequent and may only be used under highly restricted conditions as kind-referring in other languages, such as English (cf. Section 5.2.3). Besides this, definite noun phrases in general (i.e. including singular phrases) exhibit a much stronger bias toward nongeneric interpretation in a language such as English than in languages in which definite marking is the dominating device for reference to kinds (as in French, Greek, etc.).

Third, the generic context types in which corresponding constructions are allowed to occur are also subject to cross-linguistic variation. For instance, in the case of descriptive generalizations in terms of prototypical properties (such as "being dangerous" for boa constrictors), English, German, and French may use the indefinite singular in addition to the other constructions. In contrast to this, only a definite phrase (singular or plural) is felicitous in this type of generic context in Hungarian or Greek, although both languages tolerate indefinite singulars in other generic contexts (cf. [21], and Section 5.2.4.4). This may be illustrated with example (2), which contains roughly equivalent generic statements about the kinds "boa constrictor" and "elephant" in the five languages in question. The sentences are taken from the novel Le petit prince by Antoine de Saint-Exupéry. (7) The French sentence given in (2c) is the original, the others are translations thereof. (8)

(2) a. A boa constrictor [IND, SG] is a very dangerous creature, and an elephant [IND, SG] is very cumbersome.

b. GERMAN: Eine Riesenschlange [IND, SG] ist sehr gefährlich, und ein Elefant [IND, SG] braucht viel Platz.

c. FRENCH: Un boa c'est [IND, SG] [TOPIC] très dangereux, et un éléphant c'est [IND, SG] [TOPIC] très encombrant.

d. HUNGARIAN: Az óriáskígyó [DEF, SG] nagyon veszélyes, az elefánt [DEF, SG] roppant terjedelmes.

e. GREEK: O voas [DEF, SG] ine tromera epikindhinos ki o elefandas [DEF, SG] arketa enoxlitikos.

Concerning this example, the following should be noted: French does not simply employ an indefinite singular phrase in the present context, but a construction which marks the indefinite phrase explicitly as the topic of the sentence (x, c'est ...). In this way, a generic interpretation is more strongly forced and set off from a corresponding nongeneric reading, in which properties would be ascribed to a specific, existing boa constrictor and a specific, existing elephant. As far as Hungarian and Greek are concerned, however, avoiding ambiguity with the specific/indefinite reading cannot count as a relevant factor for using a definite article under generic interpretation. There is one pragmatically plausible interpretation in which a specific/indefinite subject (topic) may be characterized by an "individual-level" (cf. Carlson 1977b) predicate such as "being dangerous": the group, whose members the indefinite subjects are, is itself definite (e.g. a definite group of animals in the zoo). Under this condition, Hungarian would necessarily and Greek preferably use a "definite-indefinite" construction (i.e. a construction which contains both a definite article and an indefinite one: Hung. az egyik x, Greek o enas x) rather than a simple indefinite phrase (i.e. one containing only an indefinite article).

Finally, I would like to draw attention to cross-linguistic differences in patterns of ambiguity involved in marking genericity. As mentioned above, the ambiguity between generic and nongeneric readings can be studied on different levels of analysis, namely, on the level of phrasal constructions, on the sentence level, in discourse, etc. Certain (formal) types of noun phrases which look alike at first glance may exhibit different ambiguity patterns. For instance, a Finnish plural phrase in the nominative (e.g. boat; cf. [3b]) looks like an ordinary bare plural phrase in English (e.g. boa constrictors; cf. [3a]). The Finnish phrase is, however, ambiguous between a specific/definite and a kind-referring reading, while the English phrase exhibits an ambiguity between indefinite (specific or nonspecific) and kind-referring readings. When combined with a characterizing predicate, the Finnish phrase regularly yields a sentence which is clearly ambiguous between a generic and a nongeneric statement, while in English, there is a tendency to interpret the resulting sentence only generically. In this way, Finnish "bare plurals" (in the subject position) pattern, with respect to ambiguity, exactly like phrases overtly marked by a definite article in other languages, e.g. in German (cf. [3c]).

(3) a. Boa constrictors [∅ (9), PL] are very dangerous.

b. FINNISH: Boat [PL] [NOMINATIVE] ovat hyvin vaarallisia. ('Boa constrictors as a species or the boa constrictors at hand are very dangerous.')

c. GERMAN: Die Riesenschlangen [DEF, PL] sind sehr gefährlich. (meaning the same as the Finnish sentence)

Or consider the indefinite singular phrase in Hungarian. Taken as a construction, without being filled with lexical material and not standing in a certain sentential context, it displays roughly the same range of referential ambiguity as its English or German counterparts. In contrast to English and German, however, there is no context in which the Hungarian phrase could be interpreted ambiguously (see above).

From a typological point of view, it is especially insightful to cross-linguistically compare the ambiguity patterns of those generic constructions which are used more frequently in a language and have an unmarked status there. Later in this article, I will propose a typological distinction in the marking of generic meaning that is partly based on such dominant ambiguity patterns. In doing so, I will confine myself to the discussion of data from five European languages (English, German, French, Hungarian, and Greek), which are superficially similar in their grammatical potential relevant for marking genericity (i.e. determination is indicated by article distinction rather than by case distinction, etc.).

2. Difficulties in the cross-linguistic comparison of genericity

Cross-linguistic comparison of genericity is burdened with a number of serious theoretical problems. Even though genericity has been given increasing attention in recent years, it still belongs to those areas of linguistics which are poorly understood and highly controversial. There are relatively few languages, such as English and French, for which the description of this phenomenon can look back on a longer tradition. For most languages genericity has been recorded only in a rudimentary way. As such, it is usually briefly touched upon in the discussion of other more general topics such as article systems or classifier usage. The exception proves the rule: for some of the European languages such as German and Greek, there exist recent comprehensive monographs on this topic (cf. Chur 1993; Marmaridou 1984). It should also not go unmentioned that a number of interesting contrastive studies on encoding and interpreting generic noun phrases or on the question of distinguishing between different types of generic sentences have appeared. For example, German and Tardif (1998) compare English and Mandarin, Pease-Gorrissen (1980) English and Spanish, Smolska and Rusiecki (1980) English and Polish, Dayal (1992) English and Hindi, Lee (1992) English and Korean, Casadio and Orlandini (1991) English and Latin, and Matthews and Pacioni (1997) even compare Cantonese and Mandarin.

Most of these works are concerned with the comparison of a particular language with English, doing so on the basis of theoretical proposals specifically developed using data from English. This reflects the extraordinary dominance of English as the one natural language whose specific characteristics influence theoretical discussion far more strongly than the characteristics of any other language in the world. It is not my intention at this point to indulge in an extensive criticism of the widespread practice of choosing English as a sort of reference language for the presentation of data and theoretical problems, given that I will not entirely refrain from this practice either. Nevertheless, I consider it appropriate here to point to some obvious risks inherent in this practice.

First, it should be stressed that, as far as genericity in English is concerned, there is no agreement at all among linguists as to how to deal with it. In this context, Jacobsson's (1998) short but very much to-the-point overview of the enormous diversity of approaches is valuable. It would not be an exaggeration to say that there are divergent opinions on practically every fundamental question such as: How does reference work in the generic domain? How broadly should the notion of "generic sentence" be defined (e.g. should it include habitual sentences which express characteristic properties of particular individuals)? What is the semantic difference between the different linguistic realizations of generic reference (if there is any difference) and how should one distinguish between generic terms proper and related, generic-like expressions (cf. Note 2)? (10)

Though some of the controversy in the treatment of English generics only concerns technical issues of representation, much of it is linguistically motivated. A great deal of the arguments adduced in favor of or against a certain theoretical claim are based on rather fine-grained semantic and syntactic analyses of English data. As always in typological studies, the question thus arises: how can we make use of such language-specific arguments in the investigation of other languages which crucially differ from English with regard to relevant factors such as (a) their article systems (perhaps no article system at all) and number marking, and (b) the set of construction types used for encoding kinds and the patterns of ambiguity and synonymy characterizing them? Carlson (1989: 167), for instance, makes explicit that he would like to confine his proposals to the analysis of English generic sentences, though in the hope that what he says about English "will shed light on similar constructions in a wider range of natural languages." However, what does "similar constructions" mean in such a case: constructions with a similar meaning (e.g. generic meaning) or constructions with a similar form (e.g. bare plurals)?

Before going into some details of the typological preconditions underlying Carlson's approach, I would like to emphasize one crucial point. The problem I am addressing is only in part identical with the epistemological and terminological difficulties which usually arise in the context of typological studies when language-specific manifestations of universal categories have to be defined (cf. Croft 1990: 11ff.; Wierzbicka 1995: 181).

There is general agreement in the typological community that "surface" forms or constructions are not the right kind of tertium comparationis, simply because most of them are not shared by all languages and because superficially similar constructions may have different meanings by virtue of their being integrated in entirely different sets of distinctions. In this sense, a construction such as "bare plural" does not provide a suitable basis for cross-linguistic comparison. This was illustrated with data from English and Finnish in the preceding section (cf. [3]). On the other hand, typologists also seem to agree that homogeneous and very abstract meanings, at which linguists normally arrive at the end of their semantic analyses of language-specific categories, are not unproblematic either when taken as the starting point for cross-linguistic research. Meaning components that form a homogeneous unity in one language may be distributed over a number of distinct linguistic manifestations in another.

Consider, for instance, the case of "indefinite generics." When "indefinite generics" are defined according to formal criteria and understood as a term referring to noun phrases that contain an indefinite article and, as such, receive a generic interpretation (of whatever kind), cross-linguistic investigations very soon lead to the result that many languages do not have "indefinite generics" quite simply because they do not possess an indefinite article, let alone one which could occur in generic contexts. If, instead, one proceeds from an abstract core meaning that generic phrases which are formally indefinite have in certain well-described languages, one will probably run into another difficulty: the reconstructed meaning will not match the generic meaning that formally indefinite phrases display in other languages. Or the proposed meaning description will be far too abstract and will thus fail to account for significant contextual differences in the generic use of indefinitely marked phrases in different languages, for example for the difference mentioned above between English and German on the one hand and Hungarian and Greek on the other hand (cf. Section 1). (11) The common strategy in typological studies for coping with such problems is to try to set apart the constitutive semantic factors in a multifactoring analysis. (12)

Difficulties in cross-linguistic investigation of genericity thus remind us of familiar problems in the cross-linguistic identification of universal categories. In the realm of genericity, however, matters are by far more complicated chiefly for two reasons. First, genericity is usually manifested as a covert category rather than as an overt one expressed by unambiguous grammatical markers. As stressed above, ambiguities and interpretational uncertainties are a common feature of generic expressions around the world. Second, the semantic features which may be thought of as constitutive for the overall meaning of generic expressions such as the notions of (in)definiteness and (non)specificity have been commonly studied in the area of particular sentences which express spatiotemporally fixed situations or at least situations which involve existing individuals. This particularly applies to theories of definiteness, which are typically called "article theories" and whose quality is generally rated according to their capability of capturing the major usage types of the definite article in nongeneric sentences in a particular language. To this day it seems barely understood how to apply the notions of definiteness and specificity to the semantic interpretation of generically used expressions in a language-independent way, that is, independently of the actual devices employed under generic interpretation in particular languages.

In any event, it is striking that there are languages for which the hypothesis that generics are related to indefinites or even constitute a variant of indefinites fits better than for others. Then, there are languages that rather suggest an affinity between generics and definites. It hardly needs to be mentioned that in English, where the definite article (when compared with languages such as French or Greek) is comparatively seldom encountered in noun phrases to be interpreted generically and where zero determination is the most prominent and most frequent type of device for encoding generic meaning, the hypothesis of a relationship between generics and indefinites has especially many adherents (for statistical data cf. Figure 1 and Figure 2).

[FIGURES 1-2 OMITTED]

According to a common assumption (cf. Burton-Roberts 1977: 187), bare plurals as subjects of characterizing predicates simply have a "nonspecific" reference. That is to say, beavers and dams in Beavers build dams are to be treated on a par, both being nonspecifics. Declerck (1991) goes a step further, proclaiming homogeneous reference for all variants of bare plurals, including such cases where existence presupposition is involved. In addition to the semantic variation that may be observed between the interpretations of beavers (as subjects) and dams (as objects) in a characterizing sentence, he also considers the semantic variation between the two object-occurrences of actresses in sentences such as Betty hates actresses and Betty personally knows actresses (in the reading: 'some actresses') as a pragmatically guided variation: it is claimed that the bare plural may, in principle, either refer to the whole set (the "generic set") or to any subset thereof. At first glance one might also rank Carlson's (1977a) proposal for a unified treatment of the generic reading and the indefinite/existential reading of English bare plurals among the group of similar approaches. However, Carlson generally regards bare plural noun phrases as names of kinds and, in doing so, ultimately implicates an affinity between generics and definites as well. (13)

The basic tenet of Carlson's argumentation can be briefly summarized as follows. The null determiner is not to be regarded as the plural counterpart of the indefinite article. One piece of evidence used to support this assumption is the observation that in opacity-inducing contexts (e.g. occurring as objects of verbs such as seek), bare plural phrases do not really exhibit an ambiguity between a transparent ("wide-scope" or "existential") and an opaque ("narrow-scope" or "nonspecific") reading, in contrast to singular phrases containing an indefinite article. Rather, bare plurals tend to take on only an opaque reading in this context. Moreover, the existential reading and the kind-referring reading are typically in complementary distribution: with episodic predicates expressing a spatiotemporally bounded event (stage-level predicates), bare plural subjects will normally be interpreted in the existential sense, whereas with spatiotemporally unbounded predicates expressing properties (individual-level predicates), they are normally interpreted as referring to kinds. According to the well-known structuralist reasoning, complementary contexts correlating with interpretational differences of a certain form may be used as an argument that the form is not ambiguous. In this sense Carlson argues for assuming the context of the sentence to give rise to a distinction between indefinite/existential and generic interpretations, rather than treating the bare plural phrase itself as ambiguous.

Theories requiring a unified account of the meanings of the English bare plural claim that two different semantic distinctions are collapsed in a homogeneous meaning in English, each of which may be systematically kept apart by distinct forms in a number of languages, namely: (a) the distinction between a "specific," existence-presupposing interpretation and a "nonspecific" interpretation (which does not strongly presuppose the actual existence and/or identity of particular individuals) and (b) the distinction between a "nonspecific" interpretation and an interpretation which explicitly appeals to the class as a unit, and not only to its members. In doing so, such theories are crucially based on certain typological characteristics of English and similar languages.

First, English does not have a formal distinction between "specific" and "nonspecific" interpretations which would be marked on nominal expressions. Singular count nouns are, as a rule, required to be marked by a determiner under both interpretations, and the semantic distinction appears to be obvious only in opaque contexts. In the case of plural count nouns, there is likewise only a certain statistical tendency towards a correlation between the presence vs. absence of determiners such as some and the choice of a "specific" vs. "nonspecific" interpretation. As a net result, the situation in English readily conveys the impression that spatiotemporally bounded, episodic predicates would a priori force a "specific" ("wide-scope") interpretation and that spatiotemporally unbounded predicates, especially habitual predicates, would in their turn have an affinity with "nonspecific" ("narrow-scope") interpretations. However, there are a number of languages (Hungarian and Greek, among others) that systematically make a formal distinction between specific and nonspecific noun phrases, which applies orthogonally to semantic distinctions such as spatiotemporal boundedness vs. unboundedness or episodic vs. habitual interpretations of the predicate (cf. Behrens and Sasse 2003).

Second, English does not employ a systematic formal distinction between two possible cases involving unspecific members of a class: the case in which the noun phrases in question are only used to modify some general activity and the case in which the noun phrases in question serve as the topic of a characterizing statement. In the latter case, the fact that the unspecific members together form a class (a "kind") is foregrounded, while in the former case this is not in the focus of interest. That is, in English dams and beavers are as a rule encoded identically (as in Beavers build dams).

A number of languages (e.g. Hungarian, Greek, French), however, do exhibit such a formal distinction, for example, by using a definite article for pointing to kinds and zero (or indefinite, or partitive, etc.) forms for encoding simply nonspecific readings. In many languages, formal marking of this semantic contrast is even allowed for noun phrases which occur as objects of habitual predicates. Consider examples (4a) and (4b) from French (cf. Bennett 1977). Both sentences are ambiguous between a habitual and a nonhabitual interpretation, the habitual interpretation permitting a choice between the definite form (les pommes) and the partitive form (des pommes).

(4) a. Jeanne mange les pommes. 'Jeanne eats apples.' (habitual), 'Jeanne is eating the apples.' (nonhabitual)

b. Jeanne mange des pommes. 'Jeanne eats apples.' (habitual), 'Jeanne is eating apples.' (nonhabitual)

Such differences between languages demonstrate the inadequacy of simply adopting, in cross-linguistic studies, semantic analyses that have been developed on the basis of the typological conditions holding for certain specific languages. In particular, it is to be expected that alternations within habitual sentences such as in (4) will remain enigmatic in those theories of genericity which, proceeding from the formal unity of English constructions such as the bare plural phrase, claim that the nonspecific/generic distinction may be accounted for in terms of the context-dependent selection of variants of the same meaning, for example, in terms of the pragmatically guided selection of noninclusive vs. inclusive reference (cf. Declerck 1991). There is nothing in the context which would motivate the use of a definite form in contrast to the partitive one or vice versa.

Two further examples are adduced below showing that semantic analyses influenced by language-specific theories may lead to apparently contradictory results. One concerns generics marked by the definite article in French, the other generics marked by a topic marker in East-Asian languages. Investigating the evolution of the article system from Old French to Modern French, Epstein (1994) makes the observation that the range of generic contexts in which a definite article is used has been continuously amplified. He also stresses, however, that nouns with generic reference could be expressed with the definite article as early as in Old French, illustrating this with examples such as (5a) and (5b):

(5) a. Si cum li cerfs s'en vait devant les chiens, ... (La chanson de Roland, ca. 1080) 'As the deer runs from the dogs, ...'

b. La leaute doit l'en toz jorz amer. (Le Charroi de Nimes, ca. 1150) 'One must always love loyalty.'

In this connection, he makes the following remark: "Under the traditional analysis, however, we would expect these nouns to occur with the zero article, since they are not semantically definite" (Epstein 1994: 67) (the noun phrases in question are marked with underlining). But why exactly should they be "not semantically definite"? One possibility is that the notion of definiteness is considered to apply only to particular individuals, identifiable through time and space (e.g. located in what Langacker [1987] calls "basic domains" of reference). If so, it does not make sense to characterize noun phrases occurring in generalizing sentences such as (5) (which abstract from particular situations and individuals) as either "definite" or "nondefinite." That is, the term "definite" would simply not apply in that case. Another possibility for why the cited phrases should not be "definite" is that they are very simply not marked by a definite article in some languages, such as English. However, it should be pointed out that it is only from the perspective of such languages that "we would expect these nouns to occur with the zero article." Approaching the same data from the perspective of other languages such as Hungarian, Greek, Arabic, etc., we would certainly "expect" them to be used with a definite article.

Occasionally the literature also offers examples of genericity approached from the perspective of a language other than English. Lee (1996), for example, advances the hypothesis that generic sentences are topic sentences in which a kind-referring noun phrase is constructed as the topic. He motivates this assumption by citing evidence from languages which have an explicit topic marker, such as Japanese and Korean. In these languages, kind-referring noun phrases are overtly marked by such a topic marker when combined with a characterizing predicate in a generic sentence. Since topics display a strong association to definiteness in that topic noun phrases usually contain a definite determiner (with the exception of proper names) in article languages, Lee makes the following additional claim: the bare-plural form in English is definite when used as a generic noun phrase in topic position.

How independent of the use of definite determiners is semantic definiteness in generic sentences, then? Is the French noun phrase with the definite article "in actual fact" indefinite (so that we should expect zero-marking), or, by the same token, is the English noun phrase without a determiner "in actual fact" definite (so that we should expect a definite article)?

As already indicated earlier in this article (cf. Note 5), there are two opposite heuristic strategies which often collide in the treatment of genericity. The traditional bottom-up strategy starts with the basic grammatical ingredients (article, plural, etc.) and attempts to interpret generic sentences in terms of the semantics assigned to these grammatical elements on the basis of the investigation of nongeneric sentences. Adherents of such an approach usually emphasize, for example, that the speaker using a form with the indefinite article generically has a single instance of a kind in mind that serves as the basis for his generalization. The opposite top-down strategy starts at the generic sentences and tries in the first place to clarify the relation these have to each other. Such an approach, for example, is more likely to stress the fact that the indefinite and the definite articles are substitutable for each other in certain generic contexts, while in nongeneric contexts they are never substitutable. I take it that both approaches have their legitimacy and should be regarded as complementary in language-specific studies. From a typological point of view, however, the importance of the second strategy has to be emphasized. That is, I will tentatively assume that generic sentences which are adequate translations of each other also have a synonymous meaning (but cf. Note 25). Proceeding from this assumption, I will try below to explain why different groups of languages show clearly different types of grammatical markers (e.g. zero determiners vs. definite determiners) when encoding the same or nearly the same meaning.

This does not mean, however, that the semantic insights gained from a detailed analysis of particular constructions in an individual language (e.g. from the analysis of bare plural phrases in English) would have no significance for typological investigations. The observation that the difference between nonspecific and generic meaning is minimal and does not lead to sentence ambiguity when these meanings are realized by the same form (cf. Section 3) strongly suggests the assumption that the cognitive salience of the distinction is fairly small. This, in turn, leads to the expectation that in languages which exhibit distinct marking for nonspecific and generic meanings, this formal distinction will be neutralized in some contexts, for example, in the form of free variation or automatic alternation.

3. Basic assumptions

Before proceeding to the analysis and typological evaluation of the data, I will briefly sketch the basic assumptions and concepts underlying the present approach.

In more recent works on genericity, a distinction has usually been made between "generic noun phrases" (which do not necessarily have to occur in generic sentences) and "generic sentences" (which do not necessarily have to contain a generic noun phrase) (cf. Krifka et al. 1995; Declerck 1991). This is motivated by the following considerations: It is possible to refer to kinds without making any sort of generalization. The noun phrase the potato in (6a), for instance, clearly refers to a kind ("solanum tuberosum"). However, it occurs in a sentence which expresses an episodic event with the first person plural (we) as its subject, rather than in a sentence with a characterizing predication about this particular genus. In this sense, we may say that (6a) is not a generic sentence even though it contains a generic noun phrase. On the other hand, it has also been observed that habitual sentences such as (6b) resemble traditional generic sentences in that they express a typical characteristic of their subjects. A broad definition of the term "generic sentence," which makes it synonymous with the term "characterizing sentence" and thus allows it to generally apply to descriptions of characterizing habits, would automatically qualify (6b) as a generic sentence that lacks a generic noun phrase. (14) A narrow definition of the term "generic sentence," which requires that the characterization expressed by the predicate concern a kind rather than a particular individual, leads, in turn, to the conclusion that sentences such as (6c) and (6d) are nongeneric (cf. Declerck 1991: 97), even though both contain a kind-referring noun phrase and perform a characterization by means of an attitude verb, namely the characterization of the first person subject or object, respectively.

(6) a. Yesterday, we had a very interesting discussion about the potato. (The teacher told us that it was first cultivated in South America....)

b. John smokes a cigar after dinner.

c. I love beavers.

d. The beaver has always fascinated me.

I will adhere to an explicit distinction between generic phrases and generic sentences and will use the term "generic sentence" according to the narrow definition just mentioned. On the other hand, I will use the term "generic phrase" in a rather broad sense, not restricting it by the syntactic position it occupies in the sentence (see [6a] and [6c]) or the type of the characterizing predicate. That is, a noun phrase such as boa constrictors will be regarded as a "generic phrase" both in the environment of kind predicates such as be extinct (which may be asserted only of kinds) and in the environment of predicates such as be dangerous, which may be equally asserted of kinds and their members (cf. Note 2). A novel aspect of my approach is that I will introduce a third level of linguistic description as being relevant for genericity, namely the text level. A generic text comprises generalized knowledge about a particular kind or about a particular stereotype situation. This kind or this situation constitutes the paragraph topic of the generic text in question. In (7), for example, we have an excerpt from a generic text (drawn from the British National Corpus), which deals with the kind "gold."

(7) The recognition of gold as a symbol of excellence might almost seem an integral part of human consciousness.... It owes its unique status to the fact that the people who developed modern science and in many other ways created the modern world community had acknowledged the supremacy of gold since prehistoric times.... The primary appeal of gold as of other precious substances was to the senses.... The softness of gold made it relatively easy to employ for ornamental purposes.... The visual splendour and durability of gold which made it an outstanding symbol of excellence were matched by the fact that however widely distributed and keenly sought in nature it has remained rare.... The predominance of Russia was overtaken during the latter half of the nineteenth century by a succession of gold rushes to more or less remote parts of the world colonized predominantly by the British.... By the first decade of the twentieth century Australia was yielding 230,000 lb of gold a year.

Generic texts have their own peculiar discourse structure. According to my experience with genericity in different languages of the world, discourse structure in generic texts is usually assimilated to a certain extent to the discourse structure found in texts on particular individuals and particular events and facts. For example, in languages possessing an explicit device for definite anaphora (e.g. definite pronouns), this device is put to use for anaphoric reference to kinds basically in the same way as it is employed in texts dealing with specific participants (cf. the occurrences of it in [7]). However, several significant differences between generic and nongeneric texts with respect to reference tracking may also be observed. Thus, in generic texts, there is in general a significantly higher frequency of nominal mentions (instead of pronominalizations) in a sequence of mentions with the same referent. Particularly in languages which (in certain discourse constellations) regularly employ zero anaphora (instead of a definite pronoun) in the nongeneric domain, such as Hungarian or Arabic, this device for signaling reference continuity seems to be significantly more strongly restricted in the generic than in the nongeneric domain. (For further differences between generic and nongeneric discourse structure, see Section 5.2.4.2 below.)

A classic generic sentence, whose sentence topic refers to a kind and whose predicate characterizes this topic, may be uttered in isolation and understood as generic when so uttered. Frequently, however, it is embedded in a generic text. This does not imply that a generic text contains only generic sentences of the classical type or that every mention of a linguistic expression allowing reference to the topic of a generic text is in actual fact to be interpreted as a generic NP. In the text in (7), for example, characteristic properties of the kind "gold" are repeatedly constructed as nominalizations (e.g. softness, durability, visual splendor) that take the kind-referring phrase gold as their genitive attribute. Hence, the knowledge of properties characterizing a kind is presupposed here rather than explicitly predicated. Furthermore, in the case of gold in gold rushes and 230,000 lb of gold, we are not dealing with generic phrases, at least on the traditional interpretation of this term.

In my view, it is a fundamental property of nouns (15) in natural language that they are--as lexical elements--neutral with respect to different modes of reference or nonreference. I thus reject the idea that a certain use of a noun would be linguistically prior to all other possible uses. In particular, I reject the common assumption that those uses which allow reference to particular spatiotemporally bounded objects in the world would have linguistic primacy over kind-referring uses (cf. Langacker 1991) (16) and that, therefore, the latter should be derived from the former (cf. on this point, Carlson 1989: 175). Conversely, I do not assume for any language that nongeneric uses are derived from generic ones as occasionally also suggested in the literature (cf. Krifka 1995: 399). This also pertains to languages in which generic uses are always zero-marked and identical to lexical stems. (17) In other words, I consider generic and nongeneric uses as systematically related alternative uses of lexical nouns, neither of them being derived from the other.

As a prerequisite to the following discussion, a brief excursus on the term "ambiguity" is in order. First of all, I would like to point to a distinction which has been made in Behrens (1998) between "heuristic ambiguity" and "interpretative ambiguity." Heuristic ambiguity is a pretheoretic notion and refers to the fact that the analyzing linguist is able to distinguish between two different semantic interpretations (or "understandings") (18) of a given linguistic form, independently of the question of mental representation. Interpretative ambiguity manifests itself in the way speakers process and judge actual utterances in their language. Cross-linguistic studies in semantics usually operate with heuristic ambiguity. When comparing the linguistic realizations of semantic distinctions across languages and stating that some languages employ distinct forms while others collapse the distinction into a single "ambiguous" form, linguists normally make no commitments to the mental representations of such "ambiguous forms" in the nondistinguishing languages. That is, they leave it open whether these forms are actually interpreted as being ambiguous or only as vague according to usual criteria employed in testing ambiguity in individual languages (cf., however, below).

In my view, however, cross-linguistic research on semantic issues and methods testing ambiguity as a matter of interpretation in individual languages are not mutually exclusive matters. It is desirable at any rate for semantic typologies to also integrate more subtle language-specific findings, even if it has to be noted that extensive testing to prove interpretative ambiguity in all the different languages remains a task for future research. If one proceeds from a language-specific point of view and is interested in how a particular case of (heuristic) ambiguity is processed by native speakers, the best diagnostic tests are what Cruse (1986) called "direct" tests for ambiguity (such as the so-called "identity test" or the negation test), which examine potential ambiguity effects in (sentential) context. I consider positive results of such tests as sufficient evidence of the strong ambiguity of the tested elements, for example, the sentences themselves or their smaller constituents such as phrases or lexical items.

So far I have applied ambiguity tests to only a few languages, in particular German and Hungarian. The results suggest that it makes a considerable difference whether the generic interpretation goes together with (a) a specific/definite, (b) a specific/indefinite, or (c) a nonspecific/indefinite interpretation. The examples in (8) show how the negation test works in these three cases in German:

(8) a. Die Riesenschlange [DEF, SG] ist gefährlich.

i. 'The kind "boa constrictor" is dangerous.'

ii. 'The particular boa constrictor (i.e. the one presently facing us) is dangerous.'

NEG: Die Riesenschlange ist nicht gefährlich.

i. 'The kind "boa constrictor" is not dangerous.'

ii. 'The particular boa constrictor (i.e. the one presently facing us) is not dangerous.'

b. Italiener [∅, PL] handeln mit Zigaretten.

i. 'Italians deal with cigarettes.'

ii. '(Some) Italians are dealing with cigarettes.'

NEG: Italiener handeln nicht mit Zigaretten.

'Italians do not deal with cigarettes.'

c. Ich beschäftige mich mit Blumen [∅, PL].

i. 'I occupy myself with flowers, i.e. I study the kind "flower".'

ii. 'I occupy myself with flowers, i.e. I spend much time with flowers.'

NEG: Ich beschäftige mich nicht mit Blumen.

'I do not occupy myself with flowers.'

In example (8a), the test quite clearly yields a positive result. The negated sentence is twofold ambiguous in the same way with respect to a generic and a specific/definite interpretation of the subject phrase as the asserted sentence. It is therefore possible, without contradiction, to assert the generic meaning while denying the specific/definite meaning or vice versa. Cases such as exemplified by (8b) appear to be less clear. The first problem is that it is not so easy to find sentences which may actually exhibit both interpretations in question, that is, a generic interpretation in one situation and a specific/indefinite interpretation in another situation. This, however, is the precondition for the application of the test. Even if one finds instances such as (8b), speakers normally state that the negated sentence is confined to the generic reading. Therefore, the test only works on one condition: we test whether the asserted sentence in the episodic reading and the negated sentence in the generic reading are contradictory. With those speakers who fulfill the precondition of the test (i.e. accept the specific/indefinite interpretation in the asserted sentence which is weaker than the generic one), we receive positive results (no contradiction). In example (8c) finally, there is a general tendency to obtain a negative result (no ambiguity). Here too, it is not easy but possible to read two interpretations into the asserted sentence (one more kind-evoking interpretation and one that is more likely to be only nonspecific), in analogy to the formal distinction between these two interpretations observed in certain other languages (e.g. Hungarian would use a definite article in the first case and an articleless form in the second). However, this semantic difference entirely disappears in the negated sentence. No matter which interpretation is chosen for the asserted sentence, speakers always judge it contradictory to its negative counterpart.

One might object that ambiguity tests are not really applicable in all these cases since the tested interpretations are "privative opposites" in the terminology of Zwicky and Sadock (1975): there is a more general meaning (i.e. the generic one) which includes the other, more specific meaning (i.e. the nongeneric one). What is interesting, however, is the fact that the test effect described by Zwicky and Sadock as typically occurring with "privative opposites" does not arise with (8a). Generic sentences are generalizations which tolerate exceptions. It seems that exceptions tend to be recognized as such only when they are clearly individuated and perhaps also identifiable. Stating that a particular boa constrictor has different properties from the kind "boa constrictor" is by no means felt to be a contradiction. Quite to the contrary, speakers seem to be reluctant to accept contradictory statements about the properties of nonspecific members of a kind and the properties of the kind itself. Here the members seem automatically to inherit all their properties from the kind. This would add further evidence supporting the hypothesis alluded to above that the linguistic salience of the distinction between a nonspecific and a generic interpretation is perhaps generally low (cf. Section 2 above).

Another type of diagnostics in testing ambiguity are "indirect" tests which make use of system-internal configurations of distinct and neutralized realizations of a semantic distinction. As in the case of lexical-semantic ambiguity, they do not normally provide sufficient evidence but can be used to support the result of stronger tests. Supposing there is a language in which the generic and the nongeneric interpretations systematically correlate with distinct anaphoric pronouns, this could be considered as comparatively good (though not necessarily sufficient) evidence for a distinct mental representation of the concerned interpretations. Two further remarks are necessary in this context. I do not consider the fact that two semantic interpretations tend to occur in complementary contexts (e.g. with complementary types of predicates) as a priori evidence for nonambiguity, since this holds for genuine cases of homonymy as well. For this reason it is always necessary to prove that there are no neutral or ambiguous contexts or that the supposed distinction of meaning actually disappears in neutral/ambiguous contexts.

There is a further albeit relatively problematic criterion, namely the status of the involved grammatical categories (e.g. determiners, number) as obligatory or optional. Gelman and Tardif (1998) suggest that there is an essential difference between English and Mandarin Chinese generic constructions: English constructions permitting both a generic and a nongeneric interpretation are ambiguous in their view, whereas Chinese constructions permitting both interpretations are to be considered as "neutral." Their chief argument for this assumption, however, does not rest on a difference in the results of experiments; rather, they point to the different status of those grammatical categories which are capable of playing a role in the encoding of genericity (determiner, number). These are said to be obligatory in English, optional in Mandarin. Given the same type of (heuristic) ambiguity (e.g. that between a generic and a definite/specific interpretation) in both languages, it is not clear why obligatoriness of number or determiner marking should entail true ambiguity on the one hand, and why the optionality of the corresponding grammatical categories should entail only generality or vagueness on the other. It seems to me that this conclusion is questionable if only for the reason that obligatory and optional marking is explicitly defined by the authors only with respect to nongeneric sentences (Gelman and Tardif 1998: 218). Be this as it may, it remains an empirical question whether, given the same type of semantic distinction (e.g. between specific/definite and generic interpretations), languages may really differ in such a way that "consideration of a NP as generic or not can at times be bypassed" (Gelman and Tardif 1998: 219) in some languages, while it cannot in others. The typological variation with respect to ambiguity addressed in this article concerns only different semantic distinctions.

4. Descriptive framework

As a conceptual framework for describing genericity from a cross-linguistic perspective in this article, a multidimensional approach has been adopted which was first proposed by this author in Behrens (1995) and subsequently refined in a joint work with Hans-Jurgen Sasse (Behrens and Sasse 2003). The basic idea of the typological proposal presented in this article is that genericity is a semantically and discourse-pragmatically complex phenomenon which involves oppositions on different dimensions. Every relevant dimension may be associated with different grammatical devices in different languages and languages vary with respect to the interdependency and hierarchical relation between the dimensions involved. When marking generic phrases, languages do not appeal to the semantic/pragmatic distinctions on every dimension. Rather they pick out certain dimensions and highlight the distinctions on these dimensions as particularly relevant for the concept of genericity by using precisely those grammatical devices for indicating generic meaning which are (language-specifically) associated with the selected dimension. This is a crucial source of typological differences, since, as I will suggest, different languages may pick out different dimensions as relevant.

I will briefly introduce those dimensions relevant in dealing with genericity (cf. [9]). For more detailed information concerning the entire framework and the motivation of the dimensions proposed, readers are referred to Behrens and Sasse (2003).

(9) a. The dimension of propositional function: TOPIC, ATTRIBUTE, PREDICATE

b. The dimension of discourse function: DISCOURSE REFERENT VS. NON-DISCOURSE REFERENT

c. The dimension of individuality: OBJECT VS. QUALITY

d. The dimension of spatiotemporal location: S-T CONCRETE VS. S-T ABSTRACT

e. The dimension of form: SHAPE VS. SUBSTANCE

Distinguishing between the dimension of propositional function and the dimension of discourse function, two organizational levels for which the term "reference" has equally been used in the philosophical and linguistic tradition are differentiated. The first concerns the basic organization of propositions communicated by utterances, the second the question of whether or not an expression is used by the speaker to indicate a "discourse referent." The philosophical tradition (from Aristotle via Frege to Strawson or Searle) proceeds from a bipartite structure of propositions, consisting of a subject (which is considered as basically being a referring expression) and a predicate (which, in turn, is considered as a nonreferring expression). In the linguistic tradition, especially in the typological literature (Croft 1991: 67; cf. also Miller 1985: 224), one often finds a distinction between three functional primitives on a single level (the level of propositional speech acts), namely between "reference," "predication," and "attribution" or "modification." For linguistic reasons, in the framework presented here, ATTRIBUTION is also assumed to represent a third distinct function on the level of the dimension of propositional function.

Reference and propositional speech acts, however, are seen as being orthogonal to each other. It is claimed that the propositional speech act that contrasts with predication is not "reference" per se, but the act of establishing a TOPIC as that entity about which something will be predicated. (19) Both ATTRIBUTION and PREDICATION may involve reference to identifiable discourse entities, as is the case with definite possessor ("genitive") ATTRIBUTES and with definite PREDICATES in identifying sentences. Selecting a TOPIC indeed presupposes reference, but not necessarily reference to an entity which has already been established in the discourse, that is, to a DISCOURSE REFERENT. Classic generics are TOPICS and, as mentioned in Sections 1 and 2 above, some languages (e.g. Korean, but also Tagalog; cf. Behrens 2000) use so-called "topic markers" with generic noun phrases.

Instead of the more common distinction between "referential" and "nonreferential" expressions, in the second dimension a distinction is made between expressions which represent DISCOURSE REFERENTS and those which do not do so (in the following referred to as NON-DISCOURSE REFERENTS (20)) (cf. Karttunen 1976, for the first elaboration of the notion of "discourse reference"). DISCOURSE REFERENTS are entities which are established in a constructed representation of the ongoing discourse, rather than entities in the real world. The basic idea behind discourse reference may be explicated in terms of a "conceptual metaphor" in Lakoff's sense: "discourse is an office." There are at least two figures from this complex metaphor which were used independently of each other in the literature in order to explain this discourse-governed concept of reference: the first is Heim's (1983) "file-card" metaphor and the second is Kuno's (1972) "registry" metaphor. According to the "file-card" metaphor, a separate mental file exists for each autonomous referent in the discourse about which speech act participants communicate. Once a new referent is introduced, a new file is opened for this new DISCOURSE REFERENT and will be continuously updated by new pieces of information when the speech act participants continue to speak about the same DISCOURSE REFERENT. Kuno's (1972) concept of a "registry of discourse," which comprises all discourse entities which are "familiar" ("old" in his terminology) to the speech act participants, focuses on a further important property of DISCOURSE REFERENTS: "[o]nce their entry in the registry is established they do not have to be reentered for each discourse" (Kuno 1972: 271).
There are two essential reasons why a discourse entity may be contained in the registry of discourse: either it has been textually or situatively introduced in the previous discourse, in which case it is stored in the "temporary registry"; or it is permanently anchored in the registry of discourse due to the speaker's and hearer's general world knowledge; it then is part of the "permanent registry." It is particularly kinds and uniques (e.g. the sun) that Kuno assumes to be contained in the permanent registry of discourse. (21) In the present framework, two points are crucially important in order to understand the treatment of genericity. First, discourse reference as understood here does not presuppose existing referents located in space and time. Second, it does not presuppose the individuation and distinctness of referents either. To put it in Langacker's (1987: 189ff., 1991: 57) terminology, DISCOURSE REFERENTS are not necessarily "instances" which would have a particular (spatial or temporal) location in the domain of their instantiation, and they are not necessarily "bounded" in their respective domains. Therefore, I assume that kinds, which are "familiar" (22) to speaker and hearer due to their general world knowledge, may be interpreted and constructed as DISCOURSE REFERENTS just as those discourse entities which are explicitly introduced into the discourse and correspond to existing entities in the real world.
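The bookkeeping behind the two metaphors can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions, not a formalization proposed by Heim or Kuno: the names `FileCard`, `DiscourseRegistry`, `introduce`, `update`, and `end_discourse` are hypothetical, and the decision to empty the temporary registry at the end of a discourse is an assumption read off the label "temporary." Note that cards are keyed by labels alone, so nothing in the sketch requires a referent to be located in space and time or to be individuated.

```python
from dataclasses import dataclass, field


@dataclass
class FileCard:
    """One mental file for a discourse referent (file-card metaphor)."""
    label: str
    facts: list = field(default_factory=list)


class DiscourseRegistry:
    """Toy registry of discourse with a permanent and a temporary part."""

    def __init__(self, world_knowledge):
        # Permanent registry: kinds and uniques familiar from world knowledge.
        self.permanent = {label: FileCard(label) for label in world_knowledge}
        # Temporary registry: referents introduced in the current discourse.
        self.temporary = {}

    def lookup(self, label):
        """Return the card of a familiar referent, or None if it is unfamiliar."""
        if label in self.permanent:
            return self.permanent[label]
        return self.temporary.get(label)

    def introduce(self, label):
        """Open a new card for a textually or situationally introduced referent."""
        if self.lookup(label) is None:
            self.temporary[label] = FileCard(label)
        return self.lookup(label)

    def update(self, label, fact):
        """Add a piece of information to the card of a familiar referent."""
        card = self.lookup(label)
        if card is None:
            raise KeyError(f"{label!r} has not been established as a discourse referent")
        card.facts.append(fact)

    def end_discourse(self):
        # Assumption: temporary entries do not survive the discourse,
        # whereas permanent entries need not be reentered.
        self.temporary.clear()


registry = DiscourseRegistry(["gold", "the sun"])
registry.update("gold", "is soft")  # a kind needs no prior introduction
registry.introduce("my step-mother's cat")
registry.update("my step-mother's cat", "has not run away")
```

In this sketch, a kind such as "gold" can be talked about immediately because its card is already in the permanent registry, while a particular individual has to be introduced before anything can be added to its card.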

Allowing reference by means of definite anaphora (i.e. anaphora used in the case of "identity of reference" rather than in the case of "identity of sense") is considered to be the most important diagnostic feature of DISCOURSE REFERENTS. This is obviously satisfied with kind-referring phrases, at least in all the languages I have investigated (cf. Section 3 above). In contrast, there are a couple of grammatical contexts that are cross-linguistically associated with NON-DISCOURSE REFERENTS such as modifiers in compounds, nominal predicates in ascriptive sentences, and standards in (explicit or implicit) constructions of comparison. All of them usually do not tolerate anaphoric continuation by means of definite anaphora (for a more detailed discussion of apparent exceptions, cf. Behrens and Sasse 2003). In sum, the two occurrences of the cat in (10a) and (10b) will be considered as representing DISCOURSE REFERENTS, and the two occurrences of cat in (10c) and (10d) as representing NON-DISCOURSE REFERENTS.

(10) a. (So these days I've been taking care of my step-mother's cat ...) The cat has not run away. (It has apparently been in the house all along ...)

b. The cat is one of the most poorly understood of all animals. (It is a realist and does things because there is a reason for doing them.)

c. These greeting cards make artful gifts for cat lovers.

d. That new sofa just doesn't smell right unless it smells of cat.

By distinguishing between the dimension of individuality and the dimension of spatiotemporal location, two semantic aspects are factored apart which are often fused into a single distinction between "types" and "instances" or "tokens" (cf. Langacker 1987, 1991; Jackendoff 1983). The two values on the dimension of individuality (QUALITY and OBJECT) are intended to capture the insight that, as a matter of principle, speakers may use lexical elements in a sentence in two different ways. On the one hand, they may focus on the intensional properties of the denoted entities without making any commitment to the individuality of the objects which bear these properties (QUALITY use). On the other hand, they may also focus on the fact that the bearers of the relevant intensional properties can be conceived of as distinguishable, and hence countable, objects (OBJECT use). Numerals and quantifiers are the most common grammatical correlates of an OBJECT use due to their bounding force. For a QUALITY use, the possibility of a "transnumeral" interpretation is considered as the most important diagnostic feature, that is, the fact that number markers such as singular and plural are not interpreted as contrastive values indicative of the actual number of the involved referents. Accordingly, the complex phrases five bags of gold, two bags of gold, and one bag of gold in (11a) are assumed to represent OBJECT use, while the bare phrase in (11b) is said to manifest QUALITY use. Since the distinction between OBJECT and QUALITY uses is a matter of grammatical construction and semantic interpretation on the sentence level rather than a matter of lexical restriction, the bare singulars train and bus in (12a) are also considered as QUALITY uses, because they clearly satisfy the criterion of transnumerality. The same is true of the grammatical occurrences of train and bus in (12b), constructed as a bare singular, a bare plural and (twice) as a definite singular. (23)

(11) a. He gave five bags of gold to one, two bags of gold to another, and one bag of gold to the last ...

b. To reward his wisdom, she gave him gold,...

(12) a. After meeting him, I went by train and bus to Melaka,...

b. Train is a good way to travel. Buses are usually cheaper and a bit faster, but are also much less comfortable. You can always get up and take a walk in the train, and the bathrooms in the bus are an absolute disaster.

Generic uses are claimed to take the value QUALITY on the dimension of individuality. The possibility of substituting plural generics by singular ones and vice versa in certain contexts and the possibility of switching between singular and plural phrases within a single generic text are considered as evidence for this (cf. Section 1 and Section 5.2.4.2).

The dimension of spatiotemporal location captures the difference between (a) those uses of lexical elements which correspond to spatiotemporally anchored (and hence, in principle, perceivable) entities (S[PATIO]-T[EMPORALLY] CONCRETE value) and (b) those uses which are not connected to entities observable by human senses but require an abstraction of the spatiotemporal manifestation of the entities they regularly name (S[PATIO]-T[EMPORALLY] ABSTRACT value). (24) Nonfactual modality (conditionals, negation, etc.) and abstraction away from particular events by iteration (habituals) yield an S-T ABSTRACT context; in the same way, verbs of "propositional attitude" provide an S-T ABSTRACT context for their objects in one of their readings (i.e. in the nontransparent reading). In the examples in (13), it is the habitual reading of the verbs that leads to an S-T ABSTRACT value of the object phrases a bicycle, bicycles, bicycle, and his bicycle in the dimension of spatiotemporal location. Note that the phrase at issue in (13b) is even formally "definite" (due to the possessive pronoun); because of the habitual reading, however, there is no implication that it would be the same bicycle that John repaired each time. For that reason, examples such as (13b) are usually discussed with respect to the peculiarity of definite phrases taking a "nonspecific" interpretation in some contexts. In the present framework this semantic effect is represented by the dimension of spatiotemporal location.

(13) a. John likes riding a bicycle / riding bicycles / bicycle riding.

b. John usually repaired his bicycle in the garden.

In addition, nominal predicates--with the exception of those which point to DISCOURSE REFERENTS (as in The murderer is John)--automatically receive the value of S-T ABSTRACT. In terms of the dimensions proposed, the essential difference between "ascriptive" predicates (cf. Lyons 1977) and classic generics, which likewise receive the value of S-T ABSTRACT, is then expressed by different values in the first two dimensions: classic generics are TOPICS and DISCOURSE REFERENTS, while nominal predicates in ascriptive sentences are classified as PREDICATES and NON-DISCOURSE REFERENTS.

Although there is a natural affinity between S-T ABSTRACT and QUALITY uses on the one hand and S-T CONCRETE and OBJECT uses on the other, there are good reasons for keeping spatiotemporal bounding and individuation as measured by counting apart. First, hypothetical contexts--unlike generic contexts--allow quantification by numerals. Second, and more importantly, the conjunction of S-T CONCRETE and QUALITY (i.e. transnumerality in a context describing observable situations) is not ruled out either. In English, it is idiosyncratically confined to a few locative or instrumental ATTRIBUTES such as go by train, as shown in (12a) above; in other languages such as Greek or Hungarian, it is possible throughout for ATTRIBUTES. For further details, see Behrens and Sasse (2003).

It is only the fifth dimension (the "dimension of form") that concerns the issue of lexical conceptualization in the present framework. In the case of the SHAPE value, entities are conceptualized as having a characteristic shape, while in the case of the SUBSTANCE value, entities are conceptualized as shapeless mass, either because they normally occur without natural bounding properties or because they occur with continuously changing and thus uncharacteristic shapes. The values SHAPE and SUBSTANCE are thought to characterize referential potential (or "denotation"), as this is stored with (senses of) lexical items. Therefore, SHAPE and SUBSTANCE are assumed to be orthogonally related to the values OBJECT and QUALITY in the dimension of individuality, since the semantic distinction between the latter is defined as holding for actual uses of linguistic items in the sentence. If a language has a lexically-governed mass/count distinction, the SHAPE value corresponds to count nouns and the SUBSTANCE value to mass nouns. Languages which are commonly said to have a mass/count distinction both on the lexical and the phrasal level (such as English) are assumed to have certain mapping constraints between the dimensions of form and individuality in the present framework. That is, such languages tend to map SHAPE onto OBJECT and SUBSTANCE onto QUALITY (cf., however, train in [12a] and cat in [10d], where SHAPE nouns are mapped onto QUALITY phrases). Such constraints are assessed by Behrens (1995) as by no means universal, but as a typological feature of a certain class of languages. There are other languages, such as Hungarian, which allow an almost free alternation between OBJECT and QUALITY both with SHAPE and SUBSTANCE nouns.

As a lexical distinction, the difference between SHAPE and SUBSTANCE is not directly relevant for the question of whether or not a phrase has to be interpreted generically. However, it may cause a kind of "lexical split" with respect to the preferred strategies of encoding genericity as I will demonstrate below, citing German as an example (cf. Section 5.2.4.4).

We are now ready to answer the question of how classic generic expressions such as the boa constrictor in (1a) (repeated here as [14]) are to be analyzed in the present framework. They refer to well-established kinds and occur as the subject of a sentence whose predicate makes a characterizing statement about them. This prototype of genericity is represented by the following feature configuration: {TOPIC, DISCOURSE REFERENT, S-T ABSTRACT, QUALITY}.

(14) The boa constrictor is very dangerous.

Nonprototypical cases, which are borderline cases of genericity, can in part be described as slight changes in this feature configuration (e.g. as taking the value of ATTRIBUTE instead of TOPIC). The separation of dimensions also makes it possible to explain the impression that different languages draw the boundary between generics and nongenerics at different places. The reason for these differences is that the grammatical devices employed in the individual languages for the expression of {TOPIC, DISCOURSE REFERENT, S-T ABSTRACT, QUALITY} are frequently generalized to differing degrees across other feature configurations (e.g. in addition to TOPICS, they may also comprise ATTRIBUTES).
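The idea that nonprototypical generics differ from the prototype in only a few dimensions can be made concrete with a small sketch. The following Python fragment is an illustration only, under stated assumptions: the names FeatureConfiguration, PROTOTYPE, and deviations are hypothetical and not part of the framework itself, and the dimension of form (SHAPE/SUBSTANCE) is left out because it does not enter into the generic prototype.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import List


class PropositionalFunction(Enum):
    TOPIC = "TOPIC"
    ATTRIBUTE = "ATTRIBUTE"
    PREDICATE = "PREDICATE"


class DiscourseFunction(Enum):
    DISCOURSE_REFERENT = "DISCOURSE REFERENT"
    NON_DISCOURSE_REFERENT = "NON-DISCOURSE REFERENT"


class SpatiotemporalLocation(Enum):
    ST_CONCRETE = "S-T CONCRETE"
    ST_ABSTRACT = "S-T ABSTRACT"


class Individuality(Enum):
    OBJECT = "OBJECT"
    QUALITY = "QUALITY"


@dataclass(frozen=True)
class FeatureConfiguration:
    """One value per dimension (hypothetical encoding, form dimension omitted)."""
    propositional_function: PropositionalFunction
    discourse_function: DiscourseFunction
    spatiotemporal_location: SpatiotemporalLocation
    individuality: Individuality


# The prototype of genericity: {TOPIC, DISCOURSE REFERENT, S-T ABSTRACT, QUALITY}
PROTOTYPE = FeatureConfiguration(
    PropositionalFunction.TOPIC,
    DiscourseFunction.DISCOURSE_REFERENT,
    SpatiotemporalLocation.ST_ABSTRACT,
    Individuality.QUALITY,
)


def deviations(config: FeatureConfiguration) -> List[str]:
    """Return the names of the dimensions on which config differs from PROTOTYPE."""
    return [
        f.name
        for f in fields(config)
        if getattr(config, f.name) != getattr(PROTOTYPE, f.name)
    ]


# A kind-referring phrase realized as ATTRIBUTE instead of TOPIC
kind_as_attribute = FeatureConfiguration(
    PropositionalFunction.ATTRIBUTE,
    DiscourseFunction.DISCOURSE_REFERENT,
    SpatiotemporalLocation.ST_ABSTRACT,
    Individuality.QUALITY,
)

print(deviations(PROTOTYPE))          # []
print(deviations(kind_as_attribute))  # ['propositional_function']
```

On this encoding, a borderline case is simply a configuration with a short list of deviations, and a grammatical device that is generalized beyond the prototype can be thought of as covering configurations with one or more such deviations.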

5. Empirical investigation

The empirical language comparison is based on a multilingual corpus of translations of Antoine de Saint Exupery's novel Le petit prince. This novel is probably among the most widely translated texts after the Bible. In Germany, it has even been translated into different dialects (the Bavarian dialect, the dialect of Cologne, etc.). Moreover, Saint Exupery's novel is particularly well suited for the investigation of genericity since it contains a comparatively large number of generic text passages. A key motif of the novel is that the Little Prince, on his wanderings about the Earth, meets many different people and figures (such as "the fox") who confront him with their stereotypical generalizations and fill him with amazement.

5.1. Levels of genericity

When looking at generic texts, one is particularly struck by the discrepancy between kind-referring phrases and generic sentences. A considerable part of the noun phrases marked in the text as kind-referring DISCOURSE REFERENTS (about one third) do not occur in declarative main clauses, which is usually considered to be a precondition for generic sentences. They frequently occur in sentence fragments (cf. [15]) and/or in other clause types, for example in interrogative clauses (cf. [16]) or in subordinate clauses.

(15) a. Men? [[empty set], PL]

b. GER: Die Menschen [DEF, PL]?

c. FR: Les hommes [DEF, PL]?

d. GR: I anthropi [DEF, PL]?

e. HUN: Az emberek [DEF, PL]?

f. TAG: Mga tao [[empty set] TOPIC, PL]?

(16) a. The thorns [DEF, PL]--what use are they?

b. GER: Was fur einen Zweck haben die Dornen [DEF, PL]?

c. FR: Les epines [DEF, PL], a quoi servent-elles?

d. GR: T' angathia [DEF, PL] lipon se ti xrisimevun?

e. HUN: Mi hasznuk van a toviseknek [DEF, PL] [POSS]?

Kind-referring phrases in sentence fragments reveal an interesting difference between article languages, in which referential properties (concerning the individuality of referents and their familiarity in discourse) may be marked independently of propositional functions, and languages (e.g. Tagalog) that lack independent determiners and conflate marking of referential properties with what is usually called "case marking." Article languages maintain, in sentence fragments, the canonical form of generic determination which would also appear in a complete sentence, such as the bare form in English (cf. [15a]) and the form with the definite article in French, Greek, and Hungarian ([15c], [15d], [15e]). This can be contrasted with Tagalog. While kind-referring phrases in complete sentences are normally marked with the topic particle (ang) in this language, they often appear as bare phrases (i.e. without ang) in sentence fragments (cf. [15f]). This is understandable since this topic particle only secondarily signals reference properties in Tagalog, while its primary function is a syntactic one, namely establishing a relation to the predicate on the basis of thematic roles.

Many kind-referring phrases in the Le petit prince corpus appear in episodic sentences (or in generic sentences in a broader sense; cf. Section 3) in which a specific individual is characterized in terms of a kind (cf. [17]).

(17) a. I do not much like to take the tone of a moralist [IND, SG].

b. GER: Ich nehme nicht gerne den Tonfall eines Moralisten [IND, SG] an.

c. FR: Je n'aime guere prendre le ton d'un moraliste [IND, SG].

d. GR: Dhen m'aresi katholu na perno to ifos tu ithikologhu [DEF, SG],...

e. HUN: Nem szeretek erkolcspredikalo [[empty set], SG] hangjan beszelni.

Grammatical realizations of the standard of comparison in comparative constructions receive the feature ATTRIBUTE on the dimension of propositional function. Kind-referring expressions realized as ATTRIBUTES are not prototypical generics in the sense of the feature cluster {TOPIC, DISCOURSE REFERENT, QUALITY, S-T ABSTRACT} given above. This is reflected in the fact that kinds realized as ATTRIBUTES rather than as TOPICS are subject to many more idiosyncratic constructional constraints. In Hungarian, for example, standards of comparison connected with a particle (mint 'as/like') show variation between the definite and the indefinite article; those bearing an affix (-kent 'as/like') require the bare form, while in possessor positions we find variation between the bare form--as in (17e)--and the indefinite article.

This tendency of ATTRIBUTES toward free variation, however, does not necessarily mean that all relevant distinctions would be entirely collapsed. Normally, this happens only in non-article languages such as Finnish and Tagalog (cf. Behrens 2000). In article languages, however, those distinctions on the dimension of discourse function and on the dimension of individuality that are possible with TOPICS are retained--at least in part--with ATTRIBUTES as well, so that we find comparable ambiguities and oppositions. Consider, for example, the possessive phrases in (17a)-(17e): the genitives marked by an indefinite article in English, German, and French (a moralist, eines Moralisten, un moraliste) are potentially ambiguous between a specific/indefinite interpretation (OBJECT) and a kind-referring interpretation (QUALITY) and are in opposition to a form with the definite article, which could have a specific/definite interpretation. The phrase marked by a definite article in Greek (tu ithikologhu) is ambiguous between a specific/definite interpretation (OBJECT) and a kind-referring interpretation (QUALITY) and is in opposition to a form with the indefinite article, which could have a specific/indefinite interpretation. The zero-marked possessor in Hungarian (erkolcspredikalo) can only be interpreted in the sense of QUALITY, but as such it is in opposition to forms with a definite and an indefinite article.

A further example of a kind-referring phrase constructed as ATTRIBUTE can be seen in (18), realized as a prepositional or postpositional phrase, respectively.

(18) a. I have lived a great deal among grown-ups_i [[empty set], PL]. I have seen them_i [PRO] intimately, close at hand.

b. GER: Ich bin viel mit Erwachsenen_i [[empty set], PL] umgegangen und habe Gelegenheit gehabt, sie_i [PRO] ganz aus der Nahe zu betrachten.

c. FR: J'ai beaucoup vecu chez les grandes personnes_i [DEF, PL]. Je les_i [PRO] ai vues de tres pres.

d. GR: Ezisa arketa me tus meghalus_i [DEF, PL]. Tous_i [PRO] idha apo poli konda.

e. HUN: Hosszu ideig eltem a felnottek_i [DEF, PL] kozott. Nagyon kozelrol szemugyre vettem oket_i [PRO].

Those languages showing a pronounced tendency to mark generics with the definite article (French, Greek, and Hungarian) do so in this case as well. If one looks only at the first sentence in English and German, one could be led by the context to the impression that the phrase grown-ups and its translation equivalents have a nonspecific, noninclusive interpretation, yielding roughly the following meaning: the narrator has repeatedly lived among different not further identifiable groups of grown-ups. Interestingly, English and German also allow definite pronominalization in the subsequent sentence. By the end of the second sentence, it will be clear that the grown-ups are DISCOURSE REFERENTS in these languages as well. (25)

Not only are there kind-referring phrases that occur in nongeneric sentences, but also generic sentences which lack a kind-referring phrase. What I have in mind here are not habitual sentences about specific DISCOURSE REFERENTS, but sentences containing anaphoric reference to a previously mentioned kind. In languages where anaphorically-referring subjects are not realized by pro-forms but rather are indicated by the respective verb forms, as in Greek and Hungarian, an overtly realized kind-referring phrase is missing altogether in such cases (cf. [19d] and [19e]). In the other languages, a pronoun is used to refer to the kind in question (cf. [19a], [19b], and [19c]).

(19) a. They also raise chickens. (anaphorically referring to men)

b. GER: Sie ziehen auch Huhner auf.

c. FR: Ils elevent aussi des poules.

d. GR: Ektos ap' afto anathrefun ke kotes. (no free proform)

e. HUN: Tyukokat is tenyesztenek. (no free proform)

5.2. Encoding of genericity in QUALITY-marking and DISCOURSE REFERENT-marking languages

5.2.1. Statistical evaluation of kind-referring phrases. In five European article languages (English, German, French, Greek, and Hungarian), I have statistically evaluated all those expressions from the Le petit prince corpus which can tentatively be assumed to have a kind-referring interpretation (cf. Figure 1). The choice of the expressions (as a rule, phrases) was made according to the following principle: whenever an expression in one of the languages compared was found to be marked with a device characteristic of marking genericity in that language (e.g. a definite article in French) and was undoubtedly not interpretable as specific in the respective context, then this expression and its equivalents in the other languages were included in the evaluation (provided that nominal equivalents were present). In doubtful cases, I took the definite article in French, German, Greek, and Hungarian as diagnostic for genericity. As demonstrated above by the application of the ambiguity test, the semantic difference between a nongeneric and a generic interpretation is sufficiently large in phrases marked by a definite article; it was thus relatively easy to sort out expressions with a specific/definite reading in the context. (26) Six different marking categories were distinguished, of which one is represented only in French: partitive plural (PART/PL). The remaining are: definite article combined with a singular or a plural form (DEF/SG, DEF/PL), bare singular or plural forms (O/SG, O/PL), indefinite article combined with a singular form (IND/SG). (27) The category "others" includes phrases containing a quantifier, a demonstrative, an indefinite determiner other than the indefinite article, or any language-specific combination of quantifiers and determiners. The statistics shown in Figure 1 include occurrences in syntactic positions other than the subject position (i.e. both TOPICS and ATTRIBUTES); only PREDICATE uses were excluded. The absolute number of tokens considered ranges between 258 and 273, depending on the language. These differences between the languages result from the fact that translation equivalents are lacking altogether in some cases or are realized by a different word class (e.g. adjective).

When looking at Figure 1, one is immediately struck by the significant difference between English on the one hand, and the remaining four languages on the other. Zero-marking figures prominently only in English (29.9% in PL, 16.3% in SG) and is only poorly represented in the other languages, continually decreasing from left to right. The total proportion of bare forms (SG + PL) is lowest in Greek (9.7%) and in Hungarian (9.8%) (cf. also the statistics in Figure 2 further below, which were compiled on the basis of subject occurrences). In English, at the other extreme, the use of the definite article is significantly more weakly attested than in the other languages. More precisely, the percentage of definite phrases in English both in the singular (11.2%) and in the plural (24.4%) is approximately half the percentage of definite phrases in the other languages. To the extent that the frequency of zero-marking continually decreases from left to right, definite-marking continually increases and reaches its highest values in Greek (77.6%) and in Hungarian (75.6%). Figure 1 reveals a further difference, which looks less spectacular in terms of percentages but is nevertheless extremely interesting from a linguistic point of view: the relative proportion of indefinite singulars decreases continually from English (11.2%) to Hungarian (5.6%). Since the number of generics with an indefinite article is generally low in all languages, this decrease in fact reflects significant differences (cf. Section 1 above, Section 5.2.4.4 below).

5.2.2. QUALITY-marking vs. DISCOURSE REFERENT-marking languages. Before proceeding with the interpretation of the statistics and the linguistic differences that they reflect, I will introduce a typological distinction between QUALITY-marking languages (such as English) and DISCOURSE REFERENT-marking languages (such as French, Greek, and Hungarian). (28) (German is in actual fact more of a mixed type even though it largely exhibits the characteristics of DISCOURSE REFERENT-marking languages in the Le petit prince corpus). This typological difference manifests itself in the most frequent and unmarked type of device for marking generic meaning; let us call it the canonical type of generic marking. Considering the canonical marking of genericity, we can classify languages according to which semantic/pragmatic aspect they highlight as the most relevant aspect of genericity. QUALITY-marking languages highlight the fact that the referents of generic noun phrases should not be considered as individual entities (OBJECTS) but as abstract representatives of certain properties (QUALITY) characterizing kinds. DISCOURSE REFERENT-marking languages, in turn, highlight the fact that the referents of generic noun phrases may be conceived of as DISCOURSE REFERENTS that are established in the permanent registry of discourse by being part of our world knowledge.

I claim that QUALITY-marking languages and DISCOURSE REFERENT-marking languages behave like mirror images in the following respect. QUALITY-marking languages select the dimension of individuality as the relevant dimension in marking genericity. In these languages, the distinction between OBJECTS and QUALITIES is the most prominent distinction among those which are constitutive for the concept of genericity (cf. [9a]-[9d]). This distinction cuts across the generic/nongeneric boundary, applying in both domains. That is, the grammatical device that is canonically used with generics (e.g. zero-marking in English) is also used with nongenerics and, as such, it indicates QUALITY in both environments. DISCOURSE REFERENT-marking languages select the dimension of discourse function as the relevant dimension in marking genericity. In these languages, the distinction between DISCOURSE REFERENTS and NON-DISCOURSE REFERENTS is the most prominent distinction. Here, it is this distinction which cuts across the generic/nongeneric boundary. Accordingly, the grammatical device which is canonically used with generics (e.g. definite article) is also used with nongenerics and, as such, it indicates DISCOURSE REFERENTS in both environments.

English, as a QUALITY-marking language, makes a fundamental distinction between OBJECTS and QUALITIES in that OBJECTS, as a rule, have to be bound by a determiner or a quantifier while QUALITIES may be realized by bare forms orthogonally to all relevant semantic distinctions, particularly orthogonally to the difference between DISCOURSE REFERENTS and NON-DISCOURSE REFERENTS. In indicating the QUALITY value of a grammatical form, zero-marking is common not only in S-T ABSTRACT contexts (habitual, modal contexts) but also in S-T CONCRETE contexts. That is, when a noun (phrase) is used to specify an event conceived of as a general activity in which the subject argument is engaged, bare forms are used--either plurals or singulars, depending on the conventionalization of the lexical nouns in question as "having a SHAPE" (plural) (e.g. clean windows) or "being a SUBSTANCE" (singular) (e.g. drink coffee). In the case of such ATTRIBUTE uses, even singular forms are occasionally allowed with SHAPE nouns (e.g. go to bed, go by train). The only context where zero-marking is almost completely ruled out, even though a QUALITY interpretation is unequivocally present, is the PREDICATE use of SHAPE nouns (e.g. *He is teacher).

Nevertheless, the most conspicuous characteristic of English is that it allows and clearly prefers zero-marking of QUALITY in combination with DISCOURSE REFERENTS (i.e. kind-referring noun phrases), even if the noun phrases in question are constructed as TOPICS. Here, QUALITY-marking more or less overrules DISCOURSE REFERENT-marking, namely marking by a definite article. The definite article is only used in the nongeneric domain as a systematic and necessary marking device for DISCOURSE REFERENTS in English. Definite singular generics are the only exception. They are, however, not very frequent, as shown by the statistics in Figure 1. And, as noted by Declerck (1991: 96), the unmarked interpretation of definite forms (e.g. The fox is cunning) is the nongeneric one, whereas the unmarked interpretation of bare forms (e.g. Foxes are cunning) is the generic one. In short: saying that English is a QUALITY-marking language means that there is one marking device which is applicable both to generics and nongenerics and that this marking device signals the shared value of QUALITY on the dimension of individuality.

As DISCOURSE REFERENT-marking languages, French, Greek, and Hungarian make a fundamental distinction between DISCOURSE REFERENTS (established in the temporary or permanent registry of discourse) and NON-DISCOURSE REFERENTS in that DISCOURSE REFERENTS have to be marked by a definite article, independent of the difference between OBJECTS and QUALITIES on the dimension of individuality and the difference between S-T CONCRETE and S-T ABSTRACT on the dimension of spatiotemporal location. In these languages, kinds (which are by definition associated with the values S-T ABSTRACT and QUALITY) are typically marked as DISCOURSE REFERENTS, in the same way as particular participants (S-T CONCRETE OBJECTS) after they have been textually or situatively introduced.

At least in the case of generic TOPICS, DISCOURSE REFERENT-marking clearly overrules QUALITY-marking (which would be zero in all three languages or a partitive form in French). However, bare or partitive forms in French, Greek, and Hungarian act as standard markers of a QUALITY interpretation only in the nongeneric domain. With generic TOPICS, these forms are almost unattested in French, Greek, and Hungarian, as shown by the statistics in Figure 2 below. It is only in the area of nonprototypical generics (i.e. with ATTRIBUTES) that the DISCOURSE REFERENT-marking languages in question show variation according to whether or not they apply the general marking device of DISCOURSE REFERENTS (the definite article) to noun phrases representing S-T ABSTRACT QUALITY. (29) What is more, a definite phrase in these DISCOURSE REFERENT-marking languages seems to be more or less neutral between a generic and a nongeneric interpretation rather than being biased toward a nongeneric interpretation as in English. In short: the claim that a certain language is a DISCOURSE REFERENT-marking language means that there is one marking device which is applicable both to generics and nongenerics and that this marking device signals the shared value of DISCOURSE REFERENT on the dimension of discourse function.

Proceeding from the complementary selection of the dimensions relevant in generic marking in QUALITY-marking and DISCOURSE REFERENT-marking languages, we might also expect a complementary distribution of the dominant ambiguity patterns in these languages. Indeed, in QUALITY-marking languages it is the morphological distinction between DISCOURSE REFERENTS and NON-DISCOURSE REFERENTS that is typically dispensed with (e.g. both may be marked with zero), and in DISCOURSE REFERENT-marking languages it is the morphological distinction between OBJECTS and QUALITIES that is typically dispensed with (e.g. both may be marked with a definite article). It should be noted, however, that QUALITY-marking and DISCOURSE REFERENT-marking languages do not behave exactly symmetrically with respect to the question of ambiguity. Ambiguity between generic and nongeneric interpretation, as it may be observed on the phrasal level in QUALITY-marking languages (e.g. with bare plurals in English), usually does not yield sentence ambiguity (cf. Section 3). In contrast to this, phrasal ambiguity in DISCOURSE REFERENT-marking languages (e.g. in the case of definite plurals in French) generally causes sentence ambiguity. The explanation for this asymmetry is relatively obvious: there is, universally, a natural affinity between TOPICS and DISCOURSE REFERENTS, but none between TOPICS and OBJECTS. Hence, a noun phrase which is syntactically/morphologically marked as TOPIC and QUALITY is automatically interpreted as DISCOURSE REFERENT, while a noun phrase which is syntactically/morphologically marked as TOPIC and DISCOURSE REFERENT may in principle be interpreted equally as a particular individual (OBJECT) or as kind (QUALITY). There are probably other less obvious explanations for this ambiguity asymmetry, which may have to do with the cognitive saliency of the distinction in question generally or with its universal saliency within the domain of genericity (i.e. independent of the typological differences depicted). However, these hypotheses cannot be pursued any further in this article.

Finally, the following point should be stressed: QUALITY-marking is not necessarily associated with zero (as in English). In Finnish, for instance, QUALITY-marking is realized by partitive forms, which are regularly used both with generic and nongeneric nonsubject/nonoblique arguments. Likewise, DISCOURSE REFERENT-marking is not necessarily associated with definite articles in the classic sense. In Vietnamese, for instance, classifiers may be considered as markers of DISCOURSE REFERENTS. As such they are often used with SHAPE nouns (cf. Behrens 2000).

5.2.3. Similarity in determiner/number values. We will now turn to the question of whether the corpus contains any sentences at all in which corresponding expressions in all five languages bear the same determiner/number values. There are a number of such instances, which fall into two significant groups of categories: definite plural phrases such as in (20) and indefinite singular phrases such as in (21).

(20) a. <<The grown-ups [DEF, PL] are certainly altogether extraordinary,>> he said simply, talking to himself as he continued on his journey.

b. GER: Die grossen Leute [DEF, PL] sind entschieden ganz ungewohnlich, sagte er sich auf der Reise.

c. FR: <<Les grandes personnes [DEF, PL] sont decidement tout a fait extraordinaires>>, se disait-il simplement en lui-meme durant le voyage.

d. GR: <<I meghali [DEF, PL] ine tromera parakseni>>, monolojise apla ston eafto tu kathos sinexize to taksidhi tu.

e. HUN: <<A felnottek [DEF, PL] ketsegtelenul egeszen kulonosek>>--csak ennyit mondott magaban utazasa kozben.

(21) a. When an astronomer [IND, SG] discovers one of these he does not give it a name, but only a number.

b. GER: Wenn ein Astronom [IND, SG] einen von ihnen entdeckt, gibt er ihm statt des Namens eine Nummer.

c. FR: Quand un astronome [IND, SG] decouvre l'une d'elles, il lui donne pour nom un numero.

d. GR: Otan enas astronomos [IND, SG] anakalipsi kapjon ap'aftus andi ja onoma tu dhini enan arithmo.

e. HUN: Ha egy csillagasz [IND, SG] felfedez egyet, nev helyett szamot ad neki.

English and Hungarian are cornerstone languages for these two groups. The definite plural is a highly marked construction in English (a QUALITY-marking language) and the indefinite singular is a highly marked one in Hungarian (a prime example of a DISCOURSE REFERENT-marking language). This is clearly shown in the fact that English and Hungarian display the lowest percentage of the respective generic constructions in the corpus.

According to a common assumption, English does not really have definite plural generics, as the use of definite plural phrases with kind-reference is allowed only in certain lexically or syntactically restricted cases. As a lexical "exception" one may mention deadjectival forms (e.g. the blind), including nationality names (e.g. the French). Of these, those having an overt -s-plural (e.g. grown-up[s], German[s]) allow variation between definite- and zero-marking of plural generics (cf. the grown-ups in [20a] vs. grown-ups in [22a]), while those lacking an -s-plural require the definite article with plural generics (the blind, the French). Syntactic exceptions involve restrictive modifiers such as of the Sahara in (22b) or where you live in (22c). The reason for the use of the definite article in (22d) could perhaps be seen in the fact that the paragraph topic "baobabs" is preceded here by a cataphorically referring pronoun (but see also [28a] further below). However, one point should be stressed: although the relative percentage of "DEF/PL" generics in English is significantly lower than in the other four languages, it is still relatively high (24.4%). Indeed not all "DEF/PL" attestations in English can be explained in terms of lexical and syntactic constraints. Rather, many of them have a semantic-pragmatic motivation, which will be discussed below (cf. Section 5.2.4.2).

(22) a. Grown-ups never understand anything by themselves,...

b. The wells of the Sahara are mere holes dug in the sand.

c. <<The men where you live,>> said the little prince, <<raise five thousand roses in the same garden.>>

d. Before they grow so big, the baobabs start out by being little.

The low percentage of "IND/SG" attestations in Hungarian is not accidental either. Among the investigated languages, Hungarian is subject to the strongest restrictions with respect to the use of an indefinite article in generic phrases (cf. Section 1 above, Section 5.2.4.4 below). Significantly, there is only one type of context in which--in addition to English, German, French, and Greek--an indefinite article is employed in Hungarian as well. This type comprises conditional sentences (as in [21]) and sentences which otherwise have a modal (deontic) coloration.

It was very rarely found that all of the languages exhibited zero-marking in corresponding phrases. This was confined to cases where the phrases were constructed as ATTRIBUTES, that is, occurred in an environment that has been qualified as nonprototypical for genericity. In this borderline area of genericity, English, as a pure QUALITY-marking language, employs--just as in the case of prototypical generics--zero-marking, while German, as a mixed language, and the three other DISCOURSE REFERENT-marking languages show a remarkable tendency for tolerating variation between a definite and a bare form. Such variations can be most prominently observed in nouns denoting materials or abstract entities. For this reason, they chiefly concern singular phrases.

Example (23) shows that, in principle, material-denoting nouns may be zero-marked in all five languages when they occur as ATTRIBUTES, in particular, as modifiers of participles and adjectives. Such examples are, however, exceptional in the Le petit prince corpus. In most cases, at least one of the languages that tolerate variation between definite marking and zero-marking employs the former. In example (24), for instance, it is German and Hungarian that use the definite article instead of zero.

(23) a. Clad in royal purple and ermine [O, SG], he was seated upon a throne which was at the same time both simple and majestic.

b. GER: Der Konig thronte in Purpur und Hermelin [O, SG] auf einem sehr einfachen und dabei sehr koniglichen Thron.

c. FR: Le roi siegeait, habille de pourpre et d'hermine [O, SG], sur un trone tres simple et cependant majestueux.

d. GR: O vasilias aftos dimenos me porfira ke ermina [O, SG], kathotan pano s' ena throno poli aplo ke meghaloprepo.

e. HUN: Biborba es hermelinbe [O, SG] oltozve egy igen egyszeru, de megis fenseges tronuson ult.

(24) a. He looked at me there, with my hammer in my hand, my fingers black with engine-grease [O, SG], bending down over an object ...

b. GER: Er sah mich an, wie ich mich mit dem Hammer in der Hand und vom Schmierol [DEF, SG] verschmutzten Handen uber einen Gegenstand beugte, ... (lit. '... [with hands] dirtied by the grease ...')

c. FR: Il me voyait, mon marteau a la main, et les doigts noirs de cambouis [O, SG], penche sur un objet ...

d. GR: M'evlepe me to sfiri sto xeri, me ta dhaixtila jemata ghrasso [O, SG], skimmeno pano apo ena praghma ... (lit. '... [with the fingers] full grease ...')

e. HUN: Ott alltam, kezemben a kalapacs, ujjaim feketek a gepolajtol [DEF, SG], es egy targy fole hajoltam, ...

Let us now turn to determiner variations with abstract nouns. In (25), the abstract noun "discipline" is constructed as a postnominal genitive phrase in English, French, and German, and as a prenominal phrase in Hungarian. Only German employs the definite article here, though it is precisely in this environment that the definite article is in free variation with zero (eine Frage von Disziplin [preposition & O] vs. eine Frage der Disziplin [genitive & DEF]). In the Greek example (25d), "discipline" is realized as a verbal ATTRIBUTE of the impersonal verb prokite (governing the preposition ja 'about': 'it is about/concerns/is a matter of'). Nouns appearing as the prepositional objects of this verb are always zero-marked when they have a QUALITY interpretation rather than an OBJECT interpretation on the dimension of individuality. However, this is clearly a matter of conventionalization. Other semantically related verbs behave differently. The verb afora 'it concerns/refers to', for instance, always requires the definite article for QUALITY-specified nouns: afora tin pitharxia 'it concerns [the] discipline'.

(25) a. <<It is a question of discipline [O, SG],>> the little prince said to me later on.

b. GER: <<Es ist eine Frage der Disziplin [DEF, SG]>> sagte mir spater der kleine Prinz.

c. FR: <<C'est une question de discipline [O, SG]>>, me disait plus tard le petit prince.

d. GR: <<Prokite kathara ja pitharxia [O, SG]>>, mu ipe poli arghotera o mikros pringipas. (lit. 'It is clearly a matter of discipline, the little prince said to me much later.')

e. HUN: <<Fegyelem [O, SG] kerdese>>--mondotta nekem kesobb a kis herceg.

Example (26) is one of the few attestations where the corresponding phrases are actually realized as bare forms (O, SG) in all the languages.

(26) a. The second time, eleven years ago, I was disturbed by an attack of rheumatism [O, SG].

b. GER: Das zweitemal (sic!), vor elf Jahren, war es ein Anfall von Rheumatismus [O, SG].

c. FR: La seconde fois c'a ete, il y a onze ans, par une crise de rhumatisme [O, SG].

d. GR: I dhefteri fora itan otan, endeka xronia prin, kirieftika apo mja krisi revmatismon [O, PL].

e. HUN: Masodizben, tizenegy eve, csuz [O, SG] gyotort. ('The second time, eleven years ago, it was rheumatism that me attacked.')

This time, German chooses the variant "preposition & O" instead of the variant "genitive & DEF." In the Hungarian example, we find a verbal rather than a nominal construction: the equivalent of rheumatism (csuz) is constructed as the subject of a finite verb (gyotor 'attack'). For subject arguments denoting abstract entities and materials, Hungarian displays a very interesting variation which in part depends on the discourse function the subject takes in the sentence and in part simply on its syntactic position. Subjects constructed as TOPICS require definite marking (A csuz tizenegy eve gyotort utoljara '[The] rheumatism last attacked me eleven years ago.'). There are two further constructions, which I analyze as having the propositional function of ATTRIBUTES. In the first one, the subject is placed immediately in front of the verb with which it forms a close unit, either as its modifier or as a focus. In this case, no determiner is used. The fact that the focus in (26e) must necessarily fall on the subject explains the absence of an article. In the second construction, the ATTRIBUTE subject occupies a postverbal position and has to be marked by a definite article (Utoljara tizenegy eve gyotort a csuz 'It was eleven years ago that [the] rheumatism last attacked me.'). This may be illustrated with a second example from the corpus (cf. [27]) (30) where Hungarian deviates from the original of the translation in that the abstract noun in question is constructed as the subject ATTRIBUTE of a finite verb rather than as the possessor ATTRIBUTE of a noun:

(27) a. That was his first moment of regret [O, SG].

b. GER: Das war seine erste Regung von Reue [O, SG].

c. FR: Ce fut la son premier mouvement de regret [O, SG].

d. GR: Itan i proti fora, pu o mikros pringipas enjothe vathja metanjomenos [PARTICIPLE]. (lit. 'It was the first time that the little prince felt deeply repentant.')

e. HUN: Ekkor tamadt fel benne eloszor a megbanas [DEF, SG]. (lit. 'At this time repentance arose in him for the first time.')

To sum up, we can say that the Hungarian variation between the use of a definite phrase and the use of a bare phrase crosscuts the distinction between TOPICS and ATTRIBUTES. Zero-marking is found only with preverbal ATTRIBUTES adjacent to the verb, whereas both preverbal TOPICS and post-verbal ATTRIBUTES (as in [27e]) require a definite article. It should also be noted that in the case of nonfocus ATTRIBUTES, the syntactic position (preverbal vs. postverbal) and the determiner type (zero vs. definite article) do not correlate with any significant semantic distinction: preverbal bare phrases and postverbal definite phrases are semantically equivalent variants of each other.
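The Hungarian pattern summarized above can be restated as a small decision rule. The following Python fragment is only an illustrative sketch: the function name and the string labels are hypothetical and not part of the analysis, and the rule covers only the three configurations described here for subjects denoting abstract entities or materials.

```python
def hungarian_subject_marking(function: str, position: str) -> str:
    """Return 'DEF' (definite article) or 'O' (zero-marking).

    Illustrative restatement of the summary in the text for Hungarian
    subjects denoting abstract entities or materials. The labels and
    the function name are assumptions made for this sketch.
    """
    if function == "TOPIC" and position == "preverbal":
        # Preverbal TOPICS require a definite article.
        return "DEF"
    if function == "ATTRIBUTE" and position == "preverbal":
        # Preverbal ATTRIBUTES adjacent to the verb are zero-marked.
        return "O"
    if function == "ATTRIBUTE" and position == "postverbal":
        # Postverbal ATTRIBUTES require a definite article, as in (27e).
        return "DEF"
    # Any other combination is not covered by the summary in the text.
    raise ValueError(f"configuration not described: {function}, {position}")


# (26e) csuz: preverbal ATTRIBUTE; (27e) a megbanas: postverbal ATTRIBUTE
print(hungarian_subject_marking("ATTRIBUTE", "preverbal"))
print(hungarian_subject_marking("ATTRIBUTE", "postverbal"))
```

The rule makes explicit that the determiner choice does not follow the TOPIC/ATTRIBUTE distinction alone: the two ATTRIBUTE configurations receive different markings.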

The above-mentioned cases of determiner variation (free variation in German, syntactically-governed [automatic] variation in Hungarian, and lexically triggered variation in Greek) nicely demonstrate that "generics" and "nonspecifics" are very close to each other even in languages which apply the DISCOURSE REFERENT-marking strategy. The feature that distinguishes in these languages between phrases which are usually considered to be generic (phrases having the values "DISCOURSE REFERENT & QUALITY" and as such marked by a definite article) and phrases which are usually considered to be nonspecific (phrases having the values "NON-DISCOURSE REFERENT & QUALITY" and as such marked by zero) tends to be formally and semantically neutralized, at least in ATTRIBUTE positions and with nouns denoting materials or abstract entities.

5.2.4. Differences in determiner/number values

5.2.4.1. Plural phrases. It is kind-referring phrases in the plural that exemplify the lion's share of the differences in determiner/number values. As expected, English regularly lacks a determiner here, whereas the three pure DISCOURSE REFERENT-marking languages (French, Greek, Hungarian) typically use the definite article. Even though German can--in principle--choose between these two markings, it patterns like the latter three languages in the vast majority of cases. This difference is consistently observed in all types of possible kinds. One finds it with natural kinds such as volcanoes and flowers (cf. [28], [29]), with occupations and social roles such as kings (cf. [30]), and with humans characterized in terms of a notable property such as conceited people (cf. [31]).

(28) a. If they are well cleaned out, volcanoes [O, PL] burn slowly and steadily, without any eruptions.

b. GER: Wenn sie gut gefegt werden, brennen die Vulkane [DEF, PL] sanft und regelmassig, ohne Ausbruche.

c. FR: S'ils sont bien ramones, les volcans [DEF, PL] brulent doucement et regulierement, sans eruptions.

d. GR: An ine kala katharismena, keghonde isixa-isixa ke kanonika xoris kamja ekriksi. (lit. 'If [they] are well cleaned out, [they] burn slowly and regularly, without any eruption.'; anaphoric reference to the "volcanoes" without a free proform such as "they", cf. [19d])

e. HUN: Ha rendesen ki vannak seperve, a tuzhanyok [DEF, PL] csendesen, szabalyosan egnek, kitoresek nelkul.

(29) a. Flowers [O, PL] are weak creatures.

b. GER: Die Blumen [DEF, PL] sind schwach.

c. FR: Les fleurs [DEF, PL] sont faibles.

d. GR: Ta luludhja [DEF, PL] ine adhinama.

e. HUN: A viragok [DEF, PL] gyengek.

(30) a. Kings [O, PL] do not own, they reign over.

b. GER: Die Konige [DEF, PL] besitzen nicht, sie >regieren uber<.

c. FR: Les rois [DEF, PL] ne possedent pas. Ils <<regnent>> sur.

d. GR: I vasiljadhes [DEF, PL] dhen exun tipota dhiko tus. Vasilevun s' ola ta praghmata.

e. HUN: A kiralyoknak [DEF, PL] nem tulajdonai a csillagok. Ok uralkodnak rajtuk.

(31) a. Conceited people [O, PL] never hear anything but praise.

b. GER: Die Eitlen [DEF, PL] horen immer nur die Lobreden.

c. FR: Les vaniteux [DEF, PL] n'entendent jamais que les louanges.

d. GR: I mateodhoksi [DEF, PL] dhen akune tipota allo ektos ap' tus epenus.

e. HUN: A hiu emberek [DEF, PL] csak a dicseretet halljak.

With one exception ([30e] (31)), all of the relevant generic phrases in (28)-(31) appear as subjects, constituting the TOPIC of classic generic sentences. One wonders whether the same typological pattern also holds for objects. Even though the extent to which nonsubjects in characterizing statements are to be considered as generic is in general hotly disputed, it is largely agreed upon that some attitude verbs such as love (and its synonyms such as like or be fond of, as well as its antonyms such as hate) select a nonsubject argument ("direct object" or "prepositional object") which may have a genuine generic interpretation. Indeed, in the case of such attitude verbs, pure DISCOURSE REFERENT-marking languages strongly require the same marking device for the object they usually employ for generic subjects (i.e. the definite article in the languages considered here; cf. [32c], [32d], [32e]; [33c], [33d], [33e]). Once again, German turns out to be a mixed type, permitting variation between zero-marking (cf. [33b]) as in English (cf. [32a], [33a]) and definite-marking (cf. [32b]) as in the other languages, with a clear tendency toward zero-marking in prepositional structures (cf. [33b] vs. [32b]).

(32) a. I am very fond of sunsets [O, PL].

b. GER: Ich liebe die Sonnenuntergange [DEF, PL] sehr.

c. FR: J'aime bien les couchers de soleil [DEF, PL].

d. GR: Aghapo para poli to iljovasilema [DEF, SG].

e. HUN: Szeretem a naplementeket [DEF, PL].

(33) a. Grown-ups [O, PL] love figures [O, PL].

b. GER: Die grossen Leute [DEF, PL] haben eine Vorliebe fur Zahlen [O, PL] (lit. 'The grown-ups have a special liking for figures.').

c. FR: Les grandes personnes [DEF, PL] aiment les chiffres [DEF, PL].

d. GR: Jati i meghali [DEF, PL] aghapun tus arithmus [DEF, PL].

e. HUN: A felnottek [DEF, PL] szeretik a szamokat [DEF, PL].

Even though all DISCOURSE REFERENT-marking languages treat objects of verbs such as "love" and "hate" as DISCOURSE REFERENTS when these do not refer to particular individuals, they may differ with respect to the objects of other verbs or verb types such as "seek something," "talk about something," "be interested in something" (cf. Behrens 2000). (32) In particular, the number of those verbs in the lexicon which require DISCOURSE REFERENT-marking varies significantly from language to language. In Arabic, a fairly pronounced DISCOURSE REFERENT-marking language, it is by far larger than in the three languages considered here (cf. Behrens and Sasse 2003). But the latter also differ with respect to the set of verbs that (necessarily or optionally) select "generic objects" in S-T ABSTRACT contexts. Now, when DISCOURSE REFERENT-marking languages allow both definite and nondefinite marking with objects of some verbs, it would be interesting to know in what type of context definite marking is the preferred option. At the same time, we can also reexamine a question left open above: in what types of contexts--besides those lexical and syntactic contexts already mentioned--does English use the definite plural instead of the bare plural? These questions will be addressed in the following section.

5.2.4.2. Generic texts as scripts. The contexts where we predominantly find definite marking of non-subjects in DISCOURSE REFERENT-marking languages such as French, Greek, and Hungarian and definite marking at all (i.e. beyond lexically or syntactically motivated "exceptions") in the QUALITY-marking language English have an essential feature in common. In both cases, the relevant attestations are found within a generic text passage that can be considered as a linguistic manifestation of a "script." In artificial intelligence, cognitive science, and cognitive linguistics, a number of representational concepts have been developed since the 1970s, which attempt to model higher-level knowledge (and belief) structures. Three of these have come to be particularly well-known: "scripts," "frames" (a la Fillmore), and "ICMs" (idealized cognitive models a la Lakoff). In the context of the present study, the concept of "scripts" as introduced by Schank and his colleagues is of particular interest (cf. Schank 1980; Schank and Abelson 1977; Abelson 1973). From the very outset, the essential idea behind "scripts" was that they should be understood--in the words of an early definition by Abelson (1973: 295)--as a "sequence of themes involving the same actors, with a change in interdependencies from each theme to the next; an evolving 'story' of potentially changing relationships of actors." Thus, there are "accident scripts," "restaurant scripts," "dentist scripts," etc., each capturing generalized knowledge about a scenario, including information about events and participants typically involved in this scenario. In addition, scripts are basically structured with respect to temporal and causal relations between subsequent events. A frequent subtype of generic texts is constituted by linguistically encoded scripts in this sense: they narrate, in the form of short stories, how a particular kind typically interacts with other kinds in a particular environment. In this way, they not only refer to a single kind (the main topic of the text), but also to a number of other kinds as secondary participants.

The story Le petit prince contains several generic scripts. One of these is the "geographer script," which sketches a scenario about how geography books come into being. In addition to the principal participant (the geographer), a second participant appears here prominently, namely "the explorer." In addition, certain inanimate objects play an important role in this script, such as volcanoes, flowers, and the proofs that must be furnished by the explorer. Another script concerns "the catastrophe of the baobabs," which elaborates on the danger emanating from the kind "baobab." In this script, there are two further kinds repeatedly referred to: "sheep" and "little bushes". Finally, there is a third script, continually elaborated on throughout the entire story: the script about "the warfare between the sheep and the flowers." A key role in this warfare is attributed to the "thorns," which can be employed by the flowers as a kind of instrument (weapon).

Above, the hypothesis was advanced that kinds and uniques are established in the permanent registry of discourse, which qualifies them as potential DISCOURSE REFERENTS. In a generic text conceived of as a script, a further factor comes into play. All entities involved in a script (actors, instruments, locations) are in actual fact "textually established" at a certain point in the text. They are, as it were, also additionally anchored in a temporary registry, just like those introduced in the course of a story about particular events and particular objects.

This has clear consequences both for a QUALITY-marking language such as English and for DISCOURSE REFERENT-marking languages such as French, Greek, and Hungarian. In English, where the conditions for definite marking (i.e. uniqueness) are by far more rigorous and mainly valid in the S-T CONCRETE/OBJECT domain, they are met--in analogy to the latter domain--in the S-T ABSTRACT/QUALITY domain, as well. That is to say, kinds which are textually established within a generic script may be considered to be identifiable entities fulfilling the requirement of uniqueness, and, consequently, they may be marked with a definite article. The effect in the DISCOURSE REFERENT-marking languages under consideration is such that the basic asymmetry between the first two arguments (here, as a rule, between the subject and the object) of a two-place verb is cancelled out. Even though familiar kinds are established in a permanent registry, this does not imply that they always automatically appear as DISCOURSE REFERENTS. This happens only when they are constructed as TOPICS, and TOPICS tend to be confined to a single argument. At least in the languages under consideration here, second-highest-ranking arguments are in opposition to the highest-ranking arguments in their tendency to be presented as NON-DISCOURSE REFERENTS when interpreted as QUALITIES (with the above-mentioned exception of a language-specifically restricted group of verbs such as "love" or "hate"). Since the distinction between an S-T ABSTRACT and an S-T CONCRETE interpretation typically remains formally unspecified in the noun phrases, the well-known effect of ambivalence between a "nonspecific" and a "generic" reading arises. To put it more simply: DISCOURSE REFERENT-marking languages often behave like English in not showing a difference in the realization of an object depending on whether the verb conveys a particular event (I am eating fish/berries), a habitual event (I eat fish/berries), or a kind-characterizing (habitual) event (Bears eat fish/berries). Once they are established in a generic text, however, second-highest-ranking arguments such as objects may also be presented as DISCOURSE REFERENTS, that is, they may be marked with a definite article. (33)

Let us illustrate what we have said so far with some examples. Examples (34) and (35) are taken from the above-mentioned "geographer script," in which the explorer appears as a secondary participant. In (34) (the understood subject of which is the geographer), he is referred to in all five languages by means of a definite noun phrase, even though he is expressed as the syntactic object of the sentence. (34) In the second example (35), we likewise have definite marking in all languages, except Greek, where an anaphoric pronoun refers to a definite noun phrase in an earlier part of the text. The other four languages even use the definite singular. Though all five languages employ the plural as the unmarked number value with human kinds, the shift from plural to singular seems quite unproblematic in this context. This is a further characteristic feature of generic scripts. By the use of the singular, the individuality of abstract figures such as "the geographer" and "the explorer" is highlighted in analogy to stories about particular geographers and particular explorers.

(34) a. But he receives the explorers [DEF, PL] [in his study].

b. GER: Aber er empfangt die Forscher [DEF, PL].

c. FR: Mais il y recoit les explorateurs [DEF, PL].

d. GR: Dhexete omos tus ekserevnites [DEF, PL].

e. HUN: Fogadja azonban a felfedezoket [DEF, PL].

(35) a. One waits until the explorer [DEF, SG] has furnished proofs, before putting them down in ink.

b. GER: Um sie mit Tinte aufzuschreiben, wartet man, bis der Forscher [DEF, SG] Beweise geliefert hat.

c. FR: On attend, pour noter a l'encre, que l'explorateur [DEF, SG] ait fourni des preuves.

d. GR: Ki epita, otan ekini ('those') [PRO] ferun apodhiksis, tis kataghrafun oles me melani.

e. HUN: Ahhoz, hogy tintaval jegyezzek fel, megvarjak, mig a felfedezo [DEF, SG] bizonyitekokat szolgaltat.

The following two examples ([36] and [37]) are drawn from the script about "the warfare between the sheep and the flowers," in which "the thorns" are textually anchored as an important instrument employed by the flowers. In the first sentence of each example in (36), all five languages refer to them with a definite plural phrase. The same is true of "the flowers" in (37).

(36) a. The thorns [DEF, PL] are of no use at all. Flowers [O, PL] have thorns [O, PL] just for spite!

b. GER: Die Dornen_i [DEF, PL], die haben gar keinen Zweck, die Blumen [DEF, PL] lassen sie_i [PRO] aus reiner Bosheit wachsen! (lit. [2nd clause] 'the flowers grow them out of pure spitefulness')

c. FR: Les epines [DEF, PL], ca ne sert a rien, c'est de la pure mechancete de la part des fleurs [DEF, PL]! (lit. [2nd clause] 'it is of pure spitefulness on the part of the flowers')

d. GR: T' angathja [DEF, PL] dhen ofelun se tipota, ine kathari kakia ton luludhjon [DEF, PL]. (lit. [2nd clause] 'they are/it is pure malice of the flowers'; anaphoric reference triggered by the verb form, i.e. without free proform)

e. HUN: A toviseknek [DEF, PL] semmi hasznuk, a tovis [DEF, SG] puszta komiszsag a virag [DEF, SG] reszerol! (lit. [2nd clause]: 'the thorn is pure malice on the part of the flower')

(37) a. The flowers [DEF, PL] have been growing thorns [O, PL] for millions of years.

b. GER: Es sind nun Millionen Jahre, dass die Blumen [DEF, PL] Dornen [O, PL] hervorbringen.

c. FR: Il y a des millions d'annees que les fleurs [DEF, PL] fabriquent des epines [PART, PL].

d. GR: Ekatommiria xronja ta luludhja [DEF, PL] eftjaxnan angathja [O, PL].

e. HUN: Millio eve gyartjak a viragok [DEF, PL] a toviseket [DEF, PL].

Unlike in an episodic text, where the textual introduction of a specific participant has consequences for the use of determiners throughout the rest of the text in that all subsequent mentions require the definite article, this does not hold for a generic text (cf. Section 3). Here, it is apparently possible to return, without difficulty, to the default encoding for kind-reference which would be chosen in a generic statement uttered in isolation. It is safe to assume that the default encoding is zero for all arguments in English and the definite article for the highest-ranking argument in the other languages. As for the second-highest-ranking argument, French uses the so-called "partitive" form, while Greek and Hungarian use the bare forms as the default form. Thus, in spite of their being textually established, English chooses the default form (bare plural: flowers) to refer to "flowers" in the second sentence in (36). In the same way, all mentions of "thorns" but one in (37) appear in the respective default form in the syntactic function of object. The exception is Hungarian: here, textual relevance is valued more highly, and, therefore, we find a definite form (a toviseket) rather than a bare one. The difference to be seen in the realization of "thorns" in the second sentence in (36) is also noteworthy. Whereas English--as usual--employs the bare plural, we find a definite pronoun in German (though not in conjunction with a verb of possession as in English, but with the predicate wachsen lassen 'grow [trans.]'). Hungarian opts for nominal resumption (cf. Section 3), shifting from a plural form (a toviseknek) in the first sentence (35) to a singular form (a tovis) in the second sentence in (36), while the latter is realized as a TOPICAL subject. For inanimate entities, the definite singular is indeed the unmarked generic form in Hungarian, which one would use in the isolated utterance of a generic statement. It should be added that Greek also exhibits anaphoric reference; this, however, is ambiguous between reference to the entire situation expressed in the first sentence (as in the French sentence) and reference to the "flowers" (as in the German sentence).

Someone who produces a generic text may, in principle, choose between these two alternative strategies: he may either adjust his generic statements to the text structure or opt for a more universal formulation independent of the respective text structure. And speakers may also shift between an "anaphoric" and a "nonanaphoric" encoding strategy when referring to the same kind within a single generic text. It is not surprising therefore that we find a considerable amount of variation in the encoding of generic participants. This is particularly obvious in German, which represents a mixed type between a QUALITY-marking and a DISCOURSE REFERENT-marking language. To illustrate this, examples (38)-(42) are presented below. The noun phrases underlined in these examples refer to kinds already established as participants of a generic text.

(38) a. Then it follows that they also eat baobabs [O, PL]? (they = 'sheep')

b. GER: Dann fressen sie doch auch Affenbrotbaume [O, PL]?

c. FR: Par consequent ils mangent aussi les baobabs [DEF, PL]?

d. GR: Epomenos tha trone ke ta baobab [DEF, PL]?

e. HUN: Szoval megeszik a majomkenyerfakat [DEF, PL] is?

(39) a. It is true, isn't it, that sheep [O, PL] eat little bushes [O, PL]?

b. GER: Es stimmt doch, dass Schafe [O, PL] Stauden [O, PL] fressen?

c. FR: C'est bien vrai, n'est-ce pas, que les moutons [DEF, PL] mangent les arbustes [DEF, PL]?

d. GR: Ine alithja, dhen ine etsi, oti ta provata [DEF, PL] trone tus thamnus [DEF, PL]?

e. HUN: Mondd, csakugyan igaz, hogy a baranykak [DEF, PL] lelegelik a bokrokat [DEF, PL]?

(40) a. <<We do not record flowers [O, PL],>> said the geographer.

b. GER: <<Wir schreiben die Blumen [DEF, PL] nicht auf>>, sagte der Geograph.

c. FR: Nous ne notons pas les fleurs [DEF, PL], dit le geographe.

d. GR: Dhen simjonume ta luludhja [DEF, PL], ipe o jeoghrafos.

e. HUN: Viragokkal [O, PL] nem foglalkozom--mondotta a foldrajztudos. (lit. 'With flowers, I do not occupy myself.'; contrastive interpretation implicating: 'with other things, I do')

(41) a. I hunt chickens [O, PL]; men [O, PL] hunt me.

b. GER: Ich jage Huhner [O, PL], die Menschen [DEF, PL] jagen mich.

c. FR: Je chasse les poules [DEF, PL], les hommes [DEF, PL] me chassent.

d. GR: Egho kinigho kotes [O, PL], i anthropi [DEF, PL] kinighane emena.

e. HUN: En a tyukokra [DEF, PL] vadaszom, az emberek [DEF, PL] ram vadasznak.

(42) a. Children [O, PL] should always show great forbearance toward grown-up people.

b. GER: Kinder [O, PL] mussen mit grossen Leuten [O, PL] viel Nachsicht haben.

c. FR: Les enfants [DEF, PL] doivent etre tres indulgents envers les grandes personnes [DEF, PL].

d. GR: Ta pedhja [DEF, PL] ofilun na dhixnun epikia pros tus [DEF, PL].

e. HUN: A gyermekeknek [DEF, PL] [DATIVE] nagyon turelmeseknek kell lenniok a felnottek irant [DEF, PL].

English has zero-marking here throughout (in all syntactic positions), that is, it is not sensitive to the text structure. French, Greek, and Hungarian--with two exceptions ([40e] and [41d] (36))--employ the definite article not only in subjects, but also in "direct objects" and other oblique arguments. That is to say, textual relevance is generally taken into account here. In (38) and (39), German patterns with English (bare forms for the object in [38] and both for the object and the subject in [39]). By contrast, the object ("flowers") in (40), which is to be interpreted habitually, is expressed in German by a definite phrase. In (41), German exhibits the behavior that DISCOURSE REFERENT-marking languages show in cases in which a generic sentence is uttered in isolation: definite-marking on the subject and zero-marking on the object. Finally, with respect to the marking of the subject phrase and the prepositional phrase in (42), German acts once again like a QUALITY-marking language. The overall picture that emerges may be summarized as follows: only relative predictions can be made for the use of definite marking in the context of a generic text. These run along two hierarchies:

(43) a. The hierarchy of generic language types: DISCOURSE REFERENT-marking language > mixed language (DISCOURSE REFERENT-marking and QUALITY-marking language) > QUALITY-marking language

b. The hierarchy of syntactic realizations: SUBJECT > DIRECT OBJECT > OBLIQUE (37)

When differences are encountered in the marking of translation equivalents (definite vs. zero), we may expect that the language which uses a definite article is located higher in the language hierarchy. In turn, when different markings are encountered in one and the same sentence in a single language, we may expect that the definitely-marked phrase is the one that occupies a higher place in the hierarchy of syntactic realizations.
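The two relative predictions just stated can be made concrete in a short Python sketch. All identifiers below are hypothetical and introduced only for illustration; the classification of the five languages follows the typology proposed in this article, and the functions encode expectations (tendencies), not exceptionless rules.

```python
# Hierarchies from (43); earlier in the list means higher in the hierarchy.
TYPE_HIERARCHY = ["DISCOURSE_REFERENT", "MIXED", "QUALITY"]
SYNTAX_HIERARCHY = ["SUBJECT", "DIRECT_OBJECT", "OBLIQUE"]

# Language classification as proposed in the article.
LANGUAGE_TYPE = {
    "English": "QUALITY",
    "German": "MIXED",
    "French": "DISCOURSE_REFERENT",
    "Greek": "DISCOURSE_REFERENT",
    "Hungarian": "DISCOURSE_REFERENT",
}


def higher(a, b, hierarchy):
    """Return whichever of a and b ranks higher, or None if they tie."""
    rank_a, rank_b = hierarchy.index(a), hierarchy.index(b)
    if rank_a == rank_b:
        return None
    return a if rank_a < rank_b else b


def expected_definite_language(lang_a, lang_b):
    """Given differing markings in translation equivalents, return the
    language expected to use the definite article (None: no prediction)."""
    winner = higher(LANGUAGE_TYPE[lang_a], LANGUAGE_TYPE[lang_b], TYPE_HIERARCHY)
    if winner is None:
        return None
    return lang_a if LANGUAGE_TYPE[lang_a] == winner else lang_b


def expected_definite_position(pos_a, pos_b):
    """Given differing markings within one sentence, return the syntactic
    realization expected to carry the definite article."""
    return higher(pos_a, pos_b, SYNTAX_HIERARCHY)


print(expected_definite_language("English", "French"))
print(expected_definite_position("DIRECT_OBJECT", "SUBJECT"))
```

Because the predictions are only relative, the sketch returns no prediction for two languages of the same type, which is the situation in (41), where Greek and Hungarian differ in the marking of the object.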

5.2.4.3. Some complicated cases. Not all cases of (inter- and intralinguistic) determiner variation may be explained in terms of the typological difference between QUALITY-marking and DISCOURSE REFERENT-marking or by reference to textual factors or to other regularities following cross-linguistically valid hierarchies. We have already seen that there often are language-specific conditions that restrict the choice of determiners on a constructional or lexical basis, especially in the case of singular phrases (based on SUBSTANCE nouns and used as ATTRIBUTES), but also sometimes with plural phrases (cf. Note 36). In addition, there are some quite complicated cases which defy a ready explanation. One such case is illustrated in example (44):

(44) a. Computations have been made by experts [O, PL] [NONSUBJECT, PASSIVE]. (With these pills, you save fifty-three minutes in every week.)

b. GER: Die Sachverstandigen [DEF, PL] [SUBJECT, ACTIVE] haben Berechnungen angestellt.

c. FR: Les experts [DEF, PL] [SUBJECT, ACTIVE] ont fait des calculs.

d. GR: I idhiki [DEF, PL] [SUBJECT, ACTIVE] exun kani ipolojismus.

e. HUN: A szakertok [DEF, PL] [SUBJECT, ACTIVE] pontos szamitasokat vegeztek.

Apart from English, all four languages mark the "experts" with a definite article. They are not established within a generic text and are not specific/definite either. The predicate ("make calculations") with the verb form in a past/perfect tense biases an S-T CONCRETE interpretation rather than an S-T ABSTRACT one. It is hardly to be understood as characterizing a habit of the "experts." The semantic implication is such that at least one particular expert must have existed for whom this predicate holds. This is supplemented by the pragmatic implication that the computations in question have most likely been made not by all relevant experts, but by a rather small subset of them (noninclusive interpretation). In short: a generic interpretation (at least a prototypical one) is out of the question. This is also supported by the syntactic realization found in English (by-phrase in a passive sentence). In spite of all this, the "experts" are constructed in the other languages as the definite subject of an active sentence. One may perhaps adduce another example, a German sentence with a bare plural subject ([45]), to shed some light on this problem.

(45) a. GER: Wissenschaftler [O, PL] haben fruher behauptet, dass Cholesterin der Gesundheit schadet.

'Researchers formerly claimed that cholesterol is detrimental to the health.'

i. 'a particular, identifiable group of researchers, distributively or collectively, claimed that ...'

ii. 'nonidentifiable groups of researchers claimed that ...; perhaps there was only one researcher who claimed over and over again that ...'

iii. 'researchers as a kind (i.e. the relevant subtype such as physicians) may be characterized by formerly taking the view that ...'

This sentence can be associated with at least three different semantic--pragmatic nuances. If we assume that the most important difference between a "specific" and a "nonspecific" interpretation is the presupposition of existence (which the latter cannot claim), the phrase Wissenschaftler ('researchers') in (45) would have a "specific" reading both on interpretation (i) and on interpretation (ii). This follows from the S-T CONCRETE bias of the predicate. Nevertheless, there is a difference between (i) and (ii), which is analogous to Donnellan's (1966) classification of definite phrases: only (i) but not (ii) is "specific" in the sense that the speaker has an identifiable group of particular individuals in mind. By contrast, interpretation (ii) is even distinguished by a certain transnumeral flavor, which it shares with the third, generic, interpretation. It is obviously for this third interpretation that one would most probably expect a definite article in German. In actual fact, however, the most likely interpretation of die Sachverstandigen 'the experts' in the previous sentence (44) corresponds to (ii), but not to (iii). The same holds true for the object of this sentence (Berechnungen 'calculations'), which is even in the scope of the subject. This constellation of two "nonspecific" (i.e. in the sense of [ii]) arguments of a transitive verb seems to be problematic for languages which are totally or partially DISCOURSE REFERENT-marking and, at the same time, exhibit a tendency toward asymmetric marking of the arguments (TOPIC/DISCOURSE REFERENT vs. ATTRIBUTE/NON-DISCOURSE REFERENT). (38) It might be supposed that, in such cases, higher-ranking arguments (subjects) with a (ii)-interpretation either take the canonical marking of (i)-interpretations (indefinite-specifics and thus prospective DISCOURSE REFERENTS and OBJECTS) or the canonical marking of (iii)-interpretations (generics and thus DISCOURSE REFERENTS and QUALITIES). Obviously, it is the latter that happens in example (44).

There is a further complication involved in the interpretation of statistical results such as presented in Figure 1. On the one hand, the statistics in Figure 1 neatly demonstrate certain basic differences such as the typologically relevant difference between QUALITY-marking and DISCOURSE REFERENT-marking languages. On the other hand, they obscure certain differences between the languages in question. For example, the fact that German is a mixed-type language, in which bare plurals and definite plurals appear as variants in the same (even prototypical) generic contexts, is not captured by the statistics. The relatively high percentage of definite plurals in German (46.9%) gives the impression that German basically functions like French, Greek, or Hungarian, where--in contrast to German--generic TOPICS have to be marked by a definite article. Moreover, it is not possible to see from Figure 1 that, in German, the bare plurals have an unmarked status outside generic texts and the definite plurals have a marked one. The marked status of definite plurals is indicated, among other things, by stylistic effects in certain contexts. For example, they may suggest the idea of a closed universe such as occasionally observed in discourse with children. It is fair to assume that such a stylistic effect may have been consciously employed given the present text genre and the topic of the story.

[FIGURE 1 OMITTED]

The high percentage of definite plurals in Hungarian shown in Figure 1 (50.4%) likewise appears to be somewhat deceptive. The situation is different here, however. In Hungarian, the definite plural does not occur as the marked variant of the bare plural, but--in some cases--as the marked variant of the definite singular. In the area of genericity, Hungarian is characterized by a "lexical split." By the term "lexical split," I will refer to the phenomenon that there is a basic difference either in the set of grammatical devices employed for encoding genericity or in the interpretation and markedness of such devices, which correlates with basic, lexically established properties. The lexical split in Hungarian is triggered by the animacy hierarchy, resulting in a difference between nouns denoting human entities and nouns denoting nonhuman entities. For nouns denoting human entities, both definite plurals and definite singulars are allowed, and the definite plural seems to be increasingly favored as the unmarked variant. With nouns denoting nonhuman entities, the opposite is true. Here the definite singular is unequivocally the unmarked variant, while the definite plural, if permitted at all, more or less has the semantic effect of personification. It is precisely this effect that arises in the Hungarian translation of Le petit prince, in which "volcanoes," "thorns," "sheep," "baobabs," "boa constrictors," etc., are generally expressed by a definite plural form when kind-reference is present. This effect is certainly intended since these are important participants in the generic scripts. By contrast, if we open a Hungarian biology textbook in which natural kinds are described, we will encounter the definite singular throughout. The definite plural is reserved for hyperonyms in a sort reading (plural: a macskak [lit. 'the cats' = 'felidae'], singular: a macska [lit. 'the cat' = 'felis silvestris forma catus']). 
Nevertheless, the Le petit prince corpus also contains some attestations where Hungarian is the only language that employs the singular instead of the plural (cf. [46]). Consider in particular (46b), where the singular form (a majomkenyerfa 'the baobab') was chosen even in the environment of a predicate such as rengeteg 'many'.

(46) a. HUN: Az oriaskigyo [DEF, SG] ragas nelkul, egeszben nyeli le zsakmanyat.

'Boa constrictors [O, PL] swallow their prey whole, without chewing it.'

b. HUN: Es ha a bolygo [DEF, SG] kicsi, a majomkenyerfa [DEF, SG] meg rengeteg ('many'), szetrepeszti a bolygot. 'And if the planet [DEF, SG] is too small, and the baobabs [DEF, PL] are too many, they split it in pieces.'

As noted in Section 2, there is an old controversy in the literature on how to distinguish, in English, between different generic constructions in terms of reference. According to a rather influential idea, definite singular phrases refer to the class (e.g. "kind") as a whole, while plural constructions allow reference to the members of the class. Investigating the difference between singular and plural generics in Hindi as compared with English generics, Dayal (1992) takes up this idea and arrives at the following conclusion (supposed to be valid for the two languages examined but tentatively also for other languages):

I believe that the only semantic difference between the singular kind and the plural kind is in their relation to objects, the singular kind "denotes the species itself" while the plural kind denotes the "members of the species", to use the words of Jespersen (1927). While their property sets are not very different, in some sense the singular generic is more abstract than the plural generic. Because of this, plural generics can be used as simple generalizations based on sufficiently many object level verifications. (Dayal 1992: 57)

It should be stressed again at this point that, contra Dayal, I take the view that generics, in principle, do not refer extensionally to existing members of a kind but instead always refer intensionally, pointing to the name of the kind. Expressed in terms of the framework used here, this means that generics are in principle associated with QUALITY rather than with OBJECT on the dimension of individuality. This does not preclude, of course, the existence of borderline cases between a QUALITY and an OBJECT interpretation, such as discussed above in the context of example (45). Moreover, individual languages may allow quite different associations with the formal difference between singulars and plurals in the domain of generics. In approaching this question it is certainly not unimportant whether or not the language in question has a grammaticalized mass/count distinction (Hungarian typically does not have such a distinction). Furthermore, it is important whether the language is a QUALITY-marking one, in which the construction "DEF/SG" plays a comparatively marginal role with the consequence that the bare plural forms clearly dominate in SHAPE nouns. (In the Le petit prince corpus, the frequency of the construction "DEF/SG" in English is, in terms of percentage, at least twice as low as in any other language; cf. Figure 1). In particular, however, I consider it incorrect to assume that it would be universally possible for the distinction between singular and plural forms (when used with generics) to correlate with how strongly the generalization expressed in the generic sentence is interpreted. In any event, the use of the definite singular in Hungarian does not imply stronger (more strongly verified and/or exceptionless) generalizations; this is the case neither with nonhuman-denoting nouns, where it is the unmarked form, nor with human-denoting nouns. (39)

5.2.4.4. Singular phrases. Discussing cross-linguistic differences within singular generics, I will distinguish between two kinds of variation. In the first case, there is a variation between definite and bare phrases (DEF/SG vs. O/SG), in the second case, between phrases containing an indefinite article and those containing either a definite article (instead of the indefinite one) or no determiner at all (IND/SG vs. DEF/SG or O/SG).

Let us first deal with cross-linguistic differences manifested in the use of definite singulars as opposed to bare singulars. The relevant examples here involve SUBSTANCE nouns, for example, nouns denoting materials or abstract entities. (40) The cross-linguistic distribution of definite marking and zero-marking in this area is almost parallel to that found within plural phrases. As such, it may be explained in terms of the typological difference between QUALITY-marking and DISCOURSE REFERENT-marking languages. Looking at classic generic uses of nouns such as "water" or "authority" (e.g. at uses in which such nouns are constructed as TOPICS rather than as ATTRIBUTES), we find the standard patterns of the respective language types throughout. English as a QUALITY-marking language requires zero-marking, while French, Greek, and Hungarian as DISCOURSE REFERENT-marking languages require definite marking (cf. [47] and [48]).

(47) a. Water [O, SG] may also be good for the heart ...

b. GER: Wasser [O, SG] kann auch gut sein fur das Herz ...

c. FR: L'eau [DEF, SG] peut aussi etre bonne pour le coeur ...

d. GR: To nero [DEF, SG] bori na 'ne eksisu kalo ke ja tin kardhja ...

e. HUN: A viz [DEF, SG] jot tehet a szivnek is ...

(48) a. Accepted authority [O, SG] rests first of all on reason [O, SG].

b. GER: Die Autoritat [DEF, SG] beruht vor allem auf der Vernunft [DEF, SG].

c. FR: L'autorite [DEF, SG] repose d'abord sur la raison [DEF, SG].

d. GR: I eksusia [DEF, SG], lipon, prepi na stirizete pano sti lojiki [DEF, SG].

e. HUN: A tekintely [DEF, SG] elsosorban az ertelmen [DEF, SG] nyugszik.

One particular difference in the generic marking of SUBSTANCE and SHAPE nouns, however, should be noted. For SUBSTANCE nouns, occurring in a generic text seems to have no (or only a marginal) effect on the choice of determiners: in English, SUBSTANCE nouns serving as TOPICS of generic sentences that are embedded in a larger text exhibit zero-marking throughout, just as they do in isolated generic statements. And in French, Greek, and Hungarian, the variation between definite and nondefinite marking of SUBSTANCE nouns when they are constructed as NONTOPICS is determined by lexical and other constructional factors rather than by textual ones. Besides this, it is German that deserves special attention in the context of generic marking of SUBSTANCE nouns. I have mentioned that German is a mixed-type language combining the QUALITY-marking and the DISCOURSE REFERENT-marking patterns. This holds for SHAPE nouns in the sense that plural generics (TOPICS) may appear either as bare phrases or as definite phrases, the choice between them being a matter of markedness and stylistics. Now, we would expect the same pattern to also obtain with SUBSTANCE nouns. However, we find a remarkable difference between the two important subtypes of SUBSTANCE nouns, namely between material-denoting nouns and nouns denoting abstract entities. It is only with abstract nouns that free determiner variation in a way comparable to that found in SHAPE nouns may be observed (cf. example [48b], which demonstrates the definite variant). In the case of material-denoting nouns, German prefers bare phrases (cf. [47b]) as strongly as English does. (41) Thus we can say that German exhibits a kind of "lexical split," making a principled difference between material-denoting nouns on the one hand and all other types of nouns on the other, behaving like a QUALITY-marking language with the former and like a genuine mixed-type language with the latter. (42)

Now, I will turn to cross-linguistic differences in the use of indefinite articles in generic phrases. Although all generic statements express "law-like" regularities in a certain sense, there are different ways in which this law-like character may apply. Researchers on genericity normally make a distinction between definitory generics, which state "essential" (and hence definitory) properties, and descriptive generalizations, which capture prototypical properties derived from empirical observations (cf. Burton-Roberts 1976, 1977; Krifka et al. 1995). Or they point to the difference between descriptive statements, which express physical or biological "laws," and normative statements, which express social norms (cf. Dahl 1975). From a cross-linguistic point of view, it seems necessary to work with at least three main types of generics as shown in (49a), (49b) and (49c), that is, distinguishing between definitory or metalinguistic and descriptive generics and also treating normative generics as an independent type rather than collapsing them with one of the other types (cf. Casadio and Orlandini 1991). The cross-linguistic investigation of indefinite determiners in generic noun phrases in particular suggests the usefulness of even finer subclassifications, especially within the area of descriptive generalizations. In this article, I propose paying attention to the difference between unrestricted characterizations (cf. [49b] [i]) and characterizations restricted by quantificational information or by explicit conditional structures (cf. [49b] [ii] and [iii]).

(49) a. Definitory/metalinguistic uses stating "essential" properties (A beaver is a mammal)

b. Descriptive generalizations in terms of "prototypical" properties:

i. Simple (nonrestricted/nonquantifying) characterizing statements (A boa constrictor is very dangerous)

ii. Characterizing statements with quantifying structures (expressing properties of the average member in terms of quantification) (An adult beaver can fell a tree ten inches in diameter in about six minutes)

iii. Characterizing statements with conditional structures (expressing potential properties which hold under certain conditions) (If a beaver breaks a tooth or otherwise distorts its bite, the incisors elongate, force open the mouth permanently and cause the animal to starve)

c. Normative uses (A soldier has to carry a gun, A priest shouldn't deal with drugs)

As mentioned above, among the languages investigated in this article, it is English that exhibits the widest range of contexts in which generics with an indefinite article can occur. It is the only language that can freely use an indefinite article in all the contexts listed in (49). Among the different uses of English "indefinite generics," it is the definitory one which is least felicitous with an indefinite article in the other languages. Uses such as A beaver is a mammal, as uttered in the course of ordinary communication or as information about the meaning of the word beaver, are entirely ruled out in Hungarian or Greek. Even in German, they are among the more marginal cases. It also seems likely that French "IND/SG" phrases have to be coupled with the topic construction ("x, c'est ...") in this context, but this remains open to further investigation. The Le petit prince corpus contains some metalinguistic attestations in which information about the meaning of words is asked for in the form of an interrogative sentence (cf. [50] and [51]). Note that only English uses an indefinite form here (a geographer in [50] and a rite in [51]) without inserting an additional demonstrative element in a construction such as "what is that/this x?" The other languages do employ such a construction, in combination with an indefinite form (French in both sentences and German in [50b]) or even in combination with a definite form (Hungarian in both sentences). Or they simply use zero-marking (Greek in both sentences and German in [51b]).

(50) a. <<What is a geographer [IND, SG],>> [asked the little prince].

b. GER: <<Was ist das >ein Geograph [IND, SG]<?>> (lit. 'What is that, a geographer?')

c. FR: Qu'est-ce qu'un geographe [IND, SG]? (lit. 'What is that what a geographer [is]?')

d. GR: Ti ine jeoghrafos [O, SG]? (lit. 'What is geographer?')

e. HUN: Mi az a foldrajztudos [DEF, SG]? (lit. 'What is that, the geographer?')

(51) a. <<What is a rite [IND, SG]?>> [asked the little prince].

b. GER: <<Was heisst >fester Brauch [O, SG]<?>> (lit. 'What does "custom" mean?')

c. FR: Qu'est-ce qu'un rite [IND, SG]? (lit. 'What is that what a rite [is]?')

d. GR: Ti ine jorti [O, SG]? (lit. 'What is feast?')

e. HUN: Mi az a szertartas [DEF, SG]? (lit. 'What is that, the rite?')

Concerning descriptive generics used to characterize the prototypical member of a kind, we observe a clear split between unrestricted generics and restricted ones. Unrestricted generic sentences having the form of simple assertions (such as illustrated by [2] above) are not felicitous in Hungarian and Greek. In both DISCOURSE REFERENT-marking languages, a definite article is used instead of an indefinite one in this context. French, though a DISCOURSE REFERENT-marking language as well, allows an indefinite article with an explicit topic construction as already mentioned. The remaining three contexts (i.e. [49b] [ii] and [iii], [49c]) are, in principle, compatible with "IND/SG" phrases in all five languages. In the Le petit prince corpus however, indefinite singulars are actually attested only in one of them in all languages, namely in conditionally restricted generic sentences (cf. [21] above). The corpus does not contain any normative generics and in the few examples of quantificational characterization, Hungarian still employs its unmarked generic form, rendering the indefinite article of the French original as a definite one (cf. [52]).

(52) a. <<A sheep [IND, SG],>> [I answered,] <<eats anything it finds in its reach.>>

b. GER: Ein Schaf [IND, SG] frisst alles, was ihm vors Maul kommt.

c. FR: Un mouton [IND, SG] mange tout ce qu'il rencontre.

d. GR: Ena provato [IND, SG] troi o, ti sinandisi brosta tu, tu apandisa.

e. HUN: A baranyka [DEF, SG] mindent megeszik, ami utjaba kerul.

It is worth noting that those three uses which are allowed in clear DISCOURSE REFERENT-marking languages such as Hungarian and Greek are exactly those uses that are judged as being the best in languages oscillating between QUALITY- and DISCOURSE REFERENT-marking such as German. For the sake of completeness a further commonality should finally be mentioned. It is well-known that English generics marked by an indefinite article cannot be combined with a "kind predicate" such as be extinct (cf. Krifka et al. 1995; cf. Note 11 in this article). It hardly needs to be mentioned that this holds true for all the other languages as well.
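The distribution of indefinite singular ("IND/SG") generics over the five contexts in (49), as described in this section, can be summarized in a small lookup table. The following is only a sketch: the labels ("free", "compatible", "marginal", "topic construction", "excluded") and all names are hypothetical shorthand for the prose judgments above, the French value for definitory uses is the tentative one, and the value for German unrestricted characterizations is left open because this section does not state it explicitly.

```python
# Contexts (49a), (49b i-iii), (49c).
CONTEXTS = ["definitory", "unrestricted", "quantifying", "conditional", "normative"]

# Shorthand for the judgments reported in this section (labels are hypothetical).
RESTRICTED_OK = {"quantifying": "compatible", "conditional": "compatible",
                 "normative": "compatible"}

IND_SG = {
    "English": dict.fromkeys(CONTEXTS, "free"),
    # None: not stated explicitly in this section.
    "German": {"definitory": "marginal", "unrestricted": None, **RESTRICTED_OK},
    # The definitory value for French is tentative ("open to further investigation").
    "French": {"definitory": "topic construction",
               "unrestricted": "topic construction", **RESTRICTED_OK},
    "Greek": {"definitory": "excluded", "unrestricted": "excluded", **RESTRICTED_OK},
    "Hungarian": {"definitory": "excluded", "unrestricted": "excluded", **RESTRICTED_OK},
}


def plainly_compatible(context):
    """Languages in which a plain IND/SG generic is reported as possible
    (freely or in principle) in the given context."""
    return sorted(lang for lang, row in IND_SG.items()
                  if row[context] in ("free", "compatible"))


for context in CONTEXTS:
    print(context, plainly_compatible(context))
```

The table records compatibility in principle, not corpus attestation: in the Le petit prince corpus, indefinite singulars are attested in all five languages only in the conditional context.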

5.2.5. Statistical evaluation of kind-referring phrases in subject position.

Discussing similarities and differences in generic marking, I proceeded from the statistical evaluation of the kind-referring phrases in the Le petit prince corpus in Section 5.2.1. In this first statistical evaluation (cf. Figure 1), I included phrases in all syntactic positions except PREDICATES when they could be interpreted as referring to kinds. That is, I considered both TOPICS and ATTRIBUTES. Recall, however, that the prototypical generic phrase has been defined as a phrase displaying the feature values {TOPIC, DISCOURSE REFERENT, S-T ABSTRACT, QUALITY}. In addition, we have seen above that the difference between TOPIC and ATTRIBUTE uses is highly relevant to the choice of grammatical devices, in particular in DISCOURSE REFERENT-marking languages. I therefore wondered whether the statistical picture would change if I carried out the statistical evaluation on that subset of phrases considered in Figure 1 that are realized as subjects. In all five languages subjects constitute good candidates for TOPICS. The results of our second statistical evaluation are presented in Figure 2 (the absolute number of tokens considered ranges between 97 and 114, depending on the language).

The results are indeed quite impressive. Bare forms as subjects are practically absent in all three DISCOURSE REFERENT-marking languages (French, Greek, and Hungarian). The few exceptional cases, leading to a percentage of 1.8% for bare singulars in French and to a percentage of 1% for bare plurals in Hungarian, can clearly be identified as fossilized or as constructional specialities. (43) In the QUALITY-marking language English, by contrast, restriction to subject occurrences even results in a slight rise of bare plurals from 29.9% to 32.7%. It can even be assumed that the percentage would have been higher yet were it not for the large number of generic scripts. The change in conditions also brings out the relative dominance of the definite singular in Hungarian far more clearly: if one considers only subject occurrences, the proportion of Hungarian definite singular phrases (33%) is approximately ten percentage points higher than in the other DISCOURSE REFERENT-marking languages. It is thus almost identical to the proportion of bare plurals in English. In French and Greek, in contrast to Hungarian, the percentage of definite plural phrases rises most significantly. This reflects the intuition that, in these languages, the definite plural generally constitutes the default construction for SHAPE nouns. Finally, the exclusion of ATTRIBUTES results in a slight decrease of indefinite singular forms in Hungarian (down to 4.1%), since this also excludes, for example, constructions of comparison, in which the use of an indefinite article is permitted.

6. Summary

In this article, genericity was investigated from a cross-linguistic perspective. After discussing methodological and theoretical problems arising in cross-linguistic research on genericity, I presented a multidimensional approach, arguing that it is necessary to factor apart different types of information involved in the notion of genericity. It was claimed that there is a prototypical configuration of values on the relevant dimensions universally characterizing generic expressions: they are TOPICS, they represent DISCOURSE REFERENTS, they are associated with S(PATIO)-T(EMPORALLY) ABSTRACT contexts, and they have a QUALITY reading rather than an OBJECT one.

In the empirical part of the article, this multidimensional approach was employed to investigate generic marking and interpretation in five languages (English, German, French, Hungarian, and Greek). The investigation was mainly carried out on the basis of a multilingual corpus (the novel Le petit prince by Antoine de Saint Exupery and its translations). As a result, a fundamental typological difference between QUALITY-marking languages (such as English) and DISCOURSE REFERENT-marking languages (such as French, Hungarian, Greek) was introduced. I argued that proceeding from the most frequent and unmarked type of grammatical device for marking generic meaning on noun phrases (zero determiner in English and definite article in the other languages), it is possible to classify languages according to which semantic/pragmatic aspect they highlight as the most relevant aspect of genericity. QUALITY-marking languages highlight the fact that the referents of generic noun phrases are not to be considered as individual entities (OBJECTS) but as abstract representatives of certain properties (QUALITY) characterizing kinds. DISCOURSE REFERENT-marking languages, in turn, highlight the fact that the referents of generic noun phrases may be conceived of as DISCOURSE REFERENTS that are permanently established in the discourse by being part of our world knowledge. The difference between QUALITY-marking and DISCOURSE REFERENT-marking languages has a number of ramifications, which go beyond the grammatical marking of generic noun phrases. In particular, this is manifest in the dominant type of ambiguity between generic and nongeneric interpretations.

A further novel aspect of the work on genericity carried out here is that kind-reference was investigated on the text level rather than merely in isolated sentences. I demonstrated that, against common belief, textual factors may influence grammatical marking also in the case of generics. Moreover, the kind of influence depends on the generic type of a language. Finally, I pointed to the relevance of some (cross-linguistic or language-specific) hierarchies in determining encoding of genericity, in addition to typological variation or to textual factors. These are, among others, the hierarchy of arguments and that of syntactic functions or the animacy hierarchy. It can be assumed that the typological picture that has emerged has essential theoretical consequences in that it relativizes the dominant status of the English pattern by identifying it as a representative of a peculiar type, the QUALITY-marking type.

Received 14 August 2001

Revised version received 18 June 2003

University of Cologne

Notes

(1.) This article is based on research carried out in the framework of a project on "Lexical Typology" funded by the Deutsche Forschungsgemeinschaft (German Research Society) under grant number Sa 198/14. Correspondence address: Allgemeine Sprachwissenschaft, Institut fur Linguistik, Universitat zu Koln, D-50923 Koln. E-mail: leila.behrens@uni-koeln.de.

(2.) There are considerable differences in the semantic analysis of genericity in English. Some linguists (e.g. Burton-Roberts 1977) consider only sentences which contain an indefinite singular subject and a predicate expressing an "essential" (i.e. "definitory") property of the subject as "generic sentences proper." By contrast, Krifka et al. (1995) have called the generic status of the indefinite singular in sentences such as (1c) into question. They suggest the definite singular to be the prime example of kind-referring phrases because this is the only construction that can be used in contexts which force a kind-referring interpretation (i.e. with "kind predicates" such as be extinct) and, at the same time, cannot be used in contexts which also tolerate an object-referring interpretation (i.e. in the case of "nonestablished" or "ad hoc" kinds such as green bottle). And finally, some other linguists such as Declerck (1991) point out that the most typical type of generic marking in English is the bare plural, because in the case of bare plurals (as in [1b]), the unmarked interpretation is a generic one, whereas in the case of definite or indefinite singular (as in [1a] and [1c]), it is a nongeneric reading. The question of how to deal with such controversies in language-specific analyses will be touched on in Section 2. Here, it may suffice to note that all three constructions (definite singular, indefinite singular, and bare plural) will be considered as possible types of English generic phrases in this article (cf. Lyons 1977; Carlson 1977b).

(3.) Indeed, I know of no language that would fit such a description one hundred percent. What comes to mind in this connection are languages possessing an elaborate article system such as Bavarian. Bavarian has two definite articles (strictly speaking: two paradigms of the definite article) which are complementarily associated with a generic and an anaphoric use (cf. Scheutz 1988; Kolmer 1999). One still cannot maintain even for Bavarian, however, that the definite article used with generic mentions is a unique marker of genericity in the sense that it is used necessarily and exclusively with a generic interpretation. Another possible counterexample comes from sign languages. Perniss (2001), for instance, points to the existence of an exclusive generic marker in German Sign Language.

(4.) Since the focus of interest in this article lies on generic marking and interpretation of nominal expressions, only ambiguity phenomena in the area of noun phrases will be addressed. For a cross-linguistic investigation of generic meaning indicated on the predicate, and hence, for ambiguity of verb forms appearing in certain tenses or aspects, the reader is referred to Dahl (1985).

(5.) Note that I am not claiming that the semantic distinction between definite and indefinite determiners or between singular and plural will be totally neutralized in the generic domain, in the sense that there is no context in which any semantic distinction between the opposite forms in question could be observed. Such a suggestion has sometimes been made in the literature on German generics (cf. Vater [1979: 62]; Oomen [1977: 19ff.]; cf. also Chur 1993 for critical comments). This, however, would mean that the contrasting forms are intersubstitutable in all possible generic contexts, which is, of course, empirically not tenable, either in German or in any other language.

(6.) In English, number switching is even possible between a generic singular antecedent and an anaphorically referring plural pronoun, as shown in the following example taken from the British National Corpus: Given good conditions a goldfish will live for 10-20 years. In occasional cases they may live for over 40 years. There is probably one type of nongeneric context where number switching is cross-linguistically allowed, namely in sentences containing collective nouns. However, it is normally not possible to change the number of the collective noun itself (e.g. to switch between (a) herd and herds) without an obvious change of reference.

(7.) A major part of the examples adduced in this article are taken from this book. I will refrain from giving a complete morphological translation since I do not consider it necessary given the present topic. In each case I will give only the English translation and highlight the relevant generic phrase with underlining. For the generic phrases the relevant features will be added in brackets. In cases where constructional differences between the translation equivalents substantially contribute to differences in semantic interpretation, this will of course be noted.

(8.) It is necessary to make some remarks concerning the use of translation material in cross-linguistic investigation. One anonymous referee of this article objected that we probably could not decide on the basis of a translated form whether it is the only possible form in the language in question or one of several alternatives, even if we generally assume that the translator used the "best" (i.e. most idiomatic) translation of the original French. (S)he also wondered whether it could be the case that in translating a definite form from the French, one translator (the English one) felt free to change it into an articleless form, while another translator (e.g. the Hungarian one) always tried to copy the original definite forms. I think it is possible to exclude this uncertainty on the basis of a careful analysis of the whole material. I analyzed the class of examples showing deviation from the original for every language and compared it with the class of examples in which a corresponding form is used. I then checked these data against the judgments of native speakers, testing whether they would use the translated form, and also testing whether and in which context alternative forms could be used. The results were relatively clear. In cases in which a form corresponding to the original was rejected (for the sentence in question and for the type of context in which this sentence occurred), the translators always used a different form (e.g. a definite form in Hungarian for an indefinite form in French as in [2], or a bare plural in English for a definite plural in French). In those cases where alternating forms were accepted, I typically found a variation when I considered all sentences manifesting the same type of context in which the sentence in question occurred. For instance, we find a variation between definite plurals and bare plurals for generic topics in the German text (cf. examples in Section 5.2.4.2 below).
In contrast, there is no such variation in Hungarian and Greek, in that a bare form is never used. This corresponds to the fact that native speakers reject the bare plural in these contexts as well. Therefore, in such cases we may interpret the lack of variation as evidence against the influence of the original and against the existence of possible alternatives which would have a preferred status.

(9.) The symbol "Ø" is used to indicate the lack of determiners in article languages.

(10.) Behrens (2000) contains a more detailed description of the questions controversially discussed in works on genericity. Readers interested in the basic issues in the treatment of English generics are advised to consult The Generic Book, edited by Carlson and Pelletier (1995), particularly the introductory article in this volume (Krifka et al. 1995) and Chur (1993), which also offers a detailed comparison of different approaches, as well as Burton-Roberts' (1976, 1977), Carlson's (1977a, 1989), and Declerck's (1986, 1987, 1991) articles, and, of course--as an excellent initiation into the subject--Lyons (1977).

(11.) For example, Burton-Roberts' (1976, 1977) semantic description of indefinite generics in English (they are said to be abstract concepts having an underlying metapredicative structure which can be paraphrased as "to be an x") fails in a number of languages that use phrases containing a definite rather than an indefinite article in this generic context. Declerck's (1991: 94-95) description of the core meaning of English indefinite generics (which can be roughly paraphrased as "take any one (relevant) member of the kind x and you will see that ...") is more suitable for certain languages, but is not capable of grasping subtle differences.

(12.) Manfred Krifka (1987) has made a proposal for a more abstract delimitation of indefinite generics ("i-generics") and definite generics ("d-generics"). He proposes different testing procedures for identifying these two generic types: when occurring as subjects, "d-generics," but not "i-generics," can be combined with kind predicates (e.g. be extinct) or predicates expressing an accidental property (e.g. be popular in the case of madrigal). In turn, the fact that a noun phrase which does not refer to a "well-established kind" can be combined with a characterizing predicate in a generic sentence is regarded as a sign of the "i-genericity" of this noun phrase. The results of such tests produce a potential cross-classification of generic construction types in a language: for German and English, for example, it would turn out that the bare forms (SG/PL) are both "d-generics" and "i-generics," whereas forms with the definite article (SG/PL) are to be ranked only as "d-generics" and forms with the indefinite article only as "i-generics." Such tests may certainly be used in any language. But the question remains whether the semantic differences they are used to test universally correspond with a formal distinction between definite and indefinite articles, as they at least partly do in English and German. Interestingly, Krifka et al. (1995: 4, fn. 3) criticize the terms "d-generic" and "i-generic" as misleading, among other things, because there is no such universal correspondence. It should probably also be noted that several authors in the same volume (e.g. Link 1995) still use these terms, although they do so independently of the formal type of the determiner, solely on the basis of test behavior.

(13.) This idea is explicitly expressed in Carlson (1979: 65): "Bare plural NP's will be treated as definite descriptions of a very special sort."

(14.) Here, I will assume that a cigar is not a generically-used phrase.

(15.) Not every language has a clearly defined word class whose members can justifiably be called "nouns" in the traditional sense of this term. However, every language seems to have lexical elements which may be used to refer to specific objects and to kinds. For the sake of simplicity, we will adhere to the term "noun."

(16.) Langacker (1991), for instance, assumes a cognitively motivated asymmetry between generic and nongeneric uses (i.e. "type" vs. "instance" uses in his terminology) in terms of a difference between what he calls "primary" and "nonprimary" domains of instantiation. For nouns denoting perceivable objects, space is said to be the primary domain, so that nongeneric uses allowing reference to particular, spatially bound objects are claimed to be cognitively prior to generic uses. There is a philosophical tradition proceeding from a comparable priority of "particulars" over "universals" (cf. Searle 1969). I do not want to deny that one may find philosophical or cognitive arguments in favor of the hypothesis of the priority of nongeneric uses. I do not believe, however, that it could be possible to support this claim by universally valid linguistic evidence.

(17.) Krifka (1995: 399) makes the claim that kinds are ontologically prior to specimens and linguistically basic in that "every language which allows for bare NPs at all uses them as expressions referring to kinds." The second assumption, which Krifka adopts from Gerstner-Link (1988), is empirically problematic. Hungarian could be a good counterexample. It allows bare singulars (for nonspecific uses) but strongly requires definite marking for prototypical generics, for example, generic subjects. Krifka himself tries to substantiate his claim with data from Chinese, where bare forms (forms without any classifier and identical to lexical stems) may be used for referring to kinds such as "bear." However, as in the case of English mass nouns such as gold, I will insist here as well on the difference, in principle, between lexical elements (which have no reference at all) and bare phrases. Incidentally, bare phrases in Mandarin Chinese can be interpreted not only generically but also as specific/definite (cf. Matthews and Pacioni 1997; for genericity in Vietnamese, cf. Behrens 2000).

(18.) Zwicky and Sadock (1975: 3, fn. 9), for instance, use the expression "understanding" as a neutral term to "cover those elements of 'meaning' (in a broad sense) that are coded in semantic representations and those that do not."

(19.) In European languages, TOPICS are normally realized as grammatical subjects. However, there are some exceptions, for example, experiencer arguments of psych-verbs are usually constructed as oblique phrases in some languages, which nevertheless appear as TOPICS. Furthermore, not all subjects in European languages should automatically be considered to be TOPICS but rather only subjects of "categorical utterances" (cf. Sasse 1987). Subjects of "thetic utterances" receive the feature specification ATTRIBUTE. Moreover, in specific cases, what may be called "secondary" TOPICS are also admitted (cf. Behrens and Sasse 2003).

(20.) Note that the term "non-discourse referent" is to be read as "not a discourse referent" rather than as "a referent unrelated to discourse."

(21.) Note that Kuno's distinction between referents listed in the "permanent registry" of discourse and those listed in the "temporary registry" of discourse is not identical with Karttunen's distinction between "permanent discourse referents" (introduced in referentially transparent contexts) and "short-term discourse referents" (introduced in referentially opaque contexts). On the contrary, DISCOURSE REFERENTS that are part of the "permanent registry" of discourse according to Kuno do not have to be introduced into the discourse at all.

(22.) Of course, the idea underlying the concept of "discourse referent" has also come to be known as the "familiarity theory" of definiteness. The following points must therefore be stressed. First, I consider the distinction between DISCOURSE REFERENTS and NON-DISCOURSE REFERENTS to be universally relevant. I do not assume, however, that the use of formally definite expressions in every individual language can be made completely predictable on the basis of a familiarity theory. Second, the term "familiar" should not be taken in its everyday sense. One anonymous referee of this article correctly noted that, in that sense, not even DISCOURSE REFERENTS that were introduced in the preceding sentence (e.g. as in Once upon a time there was a king ...) would be said to be "familiar." Another referee pointed to a more serious problem in the context of the present investigation. (S)he raised the question of how to deal with complex nominal phrases (e.g. professors who need to teach too many classes) that are combined with a characterizing predicate in a generic statement. Indeed, it seems doubtful that such generic TOPICS would have a "familiar" referent in whatever sense of this term. I think the problem is related to another well-known problem which concerns the distinction between "well-established kinds" and "ad-hoc kinds." I would maintain that prototypical generics refer to "well-established kinds," which are--in that sense--familiar to the speech act participants. It is true that reference to "unfamiliar" ad-hoc kinds is also possible, but it is often subject to language-specific restrictions regarding the grammatical devices allowed to be used. Furthermore, it could be the case that complex ad-hoc generics are mostly found within generic texts which elaborate as a whole on a "well-established kind" (e.g. about professors) a subtype of which the complex generics express. Finally, I would like to point to an interesting semantic effect in informant reports.
When I tested informants on complex generics occurring in a grammatical environment that is the standard way of expressing prototypical generics in the language in question and that, in the particular case, excludes a nongeneric interpretation (e.g. definite article combined with a characterizing predicate in Hungarian or Greek), they reported having the impression that the generic statement was being made about something of particular relevance that they probably should know.

(23.) Considering English, there is a striking correlation between QUALITY value and mass phrases on the one hand, and OBJECT value and count phrases on the other. QUALITY uses exclude count determiners and quantifiers by definition and the lack of quantification in the case of QUALITY uses may be indicated by the total lack of determiners and quantifiers, for example, by singular mass phrases or by bare plural phrases, which share a number of commonalities with singular mass phrases. However, it would not be correct to say that QUALITY value universally maps onto mass phrases and OBJECT value onto count phrases. First, not every language has a grammaticalized mass/count distinction on the phrasal level similar to that found in English (cf. Behrens 1995; Behrens and Sasse 2003). Second, there exist phrasal types even in English that are formally neutral with respect to the mass/count distinction. The definite singular is a paradigm case here. In the current framework, a definite singular may take both a QUALITY and an OBJECT value, which is determined by the semantic interpretation in the sentence (i.e. by the existence of transnumerality effects). For example, the subject of the first sentence in (12b) (train) could also be constructed with a definite article; searching the internet, I have actually found attestations for both of them in the very same environment (... is a good way to travel). In the context of (12b), I analyze both the articleless phrase and the definite phrase as manifesting QUALITY value on the dimension of individuality.

(24.) The value names S-T ABSTRACT and S-T CONCRETE are abbreviations of SPATIOTEMPORALLY ABSTRACT and SPATIOTEMPORALLY CONCRETE. The distinction between these two values of the dimension of spatiotemporal location is a matter of sentence or utterance context and should not be confused with the common distinction between "abstract" and "concrete" nouns, which applies to the lexically determined meaning of the head nouns. According to its lexical meaning, bicycle, for instance, is certainly a "concrete" noun, that is, a noun which denotes a first-order entity. The phrases containing bicycle, in example (13) below, however, are assumed to have an S-T ABSTRACT interpretation. This is because phrasal meanings may also be determined by factors such as sentence aspect, which may force an abstraction of the spatiotemporal manifestation of the entities involved in the predication. In the case at hand, the S-T ABSTRACT interpretation is due to the habitual context.

(25.) Discussing definite pronominalization, Carlson (1977a: 425) points to a difference between indefinite singular phrases and bare plural phrases in English. Normally, indefinite singular phrases having a nonspecific interpretation do not permit definite pronominalization, while bare plural phrases do. It is difficult to decide how to analyze the relation between the antecedent and the pronoun in (18) in English and in German, as a shift from a NON-DISCOURSE REFERENT to a DISCOURSE REFERENT or as pointing to the same DISCOURSE REFERENTS (namely to the same "kind") in both sentences. French, Greek, and Hungarian, which mark the antecedent with a definite article, would speak in favor of the latter analysis. As mentioned in Section 2, I proceed from the working hypothesis that sentences which are correct translations of each other have the same meaning. Accordingly, I would not consider plausible the assumption that corresponding phrases in generic statements systematically differ with respect to their meanings in different languages simply because they are marked differently, for example, without an article in one language and with a definite article in another. On the other hand, cross-linguistic investigations reveal the existence of semantic/pragmatic mismatches between languages, that is, cases where a sentence of one language does not lend itself to translation into another language in such a way that the translation carries exactly the same semantic/pragmatic implications as the original sentence. As far as genericity is concerned, such cases are most likely to be expected on the fringes, so for example with ATTRIBUTES but not with TOPICS.

(26.) Conversely, zero-marking in English (a characteristic device for encoding genericity in English) was, in doubtful cases, not regarded as sufficiently indicative of a generic interpretation unless it was paralleled by definite marking in one of the other languages. The reason lies in the already-mentioned fact that bare forms (in particular: bare plurals) exhibit a low degree of distinctivity between a generic and a nongeneric interpretation.

(27.) Number categories were assigned on formal rather than semantic grounds. For instance, a form which agrees with a singular verb form (or would agree with a singular form if it were used as a subject) has been classified as a singular form. There is some debate about mass nouns in English concerning the question of whether they should be considered as neutral with respect to number distinctions and therefore analyzed as "nonsingulars" when occurring--as usual--without plural marking. I decided to analyze them as singulars for the following reasons. Almost every mass noun may take a plural suffix under appropriate pragmatic conditions. From a morphological and syntactic point of view, there is a real opposition between singulars and plurals. Neutralization of the semantic opposition between singular and plural meanings will be expressed in terms of the semantic dimension of individuality in the current framework. In addition, not every language has a mass/count distinction. Therefore, considerations based on how the mass/count distinction works do not provide a particularly good base for the formal comparison of languages. Pure morphological marking and agreement seem to be more suitable for this purpose.

(28.) In addition to these two types, Behrens (2000) also introduces a third type, namely TOPIC-marking languages. Among the latter, languages such as Tagalog and (partly) Finnish are counted.

(29.) Arabic is a further DISCOURSE REFERENT-marking language. Here, the generalization of kinds as DISCOURSE REFERENTS has proceeded so far that the definite article is even used with (ascriptive!) predicates (Egyptian Arabic da id-dahab 'that's gold' in the sense of 'that's what the kind of gold is like'; cf. Behrens and Sasse 2003).

(30.) Note that in this example too, French and German employ zero-marking, as in (26). In the Greek sentence, we find a participle metanjomenos 'repentant'. It often happens that nominal modifiers based on abstract nouns have no nominal correspondence in one or more of the compared languages but are realized as adjectives or participles, as in this Greek sentence. This is a good piece of evidence for the assumption that abstract nouns take the QUALITY interpretation as their common meaning when they occur as nonheads of noun phrases.

(31.) In Hungarian, the possessor regularly appears in the dative in asserting possession and as such occupies the TOPIC position. There are a number of constructions in Hungarian which point to the fact that the argument hierarchy is sensitive to the animacy hierarchy, allowing the dative argument to take a higher-ranking position in the argument hierarchy than the grammatical subject.

(32.) Sometimes it is difficult to decide whether or not a particular verb (when used in the right grammatical form and in the right context) is ambiguous between an ordinary habitual reading and a reading with an additional attitude component. In Greek, for example, "eat"-verbs may be used either in the ordinary habitual sense or in the sense of liking the food in question. While under the first sense both definite and zero-marking are possible, the second sense with the attitude component allows only definite marking. German exhibits both senses of "eat" verbs as well, zero-marking the object in both cases. In Hungarian, "eat"-verbs cannot be understood in the attitude sense; here, the ordinary habitual sense is usually zero-marked.

(33.) Of particular interest in this connection is Pease-Gorrissen's (1980) article about "the use of the article in Spanish habitual and generic sentences." Spanish is clearly a DISCOURSE REFERENT-marking language and Pease-Gorrissen deals with the well-known puzzle that the great majority of Spanish transitive verbs exhibit a systematic alternation in that they can either take an object with the article in a generic reading or a zero-marked object. In order to explain the use of the definite article she refers to the concept of "scenario." As a condition for the fact that both the subject and the object are constructed with the definite article in habitual sentences, she postulates that both parts coincide in the antecedent of a scenario-structure, resulting in a "scenario-correlation." If I understand Pease-Gorrissen correctly, she suggests that this happens precisely in those cases where the kind realized as (syntactic) object also has current relevance in the respective situation and constitutes part of what some linguists call "shared knowledge." When we combine this with what has been said above about the difference between textually-established knowledge and general knowledge, the following assumption is corroborated: in generics, too, we have to differentiate between different kinds of knowledge, particularly between quite general encyclopedic knowledge on the one hand and textually or situationally reinforced general knowledge on the other.

(34.) Note, however, that the first mentions of "explorer" earlier in the text are not classical (i.e. specific/indefinite) introductions such as found in nongeneric texts. Rather, we have a predicative mention first, immediately followed by a use in the scope of negation.

(35.) This sentence in Hungarian literally means 'The thorns have no use.' Because Hungarian employs possessive constructions of the "mihi est" type, the translation equivalent of the thorns is realized as a dative phrase (a töviseknek) (cf. Note 31).

(36.) The first exception is the bare plural form virágokkal 'with flowers' in the Hungarian sentence (40e). There is a strong contrastive accent on this phrase, that is, the sentence implies that the narrator occupies himself with all kinds of things except flowers. Contrastive TOPICS which do not refer to specific entities may be used in a particular construction in Hungarian, in which case they are zero-marked as seen here. Without this contrast, the use of a definite article would also be perfectly possible in Hungarian. The second exception is the bare plural form kotes 'chickens' in the Greek sentence (41d). Here, too, a definite phrase would be a possible alternative to the bare phrase.

(37.) It will be assumed that the hierarchy of syntactic realizations interacts language-specifically with the hierarchy of propositional functions.

(38.) The same problem holds for passive sentences in which both (agentive and patientive) arguments are overtly realized. Passive constructions do not therefore provide clarity in such languages as they do in English. Apart from this, some languages, such as Hungarian, do not have a productive passive construction that would allow the presence of agents. Only nonspecific subjects of intransitive verbs are unproblematic since they can be construed as ATTRIBUTES.

(39.) In some semantic fields (e.g. nationalities) within human-denoting nouns, choice of number has certain pragmatic implications. However, this has nothing to do immediately with how broad the basis of the generalization is or how exceptionless it is.

(40.) Of course, all of the five languages investigated in this article can use definite singular generics with SHAPE nouns as well, and Hungarian, as mentioned above, does so more frequently than all other languages. However, if there is a determiner difference in marking SHAPE nouns in the corpus such that some languages use a definite singular and the others do not, this difference either involves a contrast between a definite and an indefinite article, or it involves number differences as well, that is, we find a contrast between a definite singular and a definite or bare plural as shown in (46). The former cases will be discussed below in terms of the variation between "IND/SG" and "DEF/SG"/"Ø/SG." The latter cases will not be pursued any further.

(41.) This holds true only for the standard language (High German). Certain dialects such as Bavarian prefer the definite article here--like French, Greek, and Hungarian.

(42.) Interestingly, Vietnamese displays the same kind of "lexical split" (cf. Behrens 2000). Strictly speaking, the term "mixed-type language" may be understood in two different ways. Using this term, we may simply indicate that two typological patterns are freely combined in a language, for example, the respective forms more or less appear as free variants in the same context. Or we may refer to the fact that two typological patterns are distributed in a principled way in a language, namely, according to some lexical or grammatical factors. German is a "mixed-type language" in both senses.

(43.) In Hungarian, for example, it is the contrastive construction described in Note 36 that is distinguished by zero-marking.

References

Abelson, Robert P. (1973). The structure of belief systems. In Computer Models of Thought and Language, Roger C. Schank and Kenneth M. Colby (eds.), 287-339. San Francisco: W. H. Freeman and Company.

Behrens, Leila (1995). Categorizing between lexicon and grammar. The MASS/COUNT distinction in a cross-linguistic perspective. Lexicology 1, 1-112.

--(1998). Ambiguität und Alternation. Methodologie und Theoriebildung in der Lexikonforschung. Habilitationsschrift, Munich.

--(2000). Typological Parameters of Genericity. Arbeitspapier 37 (Neue Folge). Institut für Sprachwissenschaft, Universität zu Köln.

--; and Sasse, Hans-Jürgen (2003). The Microstructure of Lexicon-Grammar-Interaction: A Study of "Gold" in English and Arabic. Munich: Lincom Europa.

Bennett, William A. (1977). Verb and article colligation in French and English. International Review of Applied Linguistics in Language Teaching 15, 47-54.

Burton-Roberts, Noel (1976). On the generic indefinite article. Language 52(2), 427-448.

--(1977). Generic sentences and analyticity. Studies in Language 1, 155-196.

Carlson, Greg N. (1977a). A unified analysis of the English bare plural. Linguistics and Philosophy 1, 413-457.

--(1977b). Reference to Kinds in English. Bloomington: Indiana University Linguistics Club.

--(1979). Generics and atemporal when. Linguistics and Philosophy 3, 49-98.

--(1989). On the semantic composition of English generic sentences. In Properties, Types and Meaning II, Gennaro Chierchia, Barbara H. Partee, and Raymond Turner (eds.), 167-192. Dordrecht: Kluwer.

--; and Pelletier, Francis Jeffry (eds.) (1995). The Generic Book. Chicago and London: University of Chicago Press.

Casadio, Claudia; and Orlandini, Anna (1991). On the interpretation of generic statements in Latin. In New Studies in Latin Linguistics, Robert Coleman (ed.), 349-364. Amsterdam: John Benjamins.

Chur, Jeannette (1993). Generische Nominalphrasen im Deutschen. Eine Untersuchung zu Referenz und Semantik. Tübingen: Niemeyer.

Croft, William (1990). Typology and Universals. Cambridge: Cambridge University Press.

--(1991). Syntactic Categories and Grammatical Relations. The Cognitive Organization of Information. Chicago and London: University of Chicago Press.

Cruse, D. Alan (1986). Lexical Semantics. Cambridge: Cambridge University Press.

Dahl, Östen (1975). On generics. In Formal Semantics of Natural Language, Edward L. Keenan (ed.), 99-111. Cambridge: Cambridge University Press.

--(1985). Tense and Aspect Systems. Oxford: Basil Blackwell.

Dayal (Srivastav), Veneeta (1992). The singular-plural distinction in Hindi generics. In Working Papers in Linguistics 40, C. Barker and D. Dowty (eds.), 39-58. Columbus, OH: Ohio State University.

Declerck, Renaat (1986). The manifold interpretations of generic sentences. Lingua 68, 149-188.

--(1987). A puzzle about generics. Folia Linguistica 21(2-4), 143-153.

--(1991). The origins of genericity. Linguistics 29, 79-102.

Donnellan, Keith S. (1966). Reference and definite descriptions. The Philosophical Review 75, 281-304.

Epstein, Richard (1994). The development of the definite article in French. In Perspectives on Grammaticalization, William Pagliuca (ed.), 63-80. Amsterdam: John Benjamins.

Gelman, Susan; and Tardif, Twila (1998). A cross-linguistic comparison of generic noun phrases in English and Mandarin. Cognition 66, 215-248.

Gerstner-Link, Claudia (1988). Über Generizität. Generische Nominalphrasen in singulären Aussagen und generischen Aussagen. Unpublished doctoral dissertation, University of Munich.

Heim, Irene (1983). File change semantics and the familiarity theory. In Meaning, Use and Interpretation of Language, Rainer Bäuerle, Urs Egli, and Arnim v. Stechow (eds.), 164-189. Berlin: Mouton de Gruyter.

Jackendoff, Ray S. (1983). Semantics and Cognition. Cambridge, MA: MIT Press.

Jacobsson, Bengt (1998). Notes on genericity and article usage in English. Studia Neophilologica 69, 139-153.

Karttunen, Lauri (1976). Discourse referents. In Syntax and Semantics 7: Notes from the Linguistic Underground, James D. McCawley (ed.), 363-385. New York: Academic Press.

Kolmer, Agnes (1999). Zur MASS/COUNT-Distinktion im Bairischen: Artikel und Quantifizierung. Arbeitspapier 34 (Neue Folge). Institut für Sprachwissenschaft, Universität zu Köln.

Krifka, Manfred (1987). An outline on genericity (partly in cooperation with Claudia Gerstner). Tübingen: SNS-Bericht 87-25.

--(1995). Common nouns: a contrastive analysis of Chinese and English. In The Generic Book, Gregory N. Carlson and Francis Jeffry Pelletier (eds.), 398-411. Chicago: University of Chicago Press.

--; Pelletier, Francis Jeffry; Carlson, Gregory N.; ter Meulen, Alice; Chierchia, Gennaro; and Link, Godehard (1995). Genericity: an introduction. In The Generic Book, Gregory N. Carlson and Francis Jeffry Pelletier (eds.), 1-124. Chicago: University of Chicago Press.

Kuno, Susumu (1972). Functional sentence perspective. Linguistic Inquiry 3(3), 269-320.

Langacker, Ronald W. (1987). Foundations of Cognitive Grammar. Vol. I. Stanford, CA: Stanford University Press.

--(1991). Foundations of Cognitive Grammar. Vol. II. Stanford, CA: Stanford University Press.

Lee, Chungmin (1996). Generic sentences are topic constructions. In Reference and Referent Accessibility, Thorstein Fretheim and Jeanette K. Gundel (eds.), 213-222. Amsterdam: John Benjamins.

Lee, Ik-hwan (1992). A quantificational analysis of generic expressions in Korean. Korea Journal 32(3), 73-85.

Link, Godehard (1995). Generic information and dependent generics. In The Generic Book, Gregory N. Carlson and Francis Jeffry Pelletier (eds.), 358-382. Chicago: University of Chicago Press.

Lyons, John (1977). Semantics. Cambridge: Cambridge University Press.

Marmaridou, A. Sophia S. (1984). The study of reference, attribution and genericness in the context of English and their grammaticalization in M. Greek noun phrases. Unpublished doctoral dissertation, Darwin College, Cambridge.

Matthews, Stephen; and Pacioni, Patrizia (1997). Specificity and genericity in Cantonese and Mandarin. In The Referential Properties of Chinese Noun Phrases, Xu Liejiong (ed.), 45-59. Paris: Centre de Recherches Linguistiques sur l'Asie Orientale.

Miller, James (1985). Semantics and Syntax. Parallels and Connections. Cambridge: Cambridge University Press.

Oomen, Ingelore (1977). Determination bei generischen, definiten und indefiniten Beschreibungen im Deutschen. Tübingen: Niemeyer.

Pease-Gorrissen, Margarita (1980). The use of the article in Spanish habitual and generic sentences. Lingua 51, 311-336.

Perniss, Pamela (2001). Numerus und Quantifikation in der Deutschen Gebärdensprache. MA thesis, Universität zu Köln.

Sasse, Hans-Jürgen (1987). The thetic/categorical distinction revisited. Linguistics 25, 511-580.

Schank, Roger C. (1980). Language and memory. Cognitive Science 4, 243-284.

--; and Abelson, Robert P. (1977). Scripts, Plans, Goals, and Understanding. Hillsdale, NJ: Lawrence Erlbaum.

Scheutz, Hannes (1988). Determinantien und Definitheitsarten im Bairischen und Standarddeutschen. In Festschrift für Ingo Reiffenstein zum 60. Geburtstag, Peter K. Stein, Gerold Hayer, Renate Hausner, Ulrich Müller, and Franz Spechtler (eds.), 231-258. Göppingen: Kümmerle.

Searle, John R. (1969). Speech Acts. An Essay in the Philosophy of Language. Cambridge: Cambridge University Press.

Smolska, Janina; and Rusiecki, Jan (1980). The generic noun phrase in English and Polish. Papers & Studies in Contrastive Linguistics 11 (Poznan), 39-57.

Vater, Heinz (1979). Das System der Artikelformen im gegenwärtigen Deutsch. Tübingen: Niemeyer.

Wierzbicka, Anna (1995). A semantic basis for grammatical typology. In Discourse Grammar and Typology. Papers in Honor of John W. M. Verhaar, Werner Abraham, Tom Givón, and Sandra A. Thompson (eds.), 179-209. Amsterdam and Philadelphia: John Benjamins.

Zwicky, Arnold M.; and Sadock, Jerrold M. (1975). Ambiguity tests and how to fail them. In Syntax and Semantics, Vol. 4, John P. Kimball (ed.), 1-36. New York: Academic Press.
COPYRIGHT 2005 Walter de Gruyter GmbH & Co. KG
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2005 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Behrens, Leila
Publication:Linguistics: an interdisciplinary journal of the language sciences
Date:Mar 1, 2005