Challenges of adopting gender-inclusive language in Slovene.

1 Introduction

This paper presents the possibilities for discursive practices that increase the visibility of different genders in Slovene; the analytical section of the paper focuses on one of them. This possibility, i.e. the underscore, was introduced into Slovenian discourse not only to increase the visibility of both genders, which was the usual aim of the initial attempts at employing gender-inclusive language, but also to do away with the binary understanding of gender by using traditional linguistic elements innovatively, so that all genders are included in discourse.

Different languages have different ways of expressing gender pluralism. Unlike, for instance, English, Slovene is quite limited in this respect, also due to its highly inflectional nature and its prescriptive tradition (cf. Toporisic 2004), which has not paid much attention to gender pluralism (and is, in part, perhaps wary of it even today). In addition, the introduction of gender-sensitive language has received much public scrutiny, especially from the right-wing media. (1) In the following section, we briefly describe the framework into which we place our research. Our intention is not to provide a holistic analysis of every attempt at introducing gender-sensitive discursive practices and the discussions arising from it, but to present them in order to enable a better understanding of one of the attempts, which is at the core of our research.

1.1 In addition to the public scrutiny focused on the use of innovative linguistic elements for enabling gender inclusion, there has recently also been an intense public outcry in Slovenia after the Senate of the Faculty of Arts, University of Ljubljana, decided to use the feminine forms as generic and gender-inclusive in all its internal normative documents for the following three years. After this period, the feminine and masculine shall be used interchangeably. (2) Even though this has been the standard practice at the Faculty of Social Work, University of Ljubljana, for over 15 years, and Slovenia had already seen the publication of a scientific monograph (Hofman 2017) dealing with scientific careers from the perspective of gender (this monograph employed the same discursive practice more than one year prior to the Senate's decree), it was precisely the decision of the Faculty of Arts that spurred public debate. In actuality, it was the first public response to this decision that really incited a heated debate: this was a text marked by traditional linguistics as well as being explicitly manipulative in nature, published by one of the central Slovenian dailies, Delo (Ahacic 2018). (3) The author failed to take the actual decree and its impact on the entire genre into account, and instead deliberated on possible consequences for language use that had been neither intended nor made possible by the decree.

The ensuing debate did not merely address linguistic issues, but revealed to what extent the binary heteronormative model, imbued with patriarchalism, is rooted in Slovenian society. In addition, the debate revealed the ideological positions within the Slovenian linguistic community, especially within the field of Slovene studies in relation to its standard-language core. Namely, it became apparent that standard-language ideology is linked to other ideological positions within society that contradict equality, not just of women but also of other social groups, e.g. the members of the LGBTQ+ community (Gorjanc 2018), whereas the linguistic discourse is merely a pretense for defending these ideological positions (Sorli 2018). In this paper, we do not analyze this particular discursive practice, but instead aim to present an attempt at introducing an innovative use of traditional linguistic elements for a more gender-inclusive language use.

As we are focusing on a particular linguistic element, i.e. the underscore, within the scope of corpus analysis, we also present the background of adopting this means of expression as well as its reception in the Slovenian linguistic circles.

1.2 In Slovene, the slash has been the prevalent means of expressing gender binarism (e.g. student/-ka), but recently a number of institutions have begun using the underscore. Its function is to express gender pluralism, thus opening a space for the visibility of non-binary and other gender identities. For example, the TransAkcija Institute (4) uses the underscore in a feminist manner, i.e. it uses the generic feminine gender, or uses the feminine root of the word wherever the feminine form is not directly derived from the masculine by merely adding the feminine suffix to it (cf. the past tense of the verb "(he/she) was": bil_a, wherein bil is the masculine and bila the feminine, and the form "they were": bile_i; in the latter case, the root of the word is bil-, the masculine suffix is -i, and the feminine -e). Thus, the feminine form takes precedence. As already pointed out, there have been monographs using the underscore published by the Scientific Research Centre of the Slovenian Academy of Sciences and Arts (Hofman 2017; Golez Kaucic 2018), while it is also used by the (left-leaning) political weekly in Slovenia, Mladina.

The implementation of the underscore as a means of adopting gender-inclusive language in Slovene has been met with severe opposition, vocal as well as silent, and this also holds true for the wider purview of the initiatives for a more gender-neutral Slovene. For this reason, we present the background of implementing gender-inclusive language in Slovenia following independence, focusing mostly on the introduction of the underscore. We present the arguments for and against its adoption and argue that it is used extensively and consistently within the public dealing with LGBTQ+ rights. We demonstrate this by means of corpus analysis with two comparable corpora built specially for this purpose.

2 Adopting gender-inclusive language in Slovene

The Slovenian pre-structuralist (Bajec et al. 1956) as well as structuralist linguistics (Toporisic 1976) mostly dealt with the question of gender in terms of a grammatical category. The authors of the pioneering attempts aiming for a linguistic description surpassing the traditional view of gender by employing non-structuralist theoretical and methodological underpinnings, especially those dealing with pragmatics (Meckovska 1980: 211-212), were indeed recognized for introducing a new perspective, more closely linked to the practical use in texts. At the same time, the assessments of these new attempts were largely confined to the authors' own language-systemic conceptual beliefs (Toporisic 1981: 80-81). Language descriptions thus for the most part sporadically described its use, e.g. the precedence of the feminine forms when referring to women, as in Ana je arhitektka instead of Ana je arhitekt ('Ana is an architect'; Toporisic 1976: 202; Toporisic 2004: 266). Comparative studies between Slavic languages dealing with defining gender by means of feminine nouns in Slovene have shown that "almost in all cases feminine nouns are used, normally without any notion of stylistic or expressive undertone" (Meckovska 1980: 212). This was not the case with other Slavic languages, especially East Slavic languages (Meckovska 1980: 211).

Somewhat different to the mainstream approach in the grammatical description was the approach of the Slovenian lexicography. Its core members, comprising the team of lexicographers working on the Dictionary of Literary Slovene (1970-1991), based their work on the presumption of the generality of the masculine grammatical gender. At the same time, they described male actants with masculine classifications, and actually omitted women from linguistic description (Gorjanc 2005: 206), as they understood them merely as the feminine grammatical form and only referred to them with links to the existing descriptions for men with the explanation "the feminine form of" (Gorjanc 2017: 59). However, a section of Slovenian specialized lexicographers intentionally introduced feminine forms. This is especially true for those dictionaries in which actants are of special significance for defining terminology, such as the Theatre Terminology Dictionary (Susec Michieli et al. 2007) as well as several others (Trojar and Zagar Karer 2013: 460). In this respect, the Slovenian military terminology in the Military Dictionary (1977) represents a positive breaking point, as all the forms for actants within the military consistently include feminine forms (Korosec 1995: 28). Even though the lexicographic practice was amended for the second edition of the Dictionary of Literary Slovene (2014), as the dictionary gives individual explanations for female actants (Kern 2015: 148), in line with the practice of the Dictionary of Newer Words in Slovene (2014), it is still far away from being socially acceptable in terms of linguistic description of women, as it fortifies the stereotypical view of the differences between the genders all the while failing to consider the modern lexicographic practice or the discursive reality (Gorjanc 2017: 109).

The first big step in the study of gender in Slovenian linguistics was taken at a discussion on non-sexist language in Slovenian society in the mid-1990s. This was first and foremost a consequence of the Slovenian accession to European and Euro-Atlantic integrations, especially within the Slovenian efforts to join the European Union. From the mid-1980s, when the Resolution on policy and strategies for achieving equality in political life and in the decision-making process (1986) was passed, the European political space discussed and passed political decisions that aimed for equality between the sexes, and the Slovenian discussion was a direct response to the Council of Europe's document Recommendation no. R (90) 4 of the Committee of ministers to member states on the elimination of sexism from language (1990), which addressed linguistic issues. The politically motivated discussion significantly transformed the linguistic discourse as well. Although the traditional view of the issue of gender among the main points of the discussion could not be avoided, these points nevertheless shifted the center of the discussion towards language use, problematized the generality of the masculine gender and explicitly addressed the issue of visibility of genders in texts, with the aim of finding solutions for gender-inclusive discursive practices (Zagar and Milharcic Hladnik 1995: 10-11). The members of the Slovene studies community who partook in the discussion did so in a constructive manner, albeit still with a pervasive understanding of language as a system (Stabej 1995: 28; Korosec 1995: 28). The unease of linguists dealing with Slovene studies in the widely open social discussion, which also addressed exceedingly sociological issues (Bahovec 1995), also resulted in the selection of the topic for the Seminar of the Slovenian Language, Literature and Culture, which put women in the center of interest of Slovene studies (Derganc 1997).
This event, even to a much larger extent than the discussions of 1995, showed how the approaches of Slovene studies are exceedingly traditional as the lecturers for instance addressed the issues of etymology and onomastics (Jakopin 1997, Keber 1997), phraseology (Krzisnik 1997) or traditional lexicology (Vidovic Muha 1997). At the same time, the main linguistic position was recognized: that language is merely a reflection of the society, i.e. for linguistic change, we must first change the social reality (Vidovic Muha 1997). This was in direct contrast with the original idea of the 1995 discussion on the changes in language and discourse as being the drivers of social change (Bahovec 1995: 31).

The second half of the 1990s thus opened the discussion on the issue of sexual equality in language and the visibility of genders in discourse as well as established the foundations for a gender-inclusive use of Slovene, including the attempts to intervene in legislative discursive practices with the bill (that was never passed) proposing a two-gender version of the Slovenian legislation (Stabej 1997). One of the most tangible results of the 1990s discussions was the Standardized Classification of Professions, (5) in a version written for both genders. Even though this is an important document that introduced consistent job descriptions for both men and women, this is still the segment of language, i.e. naming female actants with feminine noun forms, with which Slovene had had very few issues in the first place (Meckovska 1980).

Since the end of the 1990s, gender linguistics has become an increasingly significant segment of linguistic discussions. Slovenian linguistics, especially the segment dealing with Slovene studies, is marked by traditionalism and has been trailing behind modern research activities in the humanities and social sciences in Europe (Gorjanc 2017: 12). And even though "linguists dealing with Slovenian have not been particularly interested in gender linguistics" (Doleschal 2015), this is actually more a problem of the entire post-structuralist field of Slovene studies as other topics covered by post-structuralist approaches are even more poorly researched. Recent linguistic research on gender thus touches upon interlinguistic comparisons (Kranjc and Ozbot 2013; Plemenitas 2014) as well as different segments of communication in Slovene (Scuka 2014, 2017). In the Nomotechnical Guidelines for Slovene (6) (2008) the issue of gender is also addressed, and the interdisciplinary Handbook for Adopting Gender-Sensitive Approaches in Research and Teaching (Mihajlovic Trbovc and Hofman 2016) also includes the field of gender-sensitive language, whereas some critical linguistic studies employ an explicitly engaged approach, aimed towards the community with a humanist desire to effect social changes for a greater equality of all people living in the Slovenian society. One of these is also the effort of trying to change discursive practices towards gender-inclusive language using the underscore (Vicar and Kern 2017a, Vicar and Kern 2017b).

We therefore deal with the underscore extensively in the following sections. However, before we can discuss the use of the underscore as an innovative means of conveying gender-inclusive language, we must first outline the main tenets of expressing gender in Slovene, and then place the underscore within (or onto) this typology.

3 Expressing gender in Slovene

There are a number of ways of combining masculine and feminine forms in Slovene, used when space is scarce; in other cases, such contractions are not advisable (Kern and Dobrovoljc 2017):

a) Slash + hyphen (avtor/-ica)

b) Slash (zaposlen/a)

c) Brackets + hyphen (rojen(-a))

d) Brackets (rojen(a))

e) Hyphen (stanujoc-a)

Traditionally, the masculine forms in Slovene are/were considered to be unmarked in terms of gender. However, this perspective has recently been met with criticism (Kern and Dobrovoljc 2017): (7)
In the last couple of decades, however, the unmarkedness of the
masculine is no longer universally accepted, as its social context is
ever more brought to the front. Thus, parallel feminine forms are
increasingly being used. This has probably also increased the awareness
that there is a difference between the natural (genetically defined)
sex and [one's] sexual identity, and those who write want this
difference expressed.

This quotation is of special significance as it was put forth by the de facto official language-standardizing body in Slovenia, the Fran Ramovs Institute of the Slovenian Language, Academy of Sciences and Arts of the Republic of Slovenia, albeit not in the traditional form of the normative language handbook (i.e. Pravopis 'orthography'), but on the official website of the Institute's Counseling Corner. (8) As the latest edition of the orthography (Toporisic et al. 2001), the primary source of language codification, is severely out of date (the draft was published as early as 1981), the Counseling Corner may be understood as a reference (and, in some cases, as a quick aid) for various types of linguistic difficulties that the language users of Slovene are facing.

All five options listed above and provided by the Counseling Corner assume that there are two sexual identities, i.e. the male and female identities within a binary system. This is recognized by Kern and Dobrovoljc (2017), who add that
[t]here are also persons who do not identify (exclusively) with the
male or female sex and whose sexual identity extends over the binary
sexual system. In the last couple of years, the underscore is being
adopted for the inclusion of non-binary sexual identities between the
forms for the male and feminine. [...] This is characteristic not only
for Slovene but for a number of languages in which gender is more
clearly expressed than in, say, English. Among others, this includes
the German language and Slavic languages.

All this means that the non-binary concept has been accepted or at least partly recognized by the central institutions dealing with language codification. However, when Kern and Dobrovoljc (2017) reported on the use of the underscore as a new phenomenon in Slovene and added the sociolinguistic dimension to the understanding of this punctuation sign, this spurred much controversy in the Slovenian cultural environment. The use of the underscore, its aim and purpose as well as its adoption and reception are dealt with extensively in the following section.

3.1 The underscore, its adoption and reception

In this section, we aim to provide a detailed background on the adoption and reception of the underscore as a means of employing gender-inclusive language.

As already stated, the underscore was introduced as a means of overcoming the prevalent binary understanding of the distribution of sexes, which was engendered by and reflected in the use of the slash (meaning 'or'), unless an individual is comfortable with using it for themselves (Koletnik and Grm 2017: 22). Thus, it is a matter of linguistic/activist intervention for all genders that exist or may become existent in the future (Koletnik and Grm 2017: 22):
Unlike the slash (/), which still only offers two options and thereby
creates a binary, the underscore connects both standard grammatical
forms as well as separates them and, at the same time, creates an empty
space between them, projecting endings that are not yet included in the
linguistic norm, but may come about in the future. We use the
underscore with grammatical forms that involve gender, in order to
include all sexes, also those that surpass the sexual binarism
woman-man. [...] The underscore is to be used if an individual uses
such gender-neutral language [to refer to] in Slovene themselves.

This means that the use of the underscore is merely a recommendation to be taken into account especially when dealing with persons who identify as non-binary, if these persons wish to be referred to in gender-inclusive language. By creating an open space between both traditional poles (male vs. female), the underscore includes all genders, existent or otherwise, both in the visual sense (i.e., by creating a visual gap between the forms) and in grammatical terms (by serving as a means of punctuation).

The underscore is used extensively by the Slovenian organizations as well as online outlets dealing with the rights of the LGBTQ+ community. The main organizations are the following: Legebitra, TransAkcija, Dih, Skuc, Narobe, Open, The Peace Institute, Spol, LGBTpravice.

In addition to the above-listed organizations, a number of other enterprises and initiatives have also introduced the underscore into their style of writing, most notably the left-leaning political weekly Mladina (in its editorial), the editorial of the Kralji ulice magazine, the Institute Pekarna Magdalenske mreze and the ZaZivali association (Kern and Dobrovoljc 2017); it is also used in academia (cf. Hofman 2017 (ed.)). As stated by Sribar et al. (2016), in regard to introducing new ways of applying gender-inclusive language, it is "[e]xactly the fields of university and civil society that are most open and are opening further still to connect the democratizations of communities and language."

Due to the novelty (and, arguably, difficulty of use; see Stumberger 2018a, 2018b; Ferrari Stojanovic 2018) of using the underscore when writing (and reading) in Slovene, there have been several issues raised over the following two aspects of using the underscore (Kern and Dobrovoljc 2017):

--The sequence of giving masculine and feminine forms (before and following the underscore)

--The linguistic element to follow the underscore (i.e. the entire affix or merely parts of it)

In addition to these two issues, there have also been concerns about the placement of the underscore within the Slovenian normative system (Stumberger 2018a, 2018b; Ferrari Stojanovic 2018). We deal with these concerns in the following sections; here we focus merely on the manner of writing. As concerns the sequence, this is of no particular relevance. However, it would be prudent to use the masculine form first in cases where the feminine form is built by adding an affix to the masculine form (which is the root), as in ucitelj_ica ('teacher'), reziser_ka ('director'), or where there is a vowel reduction (prisel_a 'came') (Kern and Dobrovoljc 2017). However, this would require that those who write in Slovene possess knowledge of affixes and word-formation. Therefore, it is understandable that only those segments of words that are subject to variation when forming the feminine forms are normally put behind the underscore (Kern and Dobrovoljc 2017):

--zdravnik_ca or zdravnica_k,

--plural: zdravniki_ce or zdravnice_ki,

instead of:

--zdravnik_ica, zdravnik_nica or zdravnica_ik, zdravnica_nik,

--plural: zdravniki_ice, zdravniki_nice or zdravnice_iki, zdravnice_niki
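
The convention illustrated above, in which only the segment that varies between the two forms follows the underscore, can be approximated mechanically. The following Python sketch is merely an illustration under a simplifying assumption: it treats the shared initial string of the two forms as the invariant part. The function name underscore_form is hypothetical and not part of any cited guideline. The heuristic reproduces forms of the zdravnik_ca type, but not forms involving vowel reduction such as prisel_a, which require morphological knowledge.

```python
from os.path import commonprefix


def underscore_form(first: str, second: str) -> str:
    """Join two gendered forms with an underscore.

    The first form is written in full; only the part of the second form
    that differs from the first (everything after their shared initial
    string) follows the underscore. This is a plain string heuristic,
    not a morphological analysis.
    """
    shared = commonprefix([first, second])
    varying = second[len(shared):]
    if not varying:
        # The second form contributes no differing segment.
        return first
    return f"{first}_{varying}"


# Masculine form first, then feminine form first (singular and plural).
print(underscore_form("zdravnik", "zdravnica"))   # zdravnik_ca
print(underscore_form("zdravnica", "zdravnik"))   # zdravnica_k
print(underscore_form("zdravniki", "zdravnice"))  # zdravniki_ce
print(underscore_form("zdravnice", "zdravniki"))  # zdravnice_ki
```

Because the sequence of forms is determined solely by the order of the arguments, the same sketch covers both the masculine-first and the feminine-first practice.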

The use of the underscore was also supported by FemA (Sribar et al. 2016), who claim that the
[p]erpetuated use of the generic masculine and its virtual neutrality
subordinate women. It is dismissive towards persons who find themselves
outside of the social division of men and women according to
physicality. The language reflects reality, it refers to it and is, at
the same time, the means of its creation.

The members of FemA also state that the increased awareness of the issues arising from grammar and lexicon regarding gender changed the overall narrative and that gender-inclusive language has become a sign of "cultural sophistication" (Sribar et al. 2016), adding that Slovene is flexible enough for one to convey something using gender-inclusive language without having to step outside the bounds of standard language.

However, stepping outside the language rules is also advisable, as they are a matter of social consensus as well as its changes (Sribar et al. 2016). This is especially important due to the fact that language is of utmost importance in the Slovenian society. Mostly due to specific socio-political circumstances throughout the Slovenian history, the Slovenian society has developed a distinctive attitude towards language, marked mainly by conservatism and purism (cf. Urbancic 1961; Stabej 2000; Stabej 2012). This attitude had been further entrenched by the purist efforts of the last two centuries, and the notion of having to preserve the Slovenian language and protect it from powerful others persists to this day. As Thomas (1997: 133) states,
[i]t is hardly surprising, given the cultural and political history of
the Slovene people, that purism should have been such a salient factor
in the formation and development of the Slovene Standard Language. The
overwhelming presence over many centuries of the German language in
Slovene intellectual and everyday urban life on the one hand and more
recently the threat of competition from Serbo-Croatian in the
fulfilment of many socio-communicative functions on the other can
scarcely have failed to leave a profound impact on the linguistic
attitudes of the Slovene people.

This has also influenced the Slovenian language policy, which is directive in nature (for the distinction between directive and liberal linguistic communities see Skiljan (1999)). This is, of course, significant due to the novelty of using the underscore, and there has been significant backlash against its adoption, for various reasons. As stated by Kern and Dobrovoljc (2017):
In Slovene, the use of the underscore is a novelty, but there is no
need to deem it unacceptable in advance. According to those who are
trying to implement it, it is the only method of writing that includes
the entire society; [however] [l]inguistics has not yet deliberated upon
the new modes of expression as they are, from the standpoint of syntax
and use, poorly researched.

This has been one of the main foci of the discussion on whether or not the underscore should be implemented, which again shows that the Slovenian cultural environment, in terms of language, is a very directive one. In such an environment, the question of the underscore and its suitability is expected to be handed over to linguists, who should deliberate on the matter and reach some sort of decision, as evidenced by the following passage (Stumberger 2018a) referring to the above-mentioned reply of the Counseling Corner (cf. Kern and Dobrovoljc 2017) of the Institute:
Uncritical assumption of the viewpoint of one of the activist groups
without naming a source is unacceptable in a reply of the Counseling
Corner. When judging upon the novelties in (literary) language, us
linguists need to take into account language use, its system, tradition
and language economy, not to become heralds of a certain activist group.

In addition, the author (Stumberger 2018a) also states that the Institute's "reply that the use of the underscore was 'at first present within the transsexual community but became more general recently' will also have an effect on the future use of punctuation and also convince those who had previously thought differently. Or will someone dare doubt that?"

This was a vocal sign of protest against the introduction of the underscore as a means of gender-inclusive language in the Slovenian environment. However, there was also a "silent protest", as the following excerpt demonstrates (Stumberger 2018a):
I dared to respond and [...] published a response (9) on the Slovlit
forum [...], read by linguists, literary historians and those dealing
with Slovene studies. Therefore, I was certain that the topic will
attract other people to join the debate. It seems I was wrong as all
the responses of my colleagues were delivered to me in private. I was
also congratulated for my courage and saw that experts do not dare
speak about the underscore, whereas the non-experts do not really know
about it.

The blowback from the initial, exceedingly positive stance towards the use of the underscore had also contributed to the fact that the Counseling Corner issued another reply in regard to this issue, (10) this time signed by the editorial board of the service, as a reply to the following question (our emphasis):
More and more often I see this sort of writings: Udelezenci_ke naj se
zberejo v avli ('The participants should meet in the lobby'). I'm
curious if that's really necessary.

The reply states that the discussion on the use of the underscore had received widespread attention and that the editorial board found it necessary to comment on the issue:
The purpose of the reply in the Counseling Corner is not to approve or
disapprove of these linguistic choices, but merely to clarify that they
exist and that they can be observed in various social groups. [...] We
also view public responses and voiced opinions as an important part of
research if we want linguistics to be truly based on data. In doing
that, neither forbidding certain means of expression nor imposing our
own reflects a mature stance towards language.

The editorial board further touches upon the character of the replies within the Counseling Corner and its own role in the Slovenian linguistic community:
[T]he Counseling Corner was introduced as a means of quick aid with
linguistic matters. The replies aim to be as authentic a summary of
what is known about a linguistic phenomenon of [your] interest. If this
knowledge contradicts the normative guides, that is explicitly stated.
If, due to linguistic conditions that contrast those in the current
normative guides, there is a change in the codification on the horizon,
we aim to clarify and substantiate that. If we are dealing with an
under-researched phenomenon, we explicitly state that. The replies in
the Counseling Corner also reflect the opinions of the signed authors
with which at least half of the editorial board agrees. The replies in
the Counseling Corner do not constitute linguistic codification.

With this paragraph, the editorial board of the service sought to relativize its previous stance on the use of the underscore as well as to relativize its own credentials in terms of codification. It should be pointed out that this is the only reply in which the editorial board of the Counseling Corner saw fit to define its role in the Slovenian linguistic and academic community; this information is not even provided on the Counseling Corner website. The reply ends with a brief answer to the question posed by the user (written in bold in the above quotation): "Using the underscore is of course not necessary and left to your discretion. As already stated in the description of the guidelines, it depends on your beliefs and the personal message [you wish to convey] to the society."

4 Corpus analysis

In this section, we present the analysis of the two corpora built specifically for the purpose of studying the underscore as a new means of employing gender-inclusive language in Slovene.

4.1 Study design

Firstly, it needs to be pointed out that corpora represent a specific reality as they are built to represent a specific reality, although the representativeness is factored in the corpus--linguistic (Baker 1998: 50). This is very true also in our case as we are aiming to portray the use of a single linguistic element; what is more, we deliberately wish to include only those texts in the corpus that we know (or we at least presume) include what we are searching for. This means that we are building a corpus of Slovene that is representative as a whole of the object in question, i.e. the underscore. However, we still hold this language resource to be valid as it stands the test of all linguistic descriptions that come about on the basis of researching and analyzing corpora (Gorjanc and Fiser 2010: 10):

--It aims to portray linguistic reality.

--It does not rely on intuition, even when faced with unexpected results.

--It includes several bits of information on the typical text environment and on the general reality of communication.

Thus, bearing these points in mind, to provide a solid research foundation for this paper, we constructed two (parallel) corpora using the WebBootCat function of the SketchEngine platform. Naturally, it would have been preferable to use one of the existing corpora for Slovene. However, this was not possible as the existing corpora do not cover the most recent texts, whereas using the underscore is a very recent phenomenon in Slovene. This is especially relevant given that there have been attempts at analyzing the underscore and its frequency of use (see Stumberger 2018b) by searching the Slovene reference corpus Gigafida, thereby ignoring the fact that this particular corpus only contains texts from 1991 up to 2011. Wishing to avoid any subjectivity, we thus concluded that building a new language resource was an unavoidable step.

For this reason, we automatically harvested texts from various online sources that use the underscore. These are the following: Legebitra (11), TransAkcija (12), Dih (13), Skuc (14), Narobe (15), Open (16), The Peace Institute (17), (18), and LGBTpravice (19).

We constructed the first corpus (termed EquiCorpus) at the end of March 2018, and the second one (EquiCorpus 2) at the end of September 2018, allowing a period of six months in order to determine:

a) whether or not the use of the underscore is prolific

b) whether or not the use of the underscore is consistent

c) whether or not different organizations and the key actors in the LGBTQ+ community (and/or the supporters thereof) had been using the underscore comparatively extensively

We aim to provide answers to these research questions in the following section.

4.2 Corpora analysis

Both corpora were constructed in exactly the same way, containing exactly the same sources. In this way, we tried to ensure that the results are as valid as possible. The characteristics of both corpora are given in Table 1.

As Table 1 demonstrates, the increase in size from EquiCorpus to EquiCorpus 2 in tokens and words is significant (roughly 350,000 and 300,000, respectively). This holds even though a significant decrease is discernible in the number of paragraphs and documents, and it is all the more notable given that we were not able to harvest the texts from one of the websites due to technical restrictions. (20) On the basis of the figures given in Table 1, we can conclude that even though the structure of the websites in question had (apparently) changed and/or the missing domain contributed a high number of documents (with a high number of paragraphs), the content of the websites grew considerably. This difference in size (in terms of words and tokens) shows that all domains are very much active and thus provide a suitable source of information for our research. (21)

As the domains grew in size in terms of words and tokens, it is vital to research whether or not the number of words used with the underscore grew by (at least) the same margin. Therefore, the frequency of words used with the underscore is given in Table 2.

As Table 2 demonstrates, there has been a significant increase (63.8%) in the number of words used with the underscore in EquiCorpus 2 compared to the original corpus, especially when we consider that the overall size of the corpus only grew by roughly 15% (14.6% in tokens and 15.1% in words; cf. Table 1). This means that the use of the underscore is prolific and that its rate of use (speaking at least for the time period in question) is rapidly increasing.
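The arithmetic behind these percentages can be retraced with a short sketch. It assumes only the raw counts from Tables 1 and 2; the function names are our own and purely illustrative. The growth figures reported in the tables appear to correspond to the difference expressed as a share of the later corpus, whereas the more common convention divides the difference by the earlier (baseline) value, which yields higher growth rates. Both calculations are shown so that the reader can compare them.

```python
# Raw counts copied from Tables 1 and 2: (EquiCorpus, EquiCorpus 2).
FIGURES = {
    "tokens": (2_057_800, 2_408_421),
    "words": (1_644_925, 1_936_544),
    "underscore_words": (1_689, 4_667),
    "underscore_items": (177, 355),
}


def change_vs_later(old, new):
    """Difference as a share of the later corpus (matches the reported growth figures)."""
    return (new - old) / new * 100


def change_vs_baseline(old, new):
    """Difference as a share of the earlier corpus (the usual growth-rate convention)."""
    return (new - old) / old * 100


for name, (old, new) in FIGURES.items():
    print(
        f"{name:17} vs. later: {change_vs_later(old, new):5.1f}%   "
        f"vs. baseline: {change_vs_baseline(old, new):6.1f}%"
    )
```

Under either convention, the number of words written with the underscore grows far faster than the corpus as a whole, so the conclusion drawn from Table 2 does not depend on the choice of denominator.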

A further point to be considered is the variability of use of the underscore. This can be observed in the category Items in Table 2. Items are unique (i.e. distinct) words used with the underscore. The number of items grew by more than half (i.e. 50.1%), signifying that the use of the underscore is not only prolific, but also diversified, as a number of words are used with the underscore in EquiCorpus 2 that had not yet been used in this way in EquiCorpus. For this reason, we are supplying the most frequent words used in EquiCorpus and EquiCorpus 2 in Tables 3 and 4, respectively. (22)
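The distinction between total frequency and unique items can be illustrated with a minimal sketch. This is not the procedure used to produce Table 2, which relies on the wordlists exported from the corpus platform; the regular expression, the function name and the sample text below are all assumptions made for illustration, and the sample sentence is invented rather than taken from either corpus.

```python
import re
from collections import Counter

# One or more letters, followed by at least one "_letters" group.
# [^\W\d_] matches any Unicode letter, so Slovene diacritics are covered.
UNDERSCORE_FORM = re.compile(r"[^\W\d_]+(?:_[^\W\d_]+)+")


def underscore_counts(text):
    """Return a frequency list of all underscore forms found in the text."""
    return Counter(UNDERSCORE_FORM.findall(text.lower()))


# Invented sample text; the last token imitates filename noise.
SAMPLE = "Vabljene_i ste vse_i. Vse_i so bile_i vabljene_i. banner_lj_exh.jpg"

counts = underscore_counts(SAMPLE)
total_frequency = sum(counts.values())  # every occurrence is counted
unique_items = len(counts)              # every distinct form is counted once

print(total_frequency, unique_items)
print(counts.most_common(3))
```

A purely formal pattern of this kind also picks up filename--like strings, which is why such noise has to be identified and marked by hand in the resulting wordlists.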

As Tables 3 and 4 demonstrate, the most frequent word used with the underscore remained the same in both corpora (vabljene_i '[you are] invited'). The frequency of this word grew considerably (i.e. by 78.9%), which can be attributed to the fact that it is a very common expression used in addressing people in invitations and descriptions of events. If we observe the ten most common words used with the underscore, we can notice several differences, however:

As Table 5 demonstrates, there has been a notable increase in the frequency of the pronoun vse_i 'all' and of function words in general (written in bold in Table 5). What should also be noted is the high occurrence (in both corpora) of words pertaining to individuals (posameznice_ka, posameznic_kov) and their occupations (ustvarjalkam_cem 'artists', delavk_cev 'workers').
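The differences between the two top--ten lists can also be made explicit with a short sketch that uses the words listed in Table 5. The function name and the labels "retained", "dropped" and "new" are our own and serve only as an illustration of how the two rankings can be compared.

```python
# Top ten underscore forms from Table 5, in rank order.
TOP10_EQUICORPUS = [
    "vabljene_i", "partner_ka", "zapisale_i", "vprasale_i", "migrantke_i",
    "posameznice_ka", "bile_i", "posameznic_kov", "vse_i", "partnerja_ke",
]
TOP10_EQUICORPUS_2 = [
    "vabljene_i", "vse_i", "delil_a", "posameznice_ka", "ustvarjalkam_cem",
    "same_i", "delavk_cev", "me_i", "bile_i", "zapisale_i",
]


def compare_rankings(earlier, later):
    """Split two ranked lists into retained, dropped and newly ranked words."""
    earlier_set, later_set = set(earlier), set(later)
    return {
        "retained": [w for w in earlier if w in later_set],
        "dropped": [w for w in earlier if w not in later_set],
        "new": [w for w in later if w not in earlier_set],
    }


result = compare_rankings(TOP10_EQUICORPUS, TOP10_EQUICORPUS_2)
for label, words in result.items():
    print(f"{label:9} {len(words)}: {', '.join(words)}")
```

Half of the ten most frequent forms are retained across the two corpora, while the other half are replaced, which is in line with the lexical variation noted above.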

This is also reflected in both corpora: if we look at both lists as a whole (cf. Tables 3 and 4 as well as the whole lists of words used with the underscore), we can identify roughly the following categories (in terms of function and topic):

1. function words and auxiliary verbs

2. verbs dealing with cognitive functions or functions related to cognizance (vprasale_i, zapisale_i)

3. nouns relating to individuals

4. nouns relating to occupations

5. nouns relating to migrants and other marginalized groups as well as the people acting on their behalf

6. nouns relating to healthcare (in terms of providing access to healthcare and health services for said marginalized groups)

7. nouns relating to marital/relationship statuses

As we can deduce from the list above, the topicality of the words (and, hence, texts) using the underscore is closely connected to the LGBTQ+ community, its members and the main focus points of their organizing and activist efforts, as well as to other groups that have been faced with hardships and social injustice.

5 Conclusions

In this paper, we tried to depict the efforts of employing gender--inclusive language in the Slovenian cultural environment in the last two decades. These have been marked by significant progress in several fields (as in the introduction of the standardized nomenclature of all vocations for men and women as well as in the successful shift of the discussion towards benevolence in matters dealing with the rights of the LGBTQ+ community and other marginalized groups). However, several issues have been brought to light, especially with the latest developments in adopting gender--inclusive language. One of them is the exceedingly traditionalist view of gender, especially in the field of Slovene studies, in which the key figures do not shy away from employing manipulative acts to push the traditionalist doctrine. The second issue discernible in the field of Slovene studies is that of methodology, as Slovene studies is marred by regressionist quasi--structuralist approaches, both in their view of gender (which is normally limited to being merely a grammatical category) and in their view of the role that language plays in society (i.e. that it is by no means the driver of social change, merely its reflection).

We provided ample background to describe the state of affairs in the Slovenian cultural environment, focusing mainly on the recent developments, especially those connected to the introduction of the underscore used inventively to make the use of Slovene more gender--inclusive. To this end, we described the role of the underscore in detail, as well as its adoption and reception. As already pointed out, the underscore caused much controversy in Slovenia and was faced with significant resistance. As one of the major arguments against the introduction of the underscore was its supposed disuse, we supplemented our research with a corpus analysis in order to determine whether or not

1. the use of the underscore is prolific;

2. the use of the underscore is consistent and whether or not different organizations and the key actors in the LGBTQ+ community (and/or the supporters thereof) had been using the underscore comparatively extensively.

As concerns the prolific use of the underscore, we may safely say that it is very much being used in the LGBTQ+ community and several other cultural/media outlets. This is attested by the significant increase in the number of words and unique items written with the underscore over the six--month period. Drawing from the fact that both corpora that we constructed portray the most up--to--date version of Slovene (albeit in a particular community and in a pre--defined selection of sources), we may also assume that the ubiquity of this linguistic element accounts for its acceptance in said community. Of course, this does not mean that the underscore is a means of general written communication in Slovene, and this could hardly be proven with the manually constructed corpora we built for the purposes of this research. However, it is a relevant means of using gender--inclusive language within the community that is especially vocal in this respect, and it can hardly be expected to be adopted by a wider circle of speakers of Slovene within such a limited time span.

In terms of consistency, we can say that--upon looking at the wordlists obtained from the corpora--the writers who use the underscore use it remarkably consistently as we can hardly find different versions/spellings of words written with an underscore. Thus, one of the main criticisms of the underscore (that it is hard to use) seems to be invalid, at least in our database. On the other hand, the frequency lists show that there is some variation at the level of lexical choices as some of the most frequent words in EquiCorpus 2 are different to those in the original corpus, which shows that the underscore, at least in the community that is covered by our database, is a dynamic linguistic element, performing its intended function.


Challenges of introducing gender-sensitive language in Slovene

In this paper, we present the challenges of introducing gender-sensitive language into Slovene over the past two decades. Accordingly, we focus on the progressive shifts in the introduction of gender-sensitive language and present the full background of this development, including the opposing side (and its arguments), which works to undo the efforts invested in introducing and applying gender-sensitive language in Slovenian society. The paper provides an overview of the main tendencies and of the linguistic elements used to facilitate the introduction and adoption of gender-sensitive language in Slovenian society, with the main emphasis on the use of the underscore in writing. The paper analyzes two manually built corpora of texts collected from the Slovenian websites of organizations that use the underscore for the purpose of gender-sensitive language. By analyzing these data, we show that the underscore is being used increasingly often, despite efforts to diminish the effect of its use by calling the usability and presence of this element into question.

Keywords: gender linguistics, discourse studies, Slovene

Damjan Popic, Vojko Gorjanc

University of Ljubljana, Faculty of Arts,

Accepted for publication: 10 September 2018.

(1) Cf. the heading of an article in response to the publication of the Guidelines for Reporting on Transsexualism (Koletnik and Grm 2017), which was published in the right-wing news outlet NOVA24: "Degenerate Left Lecturing the Media How to Write about Transsexuals" (last accessed: 9 October 2018).

(2) The decree of the Faculty of Arts served as a basis for the decision of the Equal Opportunities in Science Commission at the Ministry of Education, Science and Sport of the Republic of Slovenia, which understood the decree as "ethically motivated attempt of an institution to challenge the users of these texts to rethink their attitude towards unwanted sexually discriminatory features in language, as well as to remind the society as a whole of its permanent commitment to eradicate morally unjustified discrimination, wherever possible, by using public choices and policies." The commission is therefore "welcoming the decree and sees it as an important contribution to finding non-discriminatory language uses in public institutions of the Republic of Slovenia." (last accessed: 22 November 2018).

(3) As the number of responses of members of the traditional academic institutions bearing the standardizing power in Slovenia (cf. Ahacic 2018) as well as that of traditional linguists (cf. Stumberger 2018a) demonstrates, linguistic arguments in the framework of standard language ideology are also used for manipulative purposes. The publication in the most influential Slovenian daily newspaper, to which the author is also a regular contributor, was of key importance for further discussion. Namely, the author wrote from the position of authority of the media outlet in which he is a regular columnist and used the medium to gain discursive authority, i.e. a situation in which it is easiest for the readers to accept the discursive positions of the author as self-evident (Katnic Bakarsic 2012: 6-7).

(4) (Last accessed: 9 October 2018)

(5) The current Standardized Classification of Professions from 2011 is available at

(6) Available at

(7) As stated by one of the faculty members of the Faculty of Arts, University of Ljubljana, at a senate meeting, she "refused to be [further] reduced to a footnote", thereby referring to the usual practice of providing a disclaimer upon the first mention of a masculine form with a footnote saying that the masculine forms used in the document are neutral and pertain to both genders.

(8) See

(9) See Stumberger (2018b).

(10) Available at










(20) We were unable to harvest texts from the domain This accounts for a loss of at least 112,842 words in EquiCorpus 2 compared to EquiCorpus; however, it is very likely that the loss is greater as it would be natural to expect that the domain grew in size in the last six months. In spite of that, we elected not to perform any other procedures that would include the texts from the original corpus in EquiCorpus 2, as this would compromise the integrity of our research data.

(21) The complete wordlists from both corpora are available for download and further inspection/research in the form of Microsoft Excel spreadsheets. The EquiCorpus wordlist is available at, and the EquiCorpus 2 wordlist at

(22) The example that represents noise (i.e. the example banner_lj_exh) is marked with an asterisk. This also points to a potential flaw of the underscore as a new means of expressing gender--inclusive language, as it is very frequent in filenames, web addresses, etc.
Table 1: Characteristics of EquiCorpus and EquiCorpus 2.

Category     EquiCorpus   EquiCorpus 2   Difference (+/-%)

Tokens       2,057,800    2,408,421      +14.6
Words        1,644,925    1,936,544      +15.1
Sentences       86,238       93,941       +8.2
Paragraphs      28,565       19,061      -33.3
Documents        1,766        1,282      -27.4

Table 2: Words containing the underscore in EquiCorpus and EquiCorpus 2.

Category                                      EquiCorpus   EquiCorpus 2   Difference (+/-%)

Words with the underscore (total frequency)   1689         4667           +63.8
Items (unique words) with the underscore       177          355           +50.1
Average                                       /            /              +56.95

Table 3: The most frequent words containing the underscore in EquiCorpus

Word                Count

vabljene_i          54
partner_ka          42
zapisale_i          39
vprasale_i          37
migrantke_i         33
posameznice_ka      32
bile_i              28
posameznic_kov      25
vse_i               24
partnerja_ke        23
akterk_jev          21
same_i              20
sam_a               20
bil_a               19
partnerja_ki        17
zdravnici_ku        17
dobrodosle_i        16
begunk_cev          16
imele_i             16
*banner_lj_exh      15
zdravnice_ka        15
delavk_cev          15
aktivistke_i        15
dedic_nja           14
ustvarj_alkam_cem   14
zdravnice_ki        14
mladostnice_ki      14
obiskovalke_ci      13
naleteli_e          13
rad_a               13
posameznice_ke      12
razglasile_i        11
udelezenke_ci       11
sodelavci_kami      10
bralke_ce           10
begunkam_cem        10
partnerjev_ic       10
pripravile_i        10
sodelavci_ke        10
zdravnik_ca         10
partnerju_ki        10
izvedle_i           10
nekatere_i          10
transspolne_i       10
imel_a              10
dopolnil_a          10
obiskovalec_ka      10
same_ga             10
vlozile_i            9
slisali_e            9

Table 4: 50 most frequent words containing the underscore in EquiCorpus 2

Word                     Count

vabljene_i               255
vse_i                    159
delil_a                  132
posameznice_ka            68
ustvarj_alkam_cem         67
same_i                    57
delavk_cev                53
me_i                      53
bile_i                    51
zapisale_i                49
srecna_en                 48
posameznic_kov            45
zdravnice_ka              45
partner_ka                42
udelezile_i               42
tiste_i                   41
bil_a                     39
vprasale_i                38
obiskovalke_ce            36
sam_a                     34
gledalkam_cem             33
akterk_jev                33
slikarje_ke               32
plesalce_ke               32
gledaliscnike_ce          32
glasbenike_ce             32
zablestele_i              32
udelezenke_ci             32
lgbtq--ustvarj_alce_ke    31
imele_i                   31
vsak_a                    30
obiskovale_i              29
posameznice_ki            29
nekatere_i                28
partnerja_ke              24
dobrodosle_i              24
pripravile_i              22
aktivistke_i              22
prostovoljke_ci           21
zdravnici_ku              21
delavke_ci                21
imel_a                    21
posameznice_ke            21
ve_i                      21
migrantke_i               20
pogovarjale_i             20
dobile_i                  19
prostovoljk_cev           19
delavkami_ci              18
poskusale_i               18

Table 5: 10 most common words with the underscore in both corpora.

Word (EquiCorpus)   Count   Word (EquiCorpus 2)   Count

vabljene_i          54      vabljene_i            255
partner_ka          42      vse_i                 159
zapisale_i          39      delil_a               132
vprasale_i          37      posameznice_ka         68
migrantke_i         33      ustvarjalkam_cem       67
posameznice_ka      32      same_i                 57
bile_i              28      delavk_cev             53
posameznic_kov      25      me_i                   53
vse_i               24      bile_i                 51
partnerja_ke        23      zapisale_i             49
