Collocational differences between L1 and L2: implications for EFL learners and teachers.


It has long been established that differences in the structures of first and second languages may produce interference problems for L2 learners, and the similarities between them will probably (but not always) contribute to facilitation of learning (Corder, 1981). Following World War II, when it was believed that the best teaching materials for foreign languages should draw on a careful comparison of a "scientific description of the language to be learned" with "a parallel description of the native language of the learner" (Fries, 1945, p. 9), and when the discipline of contrastive analysis "was considered as the panacea for language teaching problems" (Keshavarz, 1999, p. 1), many studies were conducted to investigate the differences between a native language and a target language, which was usually English (Yarmohammadi, 1965; Oller & Ziahosseiny, 1970; Buteau, 1970).

One potential area of contrast that has not, however, been given due attention by researchers is the differences and/or similarities between two languages in terms of collocations. Multi-word expressions including collocation, although "an important component of fluent linguistic production" (Hyland, 2008, p. 4), are a problematic aspect of L2 learning that has been largely neglected in SLA research (Nesselhauf, 2003; Shei & Pain, 2000). According to Xiao and McEnery (2006), although research on collocation has recently seen a growth of interest, "there has been little work done on collocation ... [in] languages other than English" and "less work has been undertaken contrasting the collocational behaviour ... in different languages" (p. 103). Although the findings of a few studies on contrastive analysis of collocations between some languages have appeared in the literature (Bartning & Hammarberg, 2007, between Swedish and French; Xiao & McEnery, 2006, between Chinese and English; Wolter, 2006, between English and Japanese; Nesselhauf, 2003, between German and English), no published research seems to be available with respect to collocational differences or similarities between English and Persian. Indeed, a search of the Iranian Research Institute for Scientific Information and Documentation database, which files the abstracts of all master's and doctoral theses produced by Iranian researchers at home or abroad, returned no results.

My own experience as a high school teacher, language center tutor, university lecturer, and teacher trainer in various locations in Iran strongly suggests that a good number of syntactic and semantic errors by learners (and sometimes their teachers) may stem from a discrepancy between collocational patterns in the L1 and the language that they are struggling to master. According to Nesselhauf (2003), comprehension of collocations does not normally produce problems for learners, so identifying learners' problems "must mean analyzing their production of collocation" (p. 224). This study was therefore an attempt to understand whether collocational differences between two languages (i.e., Persian and English) might lead to inaccuracies in the production of the target language for low-, mid-, and high-proficient EFL learners (namely, Iranian high school students and university learners majoring in EFL). Another equally important aim of the study was to determine the proportion of collocational errors directly caused by L1 interference. More precisely, the study was conducted to answer the following research questions.

1. Do collocational differences between Persian and English lead to inaccuracies in the production of the latter?

2. What proportion of collocational errors in the L2 (English) are directly caused by L1 (Persian) interference?

Although the collocational patterns investigated here draw on Persian as the L1, the findings may be generalizable to other contexts, especially where the learners' L1 has much in common with the L1 here (i.e., Persian) as in Arabic, Azari, Kurdish, Turkish, and Urdu.

After clarifying the meaning of collocation and indicating its importance in learning a foreign language, I present the method used to answer the above questions and discuss the findings. I also provide implications for language learners and their instructors and educational authorities.

The Meaning of Collocation

The word collocation is a relatively new addition to the lexicon of English. It first emerged in the writing of Jesperson (1924) and Palmer (1925) and was formally introduced to the discipline of linguistics by Firth (1957, cited in Hyland, 2008); it was further developed and publicized by Halliday and Sinclair during the 1960s (Krishnamurthy, Sinclair, Jones, & Daley, 2004). Collocation has been technically defined slightly variably by scholars, and as Gairns and Redman (1986) noted, "There are inevitably differences of opinion as to what represents an acceptable collocation" (p. 37). Cruse (1986), for example, defined it as "sequences of lexical items which habitually co-occur, but are nonetheless fully transparent in the sense that each lexical constituent is also a semantic constituent" (p. 40). Cruse distinguished collocations and idioms, reminding readers that in collocations (such as heavy rain or heavy smoker) there is a kind of semantic cohesion such that "the constituent elements are, to varying degrees, mutually selective" (p. 40) and that in "bound collocations" like foot the bill, "the constituents do not like to be separated" (p. 41).

Similarly, Carter (1998) used the term collocation to refer to "a group of words which occur repeatedly in a language" (p. 51) with the patterns of co-occurrence being either lexical (where co-occurrence patterns are probabilistic) or grammatical (where patterns are more fixed) with categorical overlaps in numerous instances. Colligation is a similar term that shows a general relation between the constituents in a construction as that between an adjective and a noun in He is a chain smoker (Matthews, 2007). For Carter, any lexical item of English (or node) can theoretically keep company with any other lexical item (or its cluster), but with varying degrees of probability; however, only those clusters with a high probability of co-occurrence with the node make a collocation. Carter categorized collocations further into four types moving from looser to more determined: unrestricted, semirestricted, familiar, and restricted. A more general and non-technical definition has been given for collocation by the Oxford Collocations Dictionary for Students of English (Lea, 2002): "the way words combine in a language to produce natural-sounding speech and writing" (p. vii). Krishnamurthy et al. (2004) and Lewis (2000) set a condition for the combination of words before they may be regarded as collocations: the co-occurrence of words should be statistically significant. Such a statistical view of collocation, which originated with Firth (1957), is essentially quantitative and has been accepted by many corpus linguists including Halliday (1966), Sinclair (1991), and Hoey (1991, cited in Xiao and McEnery, 2006). This statement implies that if a set of words occur together by chance, such an arrangement cannot necessarily guarantee that the elements so combined will produce a collocation. 
In other words, as Jackson and Ze Amvela (2000) put it, based on the principle of "mutual expectancy," "the occurrence of one word predicts the greater than chance likelihood that another word will occur in the context" (p. 114), which is essentially the same claim as that made by Hoey (1991): two lexical items may be regarded as an instance of collocation when one occurs with the other "with greater than random probability in its (textual) context" (p. 7). For example, in the above sentence, the co-occurrence of the words "of collocation when one" does not bind us to see it as a collocation. Predictability of pattern (Graney, 2000) is, therefore, a prerequisite for a set of words to be recognized as a collocation. Habitual co-occurrence of the elements (Shei & Pain, 2000) denotes a similar concept whereby replacing a word with a similar one will make the collocation less acceptable. What seems to be important in a discussion of collocations, therefore, is a shift of focus from single lexical items to strings of words or multiword expressions otherwise referred to as multiword units, formulaic expressions, prefabricated chunks, or ready-made utterances (Wang, 2005; Boers, Eyckmans, Kappel, Stengers, & Demecheleer, 2006), and clusters or bundles (Hyland, 2008).
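The statistical view described above can be illustrated with a small sketch. It uses pointwise mutual information (PMI), one commonly used association measure in corpus linguistics; PMI is not a measure named by the authors cited here, and the toy corpus and the function name pmi_scores are hypothetical, chosen only for illustration. A positive score means that two adjacent words co-occur more often than their individual frequencies alone would predict.

```python
import math
from collections import Counter


def pmi_scores(tokens):
    """Return the PMI (log base 2) of every adjacent word pair in a token list."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = len(tokens) - 1
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bigrams          # observed probability of the pair
        p_w1 = unigrams[w1] / n_unigrams    # probability of the first word
        p_w2 = unigrams[w2] / n_unigrams    # probability of the second word
        # Ratio of observed co-occurrence to what chance alone would give.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Hypothetical toy corpus; far too small for real conclusions.
tokens = (
    "heavy rain fell and heavy rain stayed "
    "the rain fell and the heavy smoker stayed"
).split()

scores = pmi_scores(tokens)
for pair in [("heavy", "rain"), ("the", "rain")]:
    print(pair, round(scores[pair], 2))
```

In this toy sample, heavy rain receives a higher score than the rain, which mirrors the idea of mutual expectancy. With a corpus this small the numbers are only illustrative; real studies rely on large corpora and often on additional significance tests.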

According to Gairns and Redman (1986), there are two motives behind the creation of a collocation: "Items may co-occur simply because the combination reflects a common real world state of affairs" (p. 37) such as the words pass and salt in Pass the salt please! or because of "an added element of linguistic convention" such as lions roaring rather than bellowing. Thinking of collocations mainly in lexical terms, Jackson and Ze Amvela (2000) envisaged collocation as a structural or syntagmatic meaning relationship between predictably co-occurring words ("meaning relations that a word contracts with other words occurring in the same sentence or text"), comparing it with the notion of paradigmatic or substitutionary meaning "concerned with words as alternative items" (p. 113).

Nesselhauf (2003) provided a detailed account of what may or may not be counted as a collocation (as in take a picture) comparing it with free combinations (like want a car) and idioms (such as sweeten the pill). Although sophisticated measures are available to test what is and what is not a collocation (through tests of collocability, Xiao & McEnery, 2006), for our purposes, collocation simply refers to a set of words that can occur together in varying degrees of predictable patterns in various contexts. These combinations can range from loose associations--what Carter (1998) called unrestricted collocations such as eat an apple/breakfast/a piece of cake (where all three noun phrases can conveniently replace each other)--to completely fixed phrases, which Carter (1998) has termed restricted collocations like to put the cart before the horse or cash and carry (which in most cases have obtained the status of fixed expressions or idioms). Conducting text analyses with the help of corpus linguistics can inform us of the degree of collocability: whether varied lexical items can sit together, and if so with what degree of frequency (Widdowson, 2007).

The Significance of Collocations

Collocations seem to be important in learning a language because words are learned and used in context, and without knowing the proper co-text with which a word can be used, little claim can be made to have mastered that word. Knowing a word is indeed knowing how and where to use it (Phythian-Sence & Wagner, 2007) and without successfully employing its companions, out-of-context learning of word lists will be ineffective for achieving communicative competence, which should be regarded as the final end of all language-learning and teaching encounters (Canale & Swain, 1980).

The importance of collocations has long been stressed by scholars involved in teaching foreign languages (Xiao & McEnery, 2006). According to Boers et al. (2006), the importance of such word groups was recognized long ago by Palmer (1925). However, Firth (1957) is the most quoted scholar to claim that one knows a word by the company it keeps, implying that if a student knows the other words with which a lexical item can be used, he or she knows that word (and those with which it collocates); and that on the contrary, a student may not be thought of as knowing the language and using it properly if he or she knows the meaning of all entries in a dictionary but has problems in using such seemingly synonymous words as happy and glad in the sense that the first is used both attributively and predicatively, but the second only predicatively, so that whereas the former collocates with a following noun, the latter cannot although both can collocate with a preceding linking verb (Eastwood, 1999). As another example, although the words happy and merry can replace each other when used with Christmas, there are occasions where such substitution is not possible as in happy birthday but not *merry birthday (where * customarily shows a linguistically unacceptable structure). Knowledge of these co-occurrence restrictions is vital for communication to be successful.

Richards, Platt, and Platt (1992) similarly gave examples of words that "are used together regularly" (pp. 62-63) such as high, which collocates with probability but not with its synonym chance, which instead collocates with good. Some of these co-occurrence patterns are so subtle that even advanced language users, including EFL teachers, may have problems with them, and this contributes to their inefficiency in tackling high-order communication tasks. Sonaiya (1988), therefore, seems to have been right in claiming that it is by the choice of words that effective communication is most hindered. The issue gains more significance when collocational patterns in various languages are compared. The following example from Wanner, Bohnet, and Giereth (2006) is a good starting point for thinking of cross-language collocational differences.
   In English one makes or takes a decision, in French and Italian one
   "takes" but does not "make" it (prendre/*faire une decision,
   prendere/*fare una decisione), in German, one "meets it" (eine
   Entscheidung treffen), in Spanish one "adopts" or "takes" it
   (adoptar/tomar una decision), and in Russian one "hosts" it
   (prinjato resenie); in English one gives a lecture--as in French
   (donner un cours) and Spanish (dar una clase)--in German and
   Italian one "holds" a lecture (eine Vorlesung halten, tenere una
   lezione), and in Russian one "reads" it (citato lekciju); etc.

Considering the role that mastery of collocations plays in communicative competence, teaching and learning them will gain immediate significance. In support of their importance in language education, Lewis (1997), for example, contended that competence and proficiency in a language is a matter of acquiring fixed or semifixed prefabricated items. Moudraia (2001) argued that multi-word lexical units are a form of collocations that plays a vital role not only in first language acquisition, but also in learning any second or foreign language; this illuminates how seriously teaching and learning these multi-word expressions should be taken. A look back at the previous sentence reveals that linguistic messages are at most made of word groups or chunks (such as lexical units, play a vital role, not only, but also, first language acquisition, second or foreign language, and the like): indeed, Altenberg (1998) suggested that "as much as 80% of natural language could be patterned in this way" (cited in Hyland, 2008, p. 6). It is, therefore, apparent that to achieve competence (linguistic or communicative), the learner will need to master semifixed and fixed expressions. Examples of more recent studies focusing on the collocational interference in varied contexts, or on the significance of learning and teaching collocations, include Bahumaid (2006), Mahlberg (2006), Xiao and McEnery (2006), Baker and McEnery (2005), and Teubert (2004).



A total of 76 participants took part in this project. The main criterion for the selection of participants for the study was to include EFL learners at various levels of proficiency. Random selection of participants was not possible; however, all university students majoring in EFL in Islamic Azad University of Salmas (a small town in West Azarbaijan province in Iran, bordering Turkey) and students from two randomly selected classes from a high school in Salmas served as the research participants. For logistical reasons, intact group design was accordingly used for the study.

The first group was a cohort of 30 male high school students in their final year. This group was regarded as the low-proficient group because although they had started learning English five years earlier in middle school, they received only three hours of EFL instruction a week during the semester. The two other groups of participants came from Islamic Azad University of Salmas, where I served as an invited lecturer. These latter groups (first-year/freshmen and third-year/juniors) were majoring in English language teaching at the undergraduate level and served roughly as mid-proficient and high-proficient groups respectively. Although a valid proficiency test such as TOEFL or IELTS could have been used to group participants, the fact that participants in each group received similar English education under almost the same conditions eases concerns about inappropriate proficiency levels. However, one high school student and three junior university students attended private English classes, which might suggest higher proficiency; nevertheless, they were not considered outliers in their groups because the data elicited indicated that their performance was comparable to the performance of their group-mates. The sex variable was not regarded as a moderator because the sample was opportunistic, and whereas the university students were predominantly women, all members of the high school sample were boys. Taking into consideration that the independent variable of the study (Persian as L1) was the same for all the participants, the sex variable does not seem to account for a significant difference (if any) for transfer problems as far as other important variables such as language proficiency and L1 are shared by members of each group.

It should further be noted that the first language of most participants was Azari (rather than Persian), but because all had been educated in Persian rather than their own L1--for which no formal education exists in Iran, and from which transfer to English, either negative or positive, is less probable than transfer from Persian as discussed below--Persian is regarded as the assumed as opposed to real L1 in the discussions that follow. Table 1 shows the composition of the participants included in the final analysis.


To answer the above research questions, and in the absence of any valid and reliable measure already available for such a purpose, I designed a test of collocations for the specific purpose at hand. To identify a list of semifixed and fixed collocational expressions (100 in total), I consulted the high school English books in Iran, a few other vocabulary and grammar books including English Vocabulary in Use (Redman, 2003), Check Your Vocabulary for English for the IELTS Examination (Wyatt, 2002), Oxford Practice Grammar (Eastwood, 1999), and some of the major books that are studied by EFL majors in Iranian universities or language centers. Of these, the 60 most frequent and productive collocations (60% of the list; those that I had seen causing most problems during my past teaching experience) were finally selected for inclusion in a test of collocations. The process of selecting which items to include in the initial list and in the final test of collocations was subjective and intuitive in line with Bachman's (1990) perception of the role of subjectivity in the test-construction process.

The final version of the test had 60 items. In each case the Persian equivalent of the collocation was given as the stem of the item, and participants were asked to choose the correct English counterpart from four choices offered. In writing distracters, a deliberate attempt was made to include at least one choice that was a literal translation of the Persian version of the collocation being tested. Most other alternatives also had an element traceable to the Persian language or showing an intra-lingual transfer. For example, for the Persian equivalent of make a mistake, the following distracters were provided: *do wrong, *do a mistake, and *make wrong. Each of these if chosen would have indicated negative transfer of various kinds from Persian or intra-lingual interference in English where mistake and wrong could be confused. In Persian there is one common word, eshtebah, for both wrong and mistake, along with two other less common words, khata and ghalat, which can translate as mistake and wrong respectively, but with somewhat different collocational meanings when used with the Persian kardan or to do; and in Persian, mistakes are done rather than made, a difference that is expected to produce problems in production at least for novice EFL learners. However, because of the high frequency with which this collocation is used (as the data below show), even the lowest-proficient students seem to have gained a good mastery of this particular collocational item. Features of varied item types in the test of collocations, that is, lexical versus grammatical, following Benson (1985), are shown in Table 2, and the full test appears in the Appendix.


After reviewing the test several times for possible faults, I administered the final draft separately to the three groups of participants as mentioned above during their normal classroom hours. The time to complete the test was set at 20 minutes initially, but more time was allowed for those who requested it.

The participants were asked to make their best guess and to choose the best English alternative for the Persian collocation. They were also assured that the task was for research purposes and that any incorrect replies or lower marks would not have any negative effect on their final achievement. The test was accompanied by a small questionnaire in which the participants were asked to provide demographic information regarding age, sex, L1, L2, and extracurricular English language education. The questionnaire also included a note on the purpose of the research and the confidentiality of the personal information they provided.


Table 3 displays descriptive statistics on the magnitude of the difficulty experienced by various groups in the study in choosing acceptable English counterparts for Persian collocations. For the data to be easily understandable, the raw numbers have been changed to percentages, and the reported figures show the mean of unacceptable responses for each group. Furthermore, as the focus of the study was to understand the magnitude of the apparent negative transfer from L1 to L2, the percentages are reported for incorrect rather than correct responses.

A surface look at Table 3 indicates that as expected, low-proficient learners experienced more problems in L2 collocations; the high school sample produced more errors (72.1%) than the university sample (mean 57.6%), their difference being significant with an observed chi-square value of 3.54, which exceeds the critical value of 2.706 at the probability level of 0.1. A more surprising finding, however, is that about two thirds of the collocations (more than 62%) proved problematic for EFL students on average, which means that only about one third of some of the most frequent collocations posed no challenge to EFL learners. This observation alone points to an answer for the first research question: Differences in collocational patterns between the L1 and the L2 (Persian and English respectively) do seem to produce problems of production for L2 learners, especially at lower levels of proficiency. However, the problems experienced by mid- and high-proficient participants were not significantly different (with an observed chi-square value of 0.07, much lower than the relevant critical values), presumably because most university students majoring in EFL in Iran spend the first two years of their education studying mainly general English (such as grammar, reading, and conversation), where they are quite likely to encounter and focus on collocations. However, when they move to their third year, they become largely preoccupied with more specialized issues such as teaching methodology, linguistics, and language testing, where the opportunity to attend to collocations decreases to a great extent.
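The comparison between observed and critical chi-square values can be sketched as follows. The sketch assumes one degree of freedom, which is the case in which the critical value at the 0.1 level equals 2.706; the degrees of freedom are not stated in the text, and the function names are hypothetical. For one degree of freedom, a chi-square variable is the square of a standard normal variable, so only the Python standard library is needed.

```python
from statistics import NormalDist


def chi2_critical_df1(alpha):
    """Critical chi-square value for 1 degree of freedom at level alpha."""
    # With 1 df, chi-square is the square of a standard normal variable,
    # so the upper-tail critical value is the squared two-sided z value.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z


def chi2_p_value_df1(observed):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    z = observed ** 0.5
    return 2 * (1 - NormalDist().cdf(z))


# Values reported in the text: observed 3.54 at the 0.1 probability level.
critical = chi2_critical_df1(0.10)
is_significant = 3.54 > critical
print(round(critical, 3), is_significant)
```

The same two functions can be applied to the other observed values reported in this section (0.07 and 0.016), or to a stricter probability level, under the same one-degree-of-freedom assumption.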

Not surprisingly, a large proportion of collocational problems (about 85%) for all groups in the study was traceable to L1 interference, with no significant difference between groups as shown by an observed chi-square value of 0.016. This observation suggests an answer to the second question of the study, that is, the proportion of errors traceable to interference from the Persian language.


The data presented above seem to indicate that differences between L1 and L2 collocational patterns contribute substantially to errors in the production of L2 collocations for proficient as well as less proficient EFL learners. It was also revealed that most collocation problems can be attributed to negative transfer from L1, an observation that supports the findings of Nesselhauf (2003), who noticed that negative transfer from L1 German to L2 English was significantly high, with 56% of all collocational errors in L2 written production attributable to L1 interference. A seemingly unexpected finding, however, is the unsatisfactory performance of participants supposed to be highly or relatively proficient in English: even relatively advanced university EFL students seemed to lack language proficiency if general language proficiency is considered to be a function, at least in part, of knowledge of target-language collocations.

Interpretation of the findings is facilitated by noting that although most research participants had Azari as their L1 and a limited number spoke Kurdish as their L1--with Persian as a mother tongue for none of them--the L1 of the participants was considered Persian rather than Azari or Kurdish (with actual first-language background not being included as a moderating variable) for a number of reasons. First, neither the Azari nor the Kurdish language has written forms in Iran, and no L1 speaker of these languages receives formal education in Iran's public or private system of education. This implies that although these two languages (and especially the former) are widely spoken by people in society, when it comes to the school or university context, only the official language of the country, Persian, is allowed (and favored). All schoolchildren receive their formal primary and secondary education in Persian, and Persian is the language of media, commerce, and formal encounters so that it is widely spoken by all non-Persian speakers (as L1) in Iranian society; thus all literate, school-educated individuals in Iran are competent enough in Persian to be regarded as near-native speakers (despite the Azari accent of some Azari Persian speakers). Furthermore, nowadays an increasing number of Azari parents prefer to teach their children Persian along with (or at times without) their own first language. All this means that for the participants in this research, Persian may conveniently be regarded as a first language; especially because of the formal education Iranians receive in its structure and vocabulary, Persian is people's first option in academic encounters (as in the case of this research), rather than their Azari or Kurdish L1, the grammar or structure, the lexicon, and even the alphabet of which are never learned, discussed, or consciously reflected on. 
So although this needs to be verified empirically, the understanding is that Azari or Kurdish learners of English in Iran, although there are too few of the latter to affect the findings of this study adversely, will turn to Persian as their first resource when comparing English collocation patterns (or by the same token, any other structure) simply because it may be difficult (if not impossible) for them to activate the relevant schemata in their own L1. However, this does not mean that there is no possibility of transfer from the participants' real L1.

Second, considering Persian as the L1 in this study does not rule out the possibility of participants' original L1 playing a role in either facilitating or hindering the parallel English collocations that they chose. However, due to similarities in pattern in Persian, Azari, and Kurdish between most of the collocations studied here, only slight differences in the findings would be expected if the real L1 of the participants were to be looked at as the independent variable. For example, as in Persian, in both Azari and Kurdish mistakes are done rather than made. The ultimate decision on the source of inter-language transfer (whether it originated from Azari/Kurdish as L1 or Persian) was not, however, investigated in this research for the reason provided above and also because the test of collocations used the Persian language as its starting point. Participants were requested to guess English parallels to Persian collocations, which probably forced them to be preoccupied with and have recourse to Persian in the first place rather than their own L1. However, the problem of tracing inter-lingual interference might probably be more appropriately tackled by integrating a qualitative element into the study (which may be taken up as further research) such that participants might be invited to elaborate on the nature and sources of the errors they made.

Another striking piece of evidence revealed by the data relates to the relationship between the amount of inter-lingual transfer and the proficiency of the participants. Based on the moderate version of the contrastive analysis hypothesis (Oller & Ziahosseiny, 1970), the expectation was that the more proficient the students became in the L2, the lower the amount of negative transfer from L1 (and possibly the higher the rate of intra-lingual transfer), but the data did not substantiate such an assumption. Table 3 reveals that strangely enough, as the participants gained English language proficiency (at least based on their current placements in high school or in the various university groups), the magnitude of transfer from their first language also increased (insignificant though the difference might be), which means that L1 transfer was a factor in producing incorrect L2 collocations at all proficiency levels.

It is also worth noting that in all items for all groups, except for two items for university juniors (make a mistake and give sb a smile), there was some degree of incorrect response ranging from 3% to 100%. This means that almost all collocations studied here challenged all EFL learners in some way. The reason why the above two items did not challenge the junior participants may be that these two high-frequency expressions are learned early by even beginning EFL learners and are also encountered often in a variety of oral or written materials in university. Possible extraneous variables such as options chosen by chance and the problem of cheating are difficult and at times impossible to control and should not, of course, be neglected. The problem of cheating is significant in many of Iran's education centers even when the tests have nothing to do with the testees' classroom achievement/scores. This is not simply my biased observation, but is confirmed by many others with whom I have had informal contacts.

The data also indicated which items were the most problematic (over 90% inaccuracy level), and which were unchallenging (less than 10% inaccuracy level) for each group, as well as those that were easy for one group and difficult for others or vice versa. Tables 4 and 5 list the items with such characteristics.

Table 4 shows that a good number of items (over 18%) proved difficult for the low-proficient group, and indeed no participant was able to provide the English parallel for the Persian collocation meaning go bankrupt. One reason why the participants could not choose the correct option may be that bankrupt is itself a low-frequency word that high school students had probably not previously encountered. However, the point under investigation was the verb that accompanied bankrupt, as the adjective itself was repeated in all choices. In Persian as well as in Azari and Kurdish, companies may *get bankrupt or *become bankrupt rather than go bankrupt. This item was among the most difficult for all the groups, with an average inaccuracy level of 95.5%. Table 4 also indicates which other items were most problematic for each group or jointly for more than one group.

Examination of the most difficult items shared by all proficiency levels gives us some clues about the magnitude of L1 transfer. For example, in the case of tell the difference, about 90% of the incorrect responses can be traced to L1 vocabulary (with Azari and Kurdish having a pattern similar to Persian), and regarding nine-to-five job almost the same rate applies. However, whereas the transfer in the former case draws on semantic or vocabulary differences between English and Persian (as well as Azari and Kurdish), in the latter instance, pragmatic or cultural differences may have led to the mistake: whereas a full-time job in a typical Western context involves starting at 9:00 a.m. and continuing to 5:00 p.m., the parallel Iranian working hours are 8:00-4:00 or morning to afternoon, and it would be culturally inappropriate to envisage a person (especially in the public sector in Iran) starting work at 9:00 a.m. As far as sooner or later is concerned, Persian-speaking Iranians prioritize late over soon and use the simple form of the words rather than their comparative versions, factors that can justify almost all incorrect answers. Noting that Azari speakers use soon or late rather than late or soon as Persian speakers do, and observing that nearly 67% of the participants (of whom more than 90% spoke Azari as their L1) selected late or soon rather than soon or late as the parallel collocation for English sooner or later, it appears that Azari learners of English initially turned to Persian rather than to their own language as a way to find what for them seemed to be the most appropriate parallel for the English collocation sooner or later; also, only a few (17.5%) opted for soon or late (the Azari counterpart of the collocation in question), which seems to offer further evidence for the proportion of transfer from Azari as their real L1 compared with Persian as their assumed L1.

The reverse side of the story is depicted in Table 5. Surprisingly, although a good proportion of items proved challenging for each group, the number of unchallenging items was negligible for low- and mid-proficient participants, and only a few items proved unchallenging even for the most advanced group. The only item that was easy for at least two groups (for low- and high-proficient, but not so easy for mid-proficient participants) was heavy rain, for which the distracters given--*hard rain and *great rain, which are both directly connected with the L1, where the words hard and great can conveniently replace the literal translational value of heavy--produced little problem. The reason this item was easy even for high schoolers or low-proficient participants has to do with its frequency of use in Iranian high school English textbooks, as well as the possible positive transfer from the Azari language where rain can be heavy, rather than in Persian where it is usually great or hard rather than heavy. The fact that Persian is assumed to be the L1 of the participants does not rule out the possibility of some degree of transfer from the participants' real L1--Azari or Kurdish--however small this may be.

The data were also checked for items that functioned variably for varied proficiency groups. On the whole, university participants performed differently from lower-proficient high school participants. When dealing with such items as civil war or brother and sister, low-level participants paired with mid-level ones, for both of whom these items produced greater difficulty than for the more advanced group. The percentage following each item in Table 6 refers to the rate of inaccurate responses.

In the first four items above, university students, the two relatively high-proficient groups, showed a higher rate of correct answers, with high school participants as the low-proficient group experiencing considerable problems. The last item, however, seems to be inconsistent in that it proved easy for the low-level and high-level participants but was bothersome for the mid-proficient group. Although adding a qualitative component would perhaps have provided a more acceptable explanation, a tentative explanation for this pattern may be that freshmen (taking an English vocabulary course simultaneously) were probably preoccupied with the Persian equivalents and synonyms of heavy rather than thinking of which might collocate with rain.

Conclusion and Implications

This study was conducted to determine how far differences in collocational patterns between two languages may lead to difficulties in the use of L2 collocations and the amount of L1 interference in this process. The major findings are that collocational differences between the L1 and the L2 produce challenges for L2 learners (a finding that substantiates the arguments made by Wolter, 2006) and that, as expected, compared with low-proficient participants, relatively high-proficient participants had a larger collocational repertoire (although not for all collocations studied here). A further finding deserving some consideration is that university students seem to lose their collocational competence as they move toward later years of study, and only a small and insignificant difference between mid-proficient and high-proficient EFL majors was found in this respect. These observations have far-reaching implications for EFL learners, teachers, materials developers, educational authorities, and policymakers, not only in Iran, but in many other similar contexts. As far as learners are concerned, the bulk of the evidence produced here suggests that most of the participants experienced difficulties in identifying the right collocation equivalents between the two languages in question. More problems are to be expected when the real use of language is involved, especially in spontaneous oral communication where there is little chance of monitoring. Undoubtedly, in an ESL situation, the expected problems in both the recognition and the production of correct collocations will decline considerably as learners consciously or unconsciously learn English even beyond the formal learning context.

Although this needs to be validated empirically, no matter what their L1 background, EFL learners may be expected to face at least some problems of the kind presented above to the extent that there are differences between the collocation patterns of their L1 and the language they are learning, unless they have become highly proficient in the target language. Wolter's (2006) two claims "that learners will often make collocational errors even when they are familiar with the words that comprise the 'proper' collocation" (p. 746) and that "we do ... quite often see 'syntagmatic' mistakes in the form of inappropriate collocations" (p. 747) both point to the relative difficulty that learners at most proficiency levels will have with correct L2 collocations.

The immediate implication is a need for beginning learners consciously to learn high-frequency collocations and for intermediate and advanced EFL learners to learn less frequent ones (in addition to highly frequent ones); also, advanced EFL learners should not feel that having moved away from general English courses, they need no longer worry about learning more collocations or maintaining and using those they already know. EFL teachers will, therefore, be wise to pay deliberate attention to the explicit teaching of such expressions and to providing sufficient practice opportunities both inside and outside the classroom. Although such efforts may seem to be the duty of teachers in vocabulary, reading, or grammar classes, another element appears to be necessary: integration of a specific course on collocations alongside extension of vocabulary, grammar, reading, writing, conversation, and collocation courses into the final years of language education, and initiation of special courses earlier in the program. Interestingly, in private language centers where the focus is only on language education (rather than on specialized issues such as linguistics and teaching methodology), the problem of decreasing language proficiency (including knowledge of collocations) appears to occur less noticeably than in universities. In Iran's undergraduate EFL education at the university level, the emphasis is on language proficiency for at most the first two years, and during the third and fourth years when the focus is on more technical issues, the emphasis on language proficiency declines, sometimes severely. However, considering that EFL graduates from such universities will most probably be recruited as English-language teachers at a variety of levels, it is vital to keep learners up to date not only in teaching methodology, but also in general English proficiency throughout their postsecondary studies.

Explicit instruction of collocations is especially important in the light of Carter's (1998) claim that "collocational mismatches are frequent in the language production of second-language learners since learners never encounter a word or combinations of words with sufficient frequency" (pp. 73-74). In fact Boers et al.'s (2006) study with 32 college students majoring in English indicated that explicit teaching (or what they call noticing) of formulaic expressions, including collocations, led to better fluency and accuracy in L2 oral communication, a finding that prompted them to conclude that "helping learners build a repertoire of formulaic sequences can be a useful contribution to improving their oral proficiency" (p. 245). Xiao and McEnery (2006), whose contrastive study of Chinese and English collocations revealed that "a contrastive analysis of collocation ... would be useful to L2 learners" (p. 125), drew a similar conclusion.

Accordingly, the recommendation for policymakers concerned with university-level English education is to plan courses dealing with aspects of language proficiency, including collocational knowledge, throughout the period of English study rather than only during the first few semesters. Even if it is impossible to introduce courses dealing exclusively with collocations, developers may be able to revise existing materials so as to allocate time for the conscious presentation and teaching of collocations, which would serve the purpose at hand at least partly. Fortunately, a handful of excellent books are already available on the market such as Marks and Wooder (2007), McCarthy and O'Dell (2005), Dixson (2004), Koster and Limper (2000), and Rudska, Channell, Putseys, and Ostyn (1981), which can be of great value for teachers and learners. Also, more generally communicative books, especially those drawing on studies of cross-language collocations or informed by research on intra-lingual interference, would be desirable additions to the list. Although the question of which collocations to include and which to exclude in such books is thorny simply because it is "clearly impossible to teach all (or even most) of the collocations in a language" (Nesselhauf, 2003, p. 238), such criteria as frequency, range, and learnability may provide guidance in the selection process (Koprowski, 2005), as well as congruence and restriction. McAlpine and Myles (2003) interestingly proposed developing an online searchable dictionary of collocations in which "all the more or less fixed expressions ... [cohere] around a node word" (p. 81), with special value in situations when a learner would like to know what clusters are commonly used with a variety of nodes.

To conclude, for EFL/ESL learners to achieve an acceptable level of language proficiency and communicative competence, sound knowledge and flexible use of collocations in English seem indispensable; and in order for them to reach such a goal, the significance of collocations should receive increased attention from teaching experts and curriculum specialists, who should in turn promote teaching them and include them in syllabi. Offering explicit instruction on target language collocations (especially in EFL contexts), focusing more on the use and usage of collocations by both teachers and learners, and building more practice activities on collocations into relevant EFL coursebooks at all proficiency levels are among the most immediate implications with value for EFL practitioners. One of the most defensible conclusions on the teaching of collocations has been offered by Nesselhauf (2003), who asserts that "an L1-based approach to the teaching of collocations seems highly desirable" (p. 240).

Finally, as Xiao and McEnery (2006) rightly emphasized, "there is a pressing need for the cross-linguistic study of collocation ... to be pursued by researchers" (p. 127). The EFL/ESL research community can contribute by first identifying word combinations in the L2 (using findings from corpus linguistics) that are sufficiently predictable or statistically significant, and then by drawing on contrastive analysis, among other resources, to investigate various avenues by which the teaching and learning of collocations may be accomplished in the most cost-effective, convenient, and productive manner.
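
As a rough illustration of the first step mentioned above (identifying word combinations that co-occur more often than chance would predict), the following is a minimal sketch only, not a procedure used in this study. It assumes pointwise mutual information (PMI) as the association measure, which is just one of several measures used in corpus work, and the function name bigram_pmi, the min_count threshold, and the toy sentence are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) ).
    Pairs seen fewer than min_count times are skipped, because PMI
    is unreliable for very rare pairs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_words = len(tokens)
    n_pairs = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_pairs
        p_w1 = unigrams[w1] / n_words
        p_w2 = unigrams[w2] / n_words
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy "corpus" (hypothetical); a real study would use a large L2 corpus.
tokens = (
    "heavy rain fell and heavy rain stopped but the rain was not hard "
    "and the traffic was heavy"
).split()

scores = bigram_pmi(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

A positive score means the pair occurs together more often than its individual word frequencies would suggest. Candidate pairs found this way could then be compared with their L1 counterparts, in the spirit of the contrastive step described above.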
Appendix: The Test of Collocations

Choose the best English equivalent for the Farsi phrase.


a. do a mistake       b. do wrong          c. make a mistake
d. make wrong


a. take a shower      b. have a shower      c. get a shower
d. catch a shower


a. eat cold           b. get cold           c. get a cold
d. catch cold


a. take a decision    b. decide a decision   c. get a decision
d. catch a decision


a. hit sb a smile     b. give sb a smile     c. offer sb a smile
d. smile sb a smile


a. in a hurry         b. with a hurry        c. in hurry
d. with hurry


a. take time          b. get time            c. catch time
d. carry time


a. surprised at       b. surprised with      c. surprised from
d. surprised of


a. Sit a seat.        b. Do a seat.          c. Take a seat.
d. Use a seat.


a. take an exam       b. give an exam        c. sit an exam
d. do an exam


a. go by car          b. go with car         c. go by a car
d. go in car


a. tell a lie         b. lie a lie           c. say a lie
d. tell lie


a. give a guess       b. guess a guess       c. hit a guess
d. have a guess


a. get a brake        b. put on brakes       c. catch brakes
d. do brakes


a. get a diet         b. take a diet         c. go on a diet
d. catch a diet


a. throw a look at sth   b. drop a look at sth   c. give a look at sth
d. have a look at sth


a. lose the bus       b. miss the bus        c. give the bus
d. sell the bus


a. turn on the car    b. light the car       c. begin the car
d. start the car


a. serious rain       b. hard rain           c. great rain
d. heavy rain


a. from the one hand  b. on the one hand     c. from one side
d. on one side


a. get married with sb   b. marry with sb    c. get married to sb
d. marry sb


a. go for a walk      b. go a walk           c. go walking
d. go for walking


a. black tea          b. bold tea            c. colourful tea
d. dark tea


a. in dark glasses    b. with dark glasses   c. by dark glasses
d. in dark glass


a. by mistake         b. with mistake        c. mistakely
d. by wrong


a. make friends with sb      b. make a friend with sb
c. become a friend with sb   d. find friends with sb


a. make noise         b. do noise            c. make a noise
d. do a noise


a. leave a message b. put a message c. lay a message d. place a message


a. go bankrupt b. become bankrupt c. get bankrupt d. fall bankrupt


a. wear perfume b. hit perfume c. have perfume d. beat perfume


a. have a lot in common b. have a lot of commons
c. have many in common d. have many commons


a. as a result b. in result c. in a result d. at result


a. take drugs b. have medicine c. take medicine d. have drugs


a. Many thanks to you. b. Many thanks from you.
c. I thank from you a lot. d. Thanks you a lot.


a. tell the difference b. say the difference
c. put the difference d. place the difference


a. identical to each other b. identical each other
c. identical with each other d. identical as each other


a. shake hands with sb b. give hands with sb
c. give a hand with sb d. shake a hand with sb


a. broad shoulders    b. four shoulders      c. broad shoulder
d. four shoulder


a. broad-minded       b. broad-mind          c. light-minded
d. light-mind


a. put on weight      b. grow weight         c. add weight
d. increase weight


a. traffic lights     b. red light           c. guide light
d. green light


a. speed limit        b. allowed speed       c. limit speed
d. speed allowed


a. nine-to-five job   b. eight-to-four job  c. morning-to-afternoon job
d. seven-to-three job


a. civil war          b. inner war           c. internal war
d. internal fight


a. Ladies and Gentlemen   b. Gentlemen and Ladies   c. Sirs and Madams
d. Madams and Sirs


a. boarding card      b. flight card         c. airplane card
d. airport card


a. more or less       b. less or more        c. little or much
d. much or little


a. late or soon       b. soon or late        c. sooner or later
d. later or sooner


a. deep yellow        b. dark yellow         c. colourful yellow
d. bold yellow


a. pale orange        b. light orange        c. colourless orange
d. bright orange


a. dress a salad      b. decorate a salad    c. cover a salad
d. make a salad


a. strawberry plant   b. strawberry tree     c. strawberry bush
d. strawberry vine


a. apple pips         b. apple stones        c. apple seeds
d. apple nuclei


a. grow flowers       b. train flowers       c. bring up flowers
d. educate flowers


a. raise an objection   b. give an objection   c. do an objection
d. have a objection


a. calm voice         b. low voice           c. soft voice
d. relaxed voice

a. vinegar salty      b. salt vinegar        c. salt and vinegar
d. vinegar and salt


a. sister brother     b. brother and sister  c. sister and brother
d. brother sister


a. keep one's promise  b. do one's promise   c. operate one's promise
d. apply one's promise


a. first class information   b. brand-new information
c. first-hand information    d. business class information

Thanks very much for your cooperation.


Acknowledgment

I acknowledge Mohammad Saei Nia's sincere assistance in helping me to collect data from the high school sample.


References

Bachman, L.F. (1990). Fundamental considerations in language testing. Oxford, UK: Oxford University Press.

Bahumaid, S. (2006). Collocations in English-Arabic translation. Babel, 52, 133-152.

Baker, P., & McEnery, T. (2005). A corpus-based approach to discourses of refugees and asylum seekers in UN and newspaper texts. Journal of Language and Politics, 4, 197-226.

Bartning, I., & Hammarberg, B. (2007). The functions of high-frequency collocation in native and learner discourse: The case of French c'est and Swedish det är. IRAL, 45, 1-43.

Benson, M. (1985). Collocations and idioms. In R. Ilson (Ed.), Dictionaries, lexicography and language learning (pp. 61-68). Oxford, UK: Pergamon.

Boers, F., Eyckmans, J., Kappel, J., Stengers, H., & Demecheleer, M. (2006). Formulaic sequences and perceived oral proficiency: Putting a lexical approach to the test. Language Teaching Research, 10, 245-61.

Buteau, M.F. (1970). Students' errors and the learning of French as a second language: A pilot study. IRAL, 8, 133-145.

Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1, 1-47.

Carter, R. (1998). Vocabulary: Applied linguistic perspectives (2nd ed.). London: Routledge.

Corder, S.P. (1981). Error analysis and interlanguage. Oxford, UK: Oxford University Press.

Cruse, D.A. (1986). Lexical semantics. Cambridge, UK: Cambridge University Press.

Dixson, R.J. (2004). Essential idioms in English: Phrasal verbs and collocations. London: Longman.

Eastwood, J. (1999). Oxford practice grammar. Oxford, UK: Oxford University Press.

Firth, J.R. (1957). Papers in linguistics: 1934-1951. London: Oxford University Press.

Fries, C.C. (1945). Teaching and learning English as a foreign language. Ann Arbor, MI: University of Michigan Press.

Gairns, R., & Redman, S. (1986). Working with words: A guide to teaching and learning vocabulary. Cambridge, UK: Cambridge University Press.

Graney, J.M. (2000). Review of the article Teaching collocation: Further developments in the lexical approach. TESL-EJ, 4(4), 1-3.

Hoey, M. (1991). Patterns of lexis in text. Oxford, UK: Oxford University Press.

Hyland, K. (2008). As can be seen: Lexical bundles and disciplinary variation. English for Specific Purposes, 27, 4-21.

Jackson, H., & Ze Amvela, E. (2000). Words, meaning and vocabulary: An introduction to modern English lexicology. London: Continuum.

Keshavarz, M.H. (1999). Contrastive analysis and error analysis. Tehran: Rahnama Publications.

Koprowski, M. (2005). Investigating the usefulness of lexical phrases in contemporary coursebooks. ELT Journal, 59, 322-32.

Koster, J., & Limper, P. (2000). Exercises in collocational English. Münster: Aschendorff.

Krishnamurthy, R., Sinclair, J., Jones, S., & Daley, R. (2004). English collocation studies: The OSTI report. London: Continuum.

Lea, D. (Ed.). (2002). Oxford collocations dictionary for students of English. Oxford, UK: Oxford University Press.

Lewis, M. (Ed.). (1997). Implementing the lexical approach: Putting theory into practice. Hove, UK: Language Teaching Publications.

Lewis, M. (2000). Teaching collocation: Further developments in the lexical approach. Hove, UK: Language Teaching Publications.

Mahlberg, M. (2006). Lexical cohesion: Corpus linguistics theory and its application in English Language Teaching. International Journal of Corpus Linguistics, 11, 363-83.

Marks, J., & Wooder, A. (2007). Check your vocabulary for natural English collocations. London: A & C Black.

Matthews, P.H. (2007). Oxford concise dictionary of linguistics (2nd ed.). Oxford, UK: Oxford University Press.

McAlpine, J., & Myles, J. (2003). Capturing phraseology in an online dictionary for advanced users of English as a second language: A response to user needs. System, 31, 71-84.

McCarthy, M., & O'Dell, F. (2005). English collocations in use. Cambridge, UK: Cambridge University Press.

Moudraia, O. (2001). Lexical approach to second language teaching. Washington, DC: ERIC Clearinghouse on Languages and Linguistics. (ERIC Document Reproduction Service No. EDO-FL-01-02)

Nesselhauf, N. (2003). The use of collocations by advanced learners of English and some implications for teaching. Applied Linguistics, 24, 223-42.

Oller, J.W., & Ziahosseiny, S.M. (1970). The contrastive analysis hypothesis and spelling errors. Language Learning, 20, 183-89.

Phythian-Sence, C., & Wagner, R.K. (2007). Vocabulary acquisition: A primer. In R. K. Wagner, A.E. Muse, & K.R. Tannenbaum (Eds.), Vocabulary acquisition: Implications for reading comprehension (pp. 1-14). New York: Guilford Press.

Redman, S. (2003). English vocabulary in use: Pre-intermediate and intermediate (2nd ed.). Cambridge, UK: Cambridge University Press.

Richards, J.C., Platt, J., & Platt, H. (1992). Longman dictionary of language teaching and applied linguistics (2nd ed.). London: Longman.

Rudska, B., Channell, J., Putseys, Y., & Ostyn, P. (1981). The words you need. Oxford, UK: Macmillan.

Shei, C., & Pain, H. (2000). An ESL writer's collocational aid. Computer Assisted Language Learning, 13, 167-182.

Sonaiya, C.O. (1988). The lexicon in second language acquisition: A lexical approach to error analysis. Unpublished doctoral dissertation, Cornell University. Dissertation Abstracts International, 49(2), 247A.

Teubert, W. (2004). Using meaning, parallel corpora, and their implications for language teaching. In U. Connor & T.A. Upton (Eds.), Applied corpus linguistics: A multidimensional perspective (pp. 171-89). Amsterdam: Rodopi.

Xiao, R., & McEnery, T. (2006). Collocation, semantic prosody, and near synonymy: A cross-linguistic perspective. Applied Linguistics, 27, 103-29.

Wang, S. (2005). Corpus-based approaches and discourse analysis in relation to reduplication and repetition. Journal of Pragmatics, 37, 505-40.

Wanner, L., Bohnet, B., & Giereth, M. (2006). Making sense of collocations. Computer Speech and Language, 20, 609-24.

Widdowson, H.G. (2007). Discourse analysis. Cambridge, UK: Cambridge University Press.

Wolter, B. (2006). Lexical network structures and L2 vocabulary acquisition: The role of L1 lexical/conceptual knowledge. Applied Linguistics, 27, 741-47.

Wyatt, R. (2002). Check your vocabulary for English for the IELTS examination. Teddington, UK: Peter Collins Publishing.

Yarmohammadi, L. (1965). A contrastive study of modern English and modern Persian. Unpublished doctoral dissertation, Indiana University. Dissertation Abstracts: Section A. Humanities and Social Science, 28, 219A-220A.

The Author

Karim Sadeghi holds a doctorate in applied linguistics (language testing) from the University of East Anglia in Norwich, UK. His main research interests include alternative assessment, research evaluation, reading comprehension, and error analysis. He has presented papers at many international conferences and has also published in The Reading Matrix, Asian EFL Journal, Journal of Adolescent and Adult Literacy, and Reading in a Foreign Language. He was chosen as the Best Researcher of the Year by Urmia University Research Council in January 2007 and is currently serving as an editor of the Chinese EFL Journal and as an editorial member of the Iranian Journal of Language Studies.
Table 1
Characteristics of the Participants Involved in the Study

                                 Azari   Kurdish   Persian   Persian    Age
Group         Number    M     F    L1       L1        L1        L1      range

High School     30     30     0    26        4         0        30      17-18
Freshmen        24      3    21    23        1 (M)     0        24      18-23
Juniors         22      0    22    20        2         0        22      20-26
Total           76     33    43    69        7         0        76      17-26

Table 2
Linguistic Categorization of Items

Item Type                     Number   Percentage

Lexical                         48       80
Grammatical                     12       20
Verb + Noun                     27       45
Adjective + Noun                14       23.33
Adjective + Adjective            1        1.67
Noun and Noun                    3        5
Adjective or Adjective           2        3.33
Verb + Adverb + Prep + Noun      1        1.67
Prep + Noun                      5        8.33
Verb + Prep                      4        6.67
Adjective + Prep                 2        3.33
Noun + Prep                      1        1.67
Total                           60      100

Table 3
Performance of Participants on the Test of Collocations

                          % of incorrect    % of incorrect
                            responses         responses
                          traceable to      traceable to
                % of      Persian as L1     English (intra-lingual
              incorrect   (interlingual     interference) or
Groups        responses   interference)     other factors

High School     72.1          83.75              16.25
Freshmen        58.48         84.47              15.53
Juniors         56.83         86.68              13.32
Average         62.47         85.07              14.93

Table 4
Items Presenting the Greatest Challenge (over 90% Inaccuracy Level)

             % of incorrect
Group        items            Items incorrect

Low-         18.5             go bankrupt (100%), by mistake (96%),
proficient                    put on weight (96%), brother and
                              sister (96%), in a hurry (93%), in
                              dark glasses (93%), have a lot in
                              common (93%), go on a diet (92%),
                              civil war (92%), sooner or later
                              (92%), dress a salad (91%)

Mid-         12               many thanks to you (100%), tell the
proficient                    difference (100%), speed limit (96%),
                              nine-to-five job (91%), on the one
                              hand (91%), go bankrupt (91%), sooner
                              or later (91%)

High-        12               black tea (95%), go bankrupt (95%),
proficient                    tell the difference (95%), nine-to-five
                              job (95%), pale orange (95%), deep
                              yellow (91%), get a cold (91%)

Shared                        tell the difference (97.5%)
                              (incorrect choices: put the difference 78%,
                              say the difference 10.5%, place the
                              difference 9%)
                              go bankrupt (95.5%)
                              (incorrect choices: get bankrupt 34%, fall
                              bankrupt 33%, become bankrupt 28.5%)
                              nine-to-five job (93%)
                              (incorrect choices: morning-to-afternoon job
                              65%, eight-to-four job 21%, seven-to-three
                              job 7%)
                              sooner or later (91.5%)
                              (incorrect choices: late or soon 66.5%, soon
                              or late 17.5%, later or sooner 8%)

Low proficient: High School group; mid-proficient: Freshmen group;
high proficient: Juniors group.

Table 5
Items Presenting the Least Challenge (less than 10% Inaccuracy Level)

Group             % of incorrect items   Items incorrect

Low-proficient    1.5                    heavy rain (8%)
Mid-proficient    1.5                    take a shower (8%)
High-proficient   8                      make a mistake (0%), give
                                         sb a smile (0%), heavy
                                         rain (5%), Ladies and
                                         Gentlemen (9%), grow
                                         flowers (9%)
Items shared                             heavy rain (6.5%)
                                         (incorrect choices: hard rain
                                         4.5%, great rain 2%)

Low proficient: High School group; mid-proficient: Freshmen group;
high proficient: Juniors group.

Table 6
Items Functioning Variably for Diverse Groups

Item                   Difficult for   Easy for

Take a seat            HS: 78%         Others: 26.5%
Ladies and gentlemen   HS: 67%         Others: 17.5%
Take a shower          HS: 78%         Others: 10%
Leave a message        HS: 79%         Others: 41%
Civil war              Others: 88%     JU: 36%
Brother and sister     Others: 85%     JU: 48%
Heavy rain             FU: 63%         Others: 6.5%

HS: High School group; FU: Freshmen group; JU: Juniors group.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2009 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Sadeghi, Karim
Publication:TESL Canada Journal
Date:Mar 22, 2009