Learning and Teaching L2 Collocations: Insights from Research.

Collocations (e.g., make a mistake or strong coffee) are usually defined as "the occurrence of two or more words within a short space of each other in a text" (Sinclair, 1991, p. 170) and together with other types of multiword units (MWUs) they represent a larger phenomenon called formulaic language (Schmitt, 2004; Wood, 2015; Wray, 2002). It is generally accepted that such units enhance communication and play a crucial role in successful language use. However, developing L2 phrasal or collocational competence is a slow process that poses serious challenges for language learners. As explained by both Henriksen (2013) and Boers, Lindstromberg, and Eyckmans (2014), there are several issues that compound the difficulty of acquiring L2 collocational knowledge and these include, to name just a few, a lack of perceptual salience and deceptive transparency of many MWUs, cross-linguistic variability of collocational forms (e.g., delexicalized phrases such as make a mistake), irregular spacing of encounters with phrases, and a traditional focus on teaching individual words rather than MWUs. Consequently, one of the key questions for contemporary applied linguists is establishing the optimal conditions for L2 learners' acquisition of phrasal vocabulary (Ellis, Simpson-Vlach, Roemer, O'Donnell, & Wulff, 2015; Schmitt, 2010; Wood, 2015), and the subsequent paragraphs aim to synthesize the main findings in this area. The first section deals with intentional learning of L2 collocations in relation to different forms of classroom-based teaching. It is followed by a discussion of incidental learning and the role of repetition in developing L2 phraseological knowledge. The subsequent two sections discuss the learning and use of L2 collocations from the perspective of psycholinguistic and corpus research. Finally, the article outlines pedagogical implications and highlights avenues for future work.

Intentional Learning of Collocations

Intentional learning of collocations and other MWUs is an increasingly popular topic (e.g., Boers, Demecheleer, Coxhead, & Webb, 2014; Laufer & Girsai, 2008; Peters, 2016; Webb & Kagimoto, 2009), and there are a number of issues that researchers in this area have been addressing. One of them is the effectiveness of specific exercises when it comes to enhancing L2 collocational knowledge. A good illustration of such work is a recent study by Boers, Dang, and Strong (2017). Working with EFL learners in Vietnam, the authors explored the effects of three kinds of fill-in-the-blank exercises (selecting appropriate verbs as constituent elements of collocations, completing verbs within collocations with first-letter cues provided, and choosing whole phrases) on the acquisition of verb-noun collocations. Results indicated that studying collocations as holistic units, rather than attending to their individual components, was the most effective way of fostering L2 collocational knowledge, leading to gains at the level of form recall and meaning recall. Moreover, having compared 10 popular EFL textbook series in relation to their coverage of collocations, Boers and colleagues also revealed that as much as 23.5% of exercises in their sample took the form of reassembling broken phrases, which is a worrying finding given the fact that these activities had the lowest effectiveness in terms of increasing L2 learners' collocational knowledge. On the whole, then, this research suggests that teachers and language specialists need to pay more attention to the presence of phrasal vocabulary in textbooks and teaching materials as well as to the types of phrase-focused activities that are used to promote this aspect of language.

Another important line of collocational research has focused on the impact of input modifications on fostering L2 collocational knowledge. By way of example, Sonbul and Schmitt (2013) examined the acquisition of collocations by ESL learners in three different conditions: instructed (collocations taught in isolation), enhanced (collocations presented in red font and bolded), and unenhanced. In the treatment itself, the learners read a short passage seeded with 15 medical collocations (each of them occurring three times), followed by an immediate and delayed posttest. Notably, the study was innovative in that it included measures of both implicit and explicit collocational knowledge: implicit knowledge was measured via a priming lexical decision task, while explicit knowledge via traditional, pen-and-paper tests. Results indicated that the learners' explicit knowledge improved under all the treatment conditions, on both the immediate and delayed posttest. However, implicit gains occurred only in the enhanced group and were not retained on the delayed posttest, indicating the difficulty of acquiring such knowledge in classroom conditions.

In a similar study, Szudarski and Carter (2016) explored the effects of input flood and input enhancement on L1-Polish EFL learners' acquisition of collocations. The former was operationalized as an increased amount of exposure (6 or 12 repetitions) to verb-noun and adjective-noun phrases found in stories, while the latter involved reading the same texts but with the additional benefit of underlining as a way of increasing the perceptual salience of the target phrases. Delayed posttests conducted two weeks after the treatment revealed that the input enhancement group outperformed the input flood group, demonstrating collocational gains at both the receptive (form recognition) and productive (form recall) level. Crucially, these gains were made despite the fact that the target collocations contained low-frequency words, a factor that is known to compound the difficulty of building L2 phrasal vocabulary (Zhang, 2017). Thus, taking the reviewed studies together, it can be said that modifying the qualities of input can have a positive impact on learning L2 collocations, but the effectiveness of specific treatments is largely dependent on the aspects and types of collocational knowledge that are targeted.

Lastly, research into intentional learning of L2 collocations has also investigated the potential of raising learners' awareness of the phonological and orthographic properties of MWUs. Boers, Lindstromberg, and Webb (2014), for instance, documented that drawing learners' attention to alliteration within collocations (sound repetition within phrases as in green grass) can enhance the learning process, particularly in relation to fostering the knowledge of form. However, using a similar design, Boers, Eyckmans, and Lindstromberg (2014) failed to find evidence in favour of the facilitative role of sound patterning in promoting L2 collocations and compounds. There is thus a clear need for more research in this area in order to ascertain whether pedagogical interventions based on highlighting alliteration and rhymes within phrases can lead to gains in L2 phraseological competence and also how applicable such treatments are to various types of MWUs.

Incidental Learning and the Role of Repetition

Intentional learning is often contrasted with incidental learning, usually defined as learning something without awareness by engaging in meaning-focused communicative activities such as reading or listening (Hulstijn, 2003). While research into incidental learning of L2 collocations is still relatively scarce, there are several studies that have shed some light on the nature of this process. One of them is Webb, Newton, and Chang's (2013) classroom experiment based on EFL learners' reading of graded readers. For the duration of 15 weeks, L1-Chinese learners in Taiwan read and listened to a reader called New Yorkers, focusing their attention entirely on understanding the text. Additionally, by including different numbers of encounters with the target collocations (1, 5, 10, 15), the study explored not only the extent of incidental learning of collocations but also the role of repetition in the process. Results revealed that L2 collocations could be learned incidentally from reading-while-listening, with more repetition leading to significantly higher gains. However, given that only immediate posttests were employed, little is known about the durability of these gains and it is crucial that this issue is addressed in future research.

Pellicer-Sanchez (2017) is another study looking at incidental gains in L2 collocations. The author presented ESL learners at a British university with a text containing four or eight occurrences of adjective-pseudoword collocations (e.g., small berrow meaning small bowl) and asked them to read it for comprehension. One week after this treatment, the learners were given a combination of pen-and-paper and interview tests, and they revealed increased collocational knowledge at the level of form recall and recognition. Crucially, when it comes to the role of repetition, no significant differences were found between the gains after four and eight occurrences. This is an intriguing finding that contrasts with the results of Webb et al. (2013) and Peters (2014), who documented that more repetition led to higher collocational gains. Until more research becomes available, it can only be hypothesized that this lack of consistent repetition effects was caused by the difficulty of individual words constituting the target items or their uneven distribution across the texts presented to the students.

On the whole, however, the relationship between incidental learning of L2 vocabulary and repetition seems to be more complex than establishing the required threshold of encounters that would guarantee success. To be precise, while there is ample evidence showing that frequency plays a key role in the acquisition and processing of both words (e.g., Elgort, Brysbaert, Stevens, & Van Assche, 2017; Mohamed, 2017) and MWUs (e.g., Kim & Kim, 2012; Sonbul, 2015; Wolter & Gyllstad, 2013; Yi, Lu, & Ma, 2017), other factors need to be considered as well. As Ellis (2016) explains, "any frequency effect is modulated by context, recency, and salience" (p. 249), resulting in a complex and dynamic relationship of a range of variables including the order of encounters with phrases (both across texts and over time), their perceptual salience, and the prototypicality of their functions (for review, see Ellis & Wulff, 2015).

In light of these arguments, some researchers (e.g., Godfroid et al., 2017; Schmitt, 2010) have argued that the incidental learning of vocabulary is driven not only by repetition but also by learners' level of engagement with L2 material or, to use psycholinguistic terminology, the depth of its processing. Some evidence in favour of this hypothesis can be found in Gonzalez-Fernandez and Schmitt's (2015) recent study into factors affecting L2 collocational learning. Focusing on productive levels of mastery, the authors demonstrated that L1-Spanish EFL learners knew a sizeable number of collocations (out of 50 items, 56.6% were known), but, interestingly, this knowledge was found to correlate only moderately with the frequency of these phrases in the COCA corpus (.45). Instead, it was immersion in English-speaking countries and informal language usage (e.g., watching TV or social networking) that exhibited higher correlations with the learners' performance (.64 and .56 respectively), suggesting that the acquisition of L2 collocations and, arguably, other MWUs "relies on more than just frequency of exposure" (Gonzalez-Fernandez & Schmitt, 2015, p. 112). At the same time, however, it must be remembered that frequency information from general corpora such as COCA, while extremely helpful in both research and teaching, should be treated only as a proxy for L2 speakers' overall experience with language. As Conklin (in press) points out, L2 classroom input might be a better predictor of L2 performance, in particular in EFL contexts, where exposure to authentic English is often limited. In summary, while the frequency of encounters with phrasal vocabulary remains a strong predictor of incidental L2 learning, it is important that gains in L2 phraseological knowledge are also explored in relation to L2 learners' engagement with MWUs as well as the quality of the input they are exposed to.

Psycholinguistic Approaches

L2 collocational competence has also been the object of many psycholinguistic studies, including methodologies such as eye-tracking, tests of memory, or priming. Such approaches tap into attentional mechanisms underlying language processing and learning and therefore are able to provide more fine-grained insights into the changing nature of L2 learners' lexicon. Given the growing importance of such research, the following paragraphs present studies that represent the main strands of empirical work in this area.

The first example to be discussed is Choi's (2017) eye-tracking study examining the influence of input enhancement on L2 learners' knowledge of collocations. Using two groups of L1-Korean EFL learners, Choi recorded their eye movements as they read two versions of a text: a baseline and an enhanced one in which collocations were boldfaced. Crucially, the learners' task was to focus on understanding the meaning of the text, and their attention was not drawn to the presence of collocational pairs. One week after this treatment, a testing session followed, consisting of a form recall test of unenhanced content words and a form recall test of collocations. The eye-tracking data demonstrated that the enhanced collocations attracted more attention from the learners, resulting in higher scores on the collocation test. At the same time, however, the baseline (unenhanced) group outperformed the enhanced group in terms of recalling content information, pointing to a tradeoff between collocational gains and an ability to recall the textual content. This is in line with previous research (e.g., Han, Park, & Combs, 2008; Lee, 2007) and suggests that directing L2 learners' attention to specific language forms through input enhancement, while beneficial in some regards, can also have a detrimental effect on the understanding of meaning.

The work of Foster, Bolibaugh, and Kotula (2014) is another example of psycholinguistic research into the development of L2 phraseology. Using a battery of four instruments, the authors studied L2 learners' receptive mastery of MWUs as dependent on the effects of phonological short-term memory alongside a range of other key variables including learning context (ESL vs. EFL), age of onset (early vs. late starters), motivation, and engagement with the target language community. Participants consisted of 20 native and 79 non-native speakers of English (39 living in the UK and 40 living in Poland); their phraseological competence was tested by means of a detection test asking them to identify odd-sounding or unnatural expressions. Age of onset and learning context were found to be the strongest predictors of the L2 learners' performance, with only those learners who came to the UK at or before the age of 12 performing similarly to the native users. As regards the results of the memory test, they were also mediated by the effects of age and learning context, suggesting that for late L2 starters a good phonological short-term memory and immersion are two necessary conditions for achieving nativelike phraseological competence.

Another line of psycholinguistic research has examined the effects of congruency, or the presence of direct translational L1-L2 equivalents, on learning MWUs. To illustrate, Wolter and Gyllstad (2011) employed a primed lexical decision task to compare the processing of congruent and incongruent collocations by native and non-native (L1-Swedish) speakers of English. Whereas the former processed both types of phrases in a similar manner, the latter required significantly more time to deal with the incongruent collocations, indicating that L2-specific phrases with no direct equivalents in the learners' mother tongue constituted a particular processing burden. Moreover, L2 learners' problems with such collocations were also confirmed by the results of a pen-and-paper receptive test that showed that the incongruent items were less likely to be recognized as felicitous English phrases. Difficulties in the processing and learning of L2-specific collocations were also reported by Yamashita and Jiang (2010) and Szudarski and Conklin (2014), indicating that such phrases need to be given a special kind of attention in language teaching. Naturally, while designing exercises that highlight phraseological differences between English and other languages should be relatively simple in largely homogeneous groups of EFL learners, it might prove much more problematic in mixed and heterogeneous ESL contexts, where learners represent different L1s. However, the selection of difficult, pedagogically relevant phrases can be informed by corpus-based findings as well as teachers' intuitions and ratings (e.g., Martinez & Schmitt's, 2012, PHRASE List or Simpson-Vlach & Ellis's, 2010, Academic Formulas List).

Corpus Approaches

Since the advent of corpora, L2 collocational development has also been studied by juxtaposing the language of learners with that of native or expert users (for review, see Granger, 2015, or Ebeling & Hasselgard, 2015). Such comparisons have focused mainly on under-, over-, and misuse of collocations and other MWUs in learners' output, and this section aims to summarize the key research findings.

While there are numerous ways in which collocations can be extracted from corpora (Gablasova, Brezina, & McEnery, 2017), t-scores and mutual information (MI) have been the most frequently used measures; the former highlights the frequency of phrases, while the latter refers to the strength of association between two words. For instance, focusing on academic writing, Durrant and Schmitt (2009) showed that Turkish and Bulgarian EFL learners resembled native speakers with regard to the use of frequent collocations with high t-scores (e.g., good example, long way, hard work). However, they tended to underuse strongly associated high-MI collocations, which feature prominently in the language of L1 users (e.g., densely populated, bated breath, preconceived notions). Similarly, Bestgen and Granger (2014) assigned t-score and MI values to each pair of contiguous words found in ESL learners' essays and contrasted them with L1 data from a reference corpus. A longitudinal analysis (a comparison of the first and last essays written over one semester) revealed that even though the learners made improvement in terms of reducing the number of high-frequency items (e.g., I think, you know), they still produced fewer high-MI, tightly-associated collocations (e.g., alcoholic beverage, traffic jam). Thus, these findings show key differences in the use of phraseology by L1 and L2 users and indicate that even ESL learners, who are exposed to large amounts of natural English, may struggle to achieve L1 users' levels of collocational competence.
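To make the contrast between the two measures concrete, the following minimal sketch computes both scores for every pair of contiguous words in a tokenized text. It assumes the formulations commonly given in the corpus-linguistic literature, t = (O - E) / sqrt(O) and MI = log2(O / E), where O is the observed frequency of the pair and E = f(w1) x f(w2) / N is the frequency expected by chance in a corpus of N tokens. The cited studies may have used different corpus sizes, span settings, or variants of these formulas, so the function name association_scores and the toy corpus are purely illustrative and not taken from any of them.

```python
import math
from collections import Counter


def association_scores(tokens):
    """Return {(w1, w2): (t_score, mi)} for every contiguous word pair.

    Assumed formulas (common textbook versions, not necessarily those
    used in the studies reviewed here):
        expected = f(w1) * f(w2) / N
        t_score  = (observed - expected) / sqrt(observed)
        mi       = log2(observed / expected)
    N is taken to be the number of tokens in the corpus.
    """
    n = len(tokens)
    word_freq = Counter(tokens)
    pair_freq = Counter(zip(tokens, tokens[1:]))  # contiguous pairs only

    scores = {}
    for (w1, w2), observed in pair_freq.items():
        expected = word_freq[w1] * word_freq[w2] / n
        t_score = (observed - expected) / math.sqrt(observed)
        mi = math.log2(observed / expected)
        scores[(w1, w2)] = (t_score, mi)
    return scores


# Hypothetical toy corpus: one frequent pair whose first word also occurs
# elsewhere, and one rare pair whose words occur only together.
toy_tokens = ("good example " * 8 + "good idea " * 8 + "bated breath ").split()
toy_scores = association_scores(toy_tokens)

for pair in [("good", "example"), ("bated", "breath")]:
    t_score, mi = toy_scores[pair]
    print(pair, f"t-score={t_score:.2f}", f"MI={mi:.2f}")
```

In this toy corpus the frequent pair good example receives the higher t-score, whereas bated breath, which occurs only once but whose words never appear apart, receives the higher MI value. This mirrors the distinction drawn above between frequency-driven and association-driven measures, although values obtained from so small a sample are of course not comparable to those derived from a large reference corpus.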

The difficulty of acquiring L2 collocations has also been highlighted by corpus-based research examining the appropriateness or accuracy of learner production. By way of example, using a corpus of learner essays, Laufer and Waldman (2011) found that not only did Israeli EFL learners produce far fewer instances of collocations as compared to L1 users but they also made many collocational errors. Unfortunately, similar problems with L2 collocations have been reported by many other studies representing different groups of L2 learners (e.g., Nesselhauf, 2005; Paquot, 2008; Wang & Shaw, 2008). Furthermore, corpus research has also revealed that miscollocations in learner English continue to persist even after many years of learning English (Levitzky-Aviad & Laufer, 2013; Nesselhauf, 2005), with negative transfer or interference from learners' L1 being one of the main factors contributing to the erroneous use of L2 phrases. This confirms yet again that cross-linguistic differences in the use of phraseology deserve a special kind of attention in classroom instruction, pointing to a clear need to focus our efforts on raising L2 learners' awareness of different types of phraseological patterns.

It is also worth pointing out that corpus-based studies of collocations and MWUs have analyzed the use of L2 phraseology as a marker or token of proficiency. To illustrate, Granger and Bestgen (2014) reported that intermediate EFL learners, when compared to advanced learners, used a significantly lower number of tightly bound, high-MI collocations. Likewise, Parkinson's (2015) analysis showed that ESL learners were significantly more accurate in the use of collocations than EFL learners. This shows that the language produced by learners at higher proficiency levels is characterized by a more appropriate and varied use of MWUs and, as a growing body of research attests (e.g., Chen & Baker, 2016; Lenko-Szymanska, 2015), there is a clear relationship between the use of phraseology by L2 users on the one hand and the way their performance is assessed on the other. To sum up, the reviewed studies not only point to the usefulness of corpora for analyzing the quality of learner English, but, perhaps more crucially, they underline the importance of collocations and other formulaic units as a key element of successful L2 use.

Pedagogical Implications and Future Research

This article has presented the main research findings related to the learning and use of L2 collocations. It has demonstrated that acquiring such vocabulary is a multifaceted process that is dependent on a number of factors including the repetition and salience of such phrases in L2 input, the amount of exposure to English, context of learning, and phraseological differences between learners' L1 and L2. Moreover, using insights from psycholinguistic and corpus research, the article has also discussed L2 learners' difficulty in processing and producing collocations and other MWUs.

In terms of practical implications, no universal solutions can be offered, but the above overview demonstrates that L2 collocations can be acquired both incidentally and intentionally, with intentional learning resulting in bigger and faster gains. However, given the sheer volume of different types of MWUs, it is clear that most such units need to be acquired incidentally via usage-based learning and only a small selection can be taught explicitly. It appears that incongruent collocations or tightly bound phrases with key discourse functions should be regarded as strong candidates for classroom instruction. Moreover, the reviewed studies reveal that the choice of specific treatments for promoting collocations is largely determined by the level of mastery that is targeted and, while pedagogical approaches to phrasal vocabulary are likely to vary in local contexts, it is vital that the needs of specific groups of learners remain a key factor that influences our pedagogical decisions. As summarized by Meunier (2012), different teaching approaches might be required to target the various facets of L2 phraseological competence: "simple input enhancement for some, receptive focus on others, and more productively oriented approaches for some others" (p. 122). Lastly, when it comes to future research, it is vital that empirical work in the area of formulaic language continues to be interdisciplinary and multistrand, paying particular attention to learners' phraseological development over time (e.g., Siyanova-Chanturia, 2015; Yoon, 2016) and exploring phraseological competence in relation to other aspects of L2 proficiency such as age or learning context (e.g., Lenko-Szymanska, 2014). It is only when we gain a better understanding of the complex nature of learning L2 collocations and other MWUs that we will be able to offer more effective and research-informed language teaching.

The Author

Paweł Szudarski's research is based in the area of teaching English as a second/foreign language (TESOL) and corpus linguistics. More specifically, he focuses on the acquisition and use of vocabulary by second/foreign language learners. In the past, he was actively involved in the development of large-scale language assessment tools for English language learners in the USA. His other interests include corpus-based linguistic analysis, the status of English as a global language, and digital pedagogy.


