A function-first approach to identifying formulaic language in academic writing
There is currently much interest in creating pedagogically oriented descriptions of formulaic language. Research in this area has typically taken what we call a 'form-first' approach, in which formulas are identified as the most frequent recurrent forms in a relevant corpus. While this research continues to yield valuable results, the present paper argues that much can also be gained by taking a 'function-first' approach, in which a corpus is first annotated for communicative functions and formulas are then identified as the recurrent patterns associated with each function. We demonstrate this approach through a comparative analysis of introductions to student essays and research articles. Focusing on one particularly common communicative function, the analysis demonstrates that (1) this function is more common in student essays than in research articles; (2) both the choice to use the function and the choice of linguistic forms that realize it vary across subject areas in research articles, but not in student essays; (3) research articles tend to be more formulaic in expressing the function than student essays; and (4) some parts of the forms used are highly formulaic, while others are more open. The key formulas are described, and suggestions are made regarding their pedagogical presentation.