This is the preliminary schedule.  See further below for the abstracts.

9.20AM Welcome and Opening Remarks


Isobelle Clarke, Lancaster University

‘Approaching Discourse in Corpora using Multiple Correspondence Analysis’


Serge Sharoff, University of Leeds

‘Does Size Matter? Statistics from Large Corpora’


Eric Atwell, University of Leeds

‘AI4AI: Artificial Intelligence for Arabic and Islamic Corpus Linguistics’

11.10 REFRESHMENT BREAK (30 mins)


Fiona Douglas, University of Leeds

‘The Divisive Language of Union from Indyref to Brexit’


Richard Badger, University of Leeds

‘Transparency and Trustworthiness: a Study of the UK Government’s Press Conferences on Covid-19’


Mel Evans, University of Leeds

‘Unprecedented Communities, Diachronic Discourses, and Evolving Identities of #LongCovid’

1.20PM LUNCH (including BAAL SIG AGM) (1 hour)

Attendees to make own food arrangements


David Wright, Nottingham Trent University

‘Corpora and the Evolution of Forensic Linguistics’


Alison May, Mashael AlAmr, Yan Chen, Maram AlRabie and Min (Kayla) Wang, University of Leeds

‘Corpus-based Research in Forensic Linguistics: Data, Methods, and Knowledge Creation’



James Wilson, University of Leeds

‘“The corpus class”, “from ad-hoc corpora to personalised keyword lists” and “students as linguistic detectives”: A Look at How IntelliText Has Been Integrated in Russian Language Teaching at Leeds’


Yen Dang, University of Leeds

‘How Can Corpus Linguistics Help to Tackle the Lexical Challenge of Academic Listening?’


Alice Deignan and Duygu Candarli (Dundee University)

‘Using Corpora to Support UK School Students’



Abstracts booklet

Abstracts for the keynote talks:

Approaching Discourse in Corpora using Multiple Correspondence Analysis

Isobelle Clarke

Lancaster University

In this talk, I will introduce a new approach to the analysis of keywords (Clarke et al. 2021) and demonstrate how this approach can be used to examine variation in discourses over time (Clarke et al. forthcoming). Keywords can offer analytical signposts to discourses in large corpora, yet keywords do not map straightforwardly onto discourses. One major challenge with keyword analyses is aggregation – the keywords may all be associated with discourses present in the corpus but disaggregating the discourses in the keywords is a matter for the analyst. To make analysis more manageable, keywords are often grouped manually into semantic or thematic categories. Different approaches have been taken to achieve this; however, all approaches have limitations, and the creation of meaningful categories and the assignment of keywords to those categories often involves some element of compromise, especially as keywords may contribute to more than one discourse. Determining where this happens is, again, a matter for the analyst. Moreover, the processes behind the choices made in categorisation may not be explicitly documented, or may be implied rather than conforming to a set method. Another persistent issue with keyword studies is their focus on presence rather than absence, yet absence can be as meaningful as presence in discourse analysis (Schroeter & Taylor 2018) and patterns of presence and absence across a corpus may meaningfully interact (Partington 2014).

As a result, I developed a new approach with Tony McEnery and Gavin Brookes, called ‘Keyword Co-occurrence Analysis (KCA)’, which draws on Multiple Correspondence Analysis to group keywords statistically based on how they co-occur with each other in the texts of the corpus. My talk will show how KCA overcomes many of the issues in traditional keyword analyses, including absence, and has proven to be effective for providing a more nuanced account of keywords that is sensitive to the various senses and discourses that a single keyword can exhibit. Moreover, I will demonstrate the potential of the approach for investigating discourses over time.

Clarke, I., McEnery, T., & Brookes, G. (2022). Keywords through time: Tracking changes in press discourses of Islam. International Journal of Corpus Linguistics, forthcoming.
Clarke, I., McEnery, T., & Brookes, G. (2021). Multiple Correspondence Analysis, Newspaper Discourse and Subregister: A Case Study of Discourses of Islam in the British Press. Register Studies 3(1): 144–171.
Partington, A. (2014). Mind the gaps: The role of corpus linguistics in research absences. International Journal of Corpus Linguistics 19(1): 118–146.
Schroeter, M. & Taylor, C. (2018). Exploring Silence and Absence in Discourse: Empirical Approaches. London: Palgrave Macmillan.


Corpora and the Evolution of Forensic Linguistics

David Wright

Nottingham Trent University, UK

This talk examines the role that corpora and the use of corpus linguistic approaches have played in the expansion and development of forensic linguistics since the 1990s. As with many disciplines of language analysis, forensic linguistics has benefitted from access to larger datasets and the effective combination of quantitative and qualitative analyses of these datasets. This is especially true for areas of the field for which large amounts of data are readily available. However, many types of data that forensic linguists are often interested in can be difficult to access due to their sensitive and personal content or their general rarity in the public domain. This means that researchers committed to using corpus linguistics in forensic contexts are faced with challenges that are not as profound in other corpus-assisted disciplines. This talk will give an overview of how this apparent challenge has in fact given rise to developments that have advanced the study of forensic linguistics, including the use of existing general reference corpora and publicly available data, efforts underway to make new forensically relevant corpora available to researchers, and, most notably, the emergence of new directions for the field that widen the scope of ‘forensic linguistics’. In relation to the last point, a case study is presented in which corpus-assisted discourse analysis is used to examine how the proposal and passage of new laws have been reported in the British national press in the last twenty years. This case study is offered as an example of how forensic linguistics may be considered a socio-legal discipline, situating the study of language and law within its broad social, political and cultural context.