Updating AntConc for spoken and educational language data

Dates: 2018–19
Funding body: Japan Society for the Promotion of Science
Primary investigator: Robbie Love (Education)
Co-investigators: Laurence Anthony (Waseda University)
Language at Leeds satellites: Corpus Linguistics
Language at Leeds themes: Language and Computation

In corpus linguistics, a range of online and offline tools is available to researchers for the computational analysis of language data. These tools are designed primarily for written rather than spoken language: corpus linguistics has traditionally focused on written language because it is easier to gather and convert into a digital format for corpus analysis. Dr Love and other language researchers at the University of Leeds regularly use such tools to analyse written and spoken data, including educational materials, interview transcripts and transcripts of recorded lesson observations. However, because the tools are writing-centric, analysing spoken data is more complex, more time-consuming and ultimately more expensive than analysing written data. As a result, the full potential of spoken data analysis cannot always be realised.

Prof Anthony is the creator and developer of AntConc – a freeware, multiplatform corpus analysis toolkit for concordancing and text analysis – which is one of the most widely used corpus tools in the world. In this project, Dr Love and Prof Anthony worked together to add new search and processing functions to AntConc to make processing spoken language data easier and more effective.