It is not uncommon to hear linguists lamenting the misrepresentation of language whenever linguistic issues are taken up by the media. Ironically, however, we have relatively little systematic understanding of the ways in which language is actually dealt with in, and by, those media. This paper focuses on methodological issues that arose in the context of a project that aimed to explore the ways in which themes relating to language and linguistics are represented in a corpus of articles gathered from two British newspapers, The Times and The Guardian.
We used the software programme WordSmith Tools (Scott, 2004) to identify those ‘key’ keywords that were most likely to occur in conjunction with the ‘node terms’ <language>, <languages>, <linguistic> and <linguistics>. This corpus-based methodology revealed a number of interesting ways in which language-related issues are debated in this particular sector of the print media. At the same time, as this paper discusses, our study raised some important methodological concerns in relation to the use of WordSmith Tools, the British National Corpus, and the construction of keyword lists.