Corpora and corpus tools

The Centre for Translation Studies has developed and hosts a range of large representative corpora in a variety of languages, including English, Arabic, Chinese, French, German, Italian, Japanese, Spanish, Polish and Russian. Some corpora are available in-house only, while others can be accessed freely. Please note that these corpora are temporarily unavailable outside the University of Leeds; we hope that access can be restored soon.

The School of Computing has developed and hosts a number of Arabic corpora, including the Corpus of Contemporary Arabic and the Quranic Arabic Corpus.

A range of corpus processing tools has also been developed at Leeds.