Strathy Corpus of Canadian English

The Strathy Corpus of Canadian English is now available online in two formats. Anyone can search the corpus at English-Corpora.org, where it is hosted alongside other English corpora, including the Corpus of Contemporary American English (COCA) and the British National Corpus (BNC). Researchers who would like to download a full digital copy of the corpus can request access through Borealis. (You will need to create an account and follow the instructions associated with the dataset.)

What is the Strathy Corpus of Canadian English?

The first director of the Strathy Language Unit, W.C. Lougheed, was determined that the unit's research on Canadian English have a strong descriptive base. To that end, and with great technological foresight, he began in the early 1980s to build a corpus of Canadian English: a planned sample of authentic language, stored as an electronic database. The original organizational scheme was based on the Brown-LOB Corpora.

Today the Strathy Corpus contains around 50 million words of written and spoken Canadian English, covering the years 1970-2010. It includes newspapers, magazines, biographies, historical writings, academic theses and journals, transcripts of university classes, Internet news and other sources. Canadian authors who have generously allowed their fiction and nonfiction texts to be entered into the database include Margaret Atwood, Max Braithwaite, J.K. Chambers, Robertson Davies, Eugene Forsey and Makeda Silvera. Publishers who have used the Strathy Corpus in creating Canadian English dictionaries include Oxford University Press, Thomson-Nelson (formerly Gage) and HarperCollins.