University of Essex


SouthEast Anglia Corpus

The SouthEast Anglia (SEA) corpus contains 349 digitally recorded sociolinguistic interviews from 38 different locations within and outside the UK. The corpus contains recordings of 192 females and 157 males born between 1904 and 1997, with half of the speakers born from 1980 onwards. Additional demographic background information about the speaker is available for 53% of the recordings, while 52% of all recordings have an accompanying transcript covering a portion of the interview.

201 of the interviews are with speakers from the SEA region, details of which can be seen in the map below.

Colchester, the location of 165 of the interviews, can be regarded as the main focus of the corpus. Indeed, when the corpus was created in 1998, its original aim was to provide a resource for the study of English in Colchester, and recordings from the town have been added to the corpus almost every year since then. For more information on the history of the corpus click here.

The corpus is used for research and teaching purposes, both by individual staff and students and by class groups. Two of the most recent studies using corpus data are an undergraduate linguistics project by Hannah Hughes titled "'He was like stop talking' - A Comparison of Rapidly Changing vs. Stable Variables in Colchester", which looked at the (ing) variable and the quotative variable in Colchester speakers, and a class project (LG 254, 2013/14) titled "Some[f]ing interesting in Colchester: th-fronting by more educated speakers with non-local parents", whose findings on voiceless th-fronting in Colchester were presented in poster form at the 10th annual UK Language Variation and Change Conference in September 2015.

While a large proportion of the interviews were conducted within the SEA region, the corpus also reflects the varied interests of its contributors: it represents speakers from the Isle of Wight, Northern Ireland, Scotland, Wales, the Midlands and South(western) England, as can be seen on the map below. Finally, there are additional recordings from the east coast of America.

All of the interviews in the corpus have been conducted by students at the University of Essex, with new interviews being added annually. All interviews are available to University of Essex students, who can access them with their computer login details. Those who are not University of Essex students cannot log in to view the corpus, but can request permission by contacting Dr. Chand, SEA Corpus Manager at

The website can be used to search for interviews that fit specific criteria for region, speaker age or recording year. Alternatively, interviews can be filtered by narrower criteria using this spreadsheet, which details location, age, gender, birth year and recording year, as well as finer details such as the sound quality and length of the audio files and whether the recordings have an accompanying transcript, transcription conventions or style shifting data. An additional "how to" guide has been written to help you use the SEA corpus; this can be downloaded from here.
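For users who prefer to filter the spreadsheet programmatically, the kind of selection described above could look something like the following minimal sketch. It assumes the spreadsheet has been exported to CSV; the column names (`location`, `gender`, `birth_year`, `recording_year`, `has_transcript`), the sample rows and the `filter_interviews` helper are all hypothetical and are not taken from the actual SEA corpus files.

```python
import csv
import io

# Hypothetical export of the metadata spreadsheet. The column names and rows
# below are invented for illustration and are not taken from the SEA corpus.
SAMPLE_CSV = """location,gender,birth_year,recording_year,has_transcript
Colchester,F,1985,2010,yes
Colchester,M,1950,2003,no
Ipswich,F,1992,2014,yes
"""


def filter_interviews(rows, location=None, born_from=None, transcript_only=False):
    """Return the rows that satisfy every criterion that was supplied."""
    matches = []
    for row in rows:
        if location is not None and row["location"] != location:
            continue
        if born_from is not None and int(row["birth_year"]) < born_from:
            continue
        if transcript_only and row["has_transcript"] != "yes":
            continue
        matches.append(row)
    return matches


# Read the (hypothetical) spreadsheet export into a list of dictionaries.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Example: Colchester speakers born from 1980 onwards who have a transcript.
selected = filter_interviews(
    rows, location="Colchester", born_from=1980, transcript_only=True
)
for row in selected:
    print(row["location"], row["birth_year"], row["recording_year"])
```

To use this with a real export, replace `io.StringIO(SAMPLE_CSV)` with an opened file and adjust the column names to match the headers in the downloaded spreadsheet.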

Downloadable files