BAWE British Academic Written English


BAWE (British Academic Written English) and BAWE Plus Collections

Overview of BAWE

The British Academic Written English (BAWE) corpus was created through a project entitled 'An investigation of genres of assessed writing in British Higher Education', which ran from 2004 to 2007. The project was funded by the Economic and Social Research Council (project number RES-000-23-0800) and was a collaboration between the Universities of Warwick, Reading and Oxford Brookes.

The BAWE corpus contains 2761 pieces of proficient assessed student writing, ranging in length from about 500 words to about 5000 words. Holdings are fairly evenly distributed across four broad disciplinary areas (Arts and Humanities, Social Sciences, Life Sciences and Physical Sciences) and across four levels of study (three undergraduate years and taught masters level). Thirty-five disciplines are represented.

The assignments have been annotated using a system devised in accordance with the TEI guidelines. The header of each file includes factual information, such as the writer's gender and year of birth, and also contains some research findings from the initial team, such as genre family. A DTD file, tei_bawe.dtd, must be kept in the same folder as the corpus files. The holdings are described in an Excel spreadsheet, 'BAWE.xls', and the transcription and mark-up conventions are described in the BAWE manual, which is in PDF format.
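As an illustration of how the header information might be read programmatically, the following is a minimal Python sketch using only the standard library. It assumes that each corpus file is well-formed XML containing a TEI-style teiHeader element. The function name header_fields and the sample document, including its element names, attribute and values, are hypothetical and are not taken from the BAWE files; the BAWE manual remains the authority on the actual mark-up. Note that xml.etree.ElementTree does not validate against tei_bawe.dtd.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def header_fields(xml_text):
    """Return (tag, attributes, text) for each header element that has text.

    Hypothetical helper: it makes no assumption about which elements
    the header contains, it simply walks everything under teiHeader.
    """
    root = ET.fromstring(xml_text)
    header = next(
        (el for el in root.iter() if local_name(el.tag) == "teiHeader"),
        None,
    )
    if header is None:
        return []
    fields = []
    for el in header.iter():
        text = (el.text or "").strip()
        if text:
            fields.append((local_name(el.tag), dict(el.attrib), text))
    return fields


# Invented miniature document for demonstration only; real BAWE
# headers follow the conventions set out in the corpus manual.
SAMPLE = """<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Example assignment</title></titleStmt>
    </fileDesc>
    <profileDesc>
      <p n="gender">f</p>
    </profileDesc>
  </teiHeader>
  <text><body><p>Body text.</p></body></text>
</TEI.2>"""

for tag, attrs, text in header_fields(SAMPLE):
    print(tag, attrs, text)
```

To apply the same function to a file on disk, the file contents can be read into a string and passed to header_fields. Walking the header generically like this is a reasonable first step for discovering which fields are present before writing code that targets specific elements.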

The corpus is available free of charge to non-commercial researchers who agree to the conditions of use and who register with the Oxford Text Archive. The BAWE corpus can be accessed through the Oxford Text Archive as resource number 2539. It includes text files, a spreadsheet with contextual information, and a corpus manual.

For more information about the BAWE corpus, please email
