The WAC Corpus Collection is a curated repository of academic and professional writing corpora gathered from a range of institutions, disciplines, and instructional contexts. Launched as an initiative of the WAC Clearinghouse Associate Publishers New Scholar Fellowship (2025-26) and housed within the WAC Clearinghouse, this collection is designed to increase access to large text datasets for researchers, teachers, and students interested in writing studies and writing analytics.
The corpora in this collection represent a variety of institutions and writing contexts. Some corpora (such as the University of South Carolina First-Year English corpus) are stored directly on this site, while others (such as the Michigan Corpus of Upper-Level Student Papers (MICUSP) and the British Academic Written English (BAWE) Corpus) are hosted externally and linked here for ease of access.
Each corpus page provides:
The WAC Corpus Collection makes writing corpora more visible, accessible, and pedagogically valuable. By centralizing these resources, the project:
Developed through the WAC Clearinghouse Associate Publishers New Scholar Fellowship, this collection advances the initiative's mission to expand shared research infrastructure for writing studies.
If you are interested in contributing a corpus to this collection, please contact Megan Kane (Assistant Editor, The Journal of Writing Analytics) at megan.kane@shu.edu, Duncan Buell (Co-Editor in Chief, The Journal of Writing Analytics) at duncan.buell@gmail.com, and Laura Aull (Research Leave 2025-2026) at laull@umich.edu. We welcome corpora representing a variety of genres and disciplines.