Language Learners' English Essays and Feedback Corpus (LEAF)

About the Corpus

The Language Learners’ English Essays and Feedback Corpus (LEAF) is a dataset of English essays paired with detailed feedback. Compiled in 2024 as a collaboration between university researchers and the Educational Testing Service (ETS), the corpus contains approximately 6,000 essay–feedback pairs collected from the EssayForum platform, where English language learners submit essays and receive written feedback from educators and reviewers.

In addition to human-written feedback, the corpus includes AI-augmented feedback, produced by revising GPT-4 feedback in light of the human comments to improve its specificity and instructional value. The corpus is split into training, development, and test sets and can be used to study writing development and feedback effectiveness.

Accessing the Corpus

The LEAF corpus is available in JSONL format through its GitHub repository: https://github.com/shabnam-b/LEAF

The JSONL version of the LEAF dataset can be converted into an analysis-ready CSV using the Python script provided in this repository: https://github.com/mkane968/exploring_leaf

The script downloads the dataset and exports a structured CSV file with columns including:

essay_title
essay_text
human_feedback_text
AI-augmented_feedback_text
split (train/dev/test)
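
As a rough illustration of the conversion step, the sketch below reads a JSONL file line by line and writes the columns listed above to a CSV file, using only the Python standard library. It is not the script from the exploring_leaf repository. It assumes that each JSONL record already uses the column names above as keys, which may not match the field names in the actual LEAF release, and the function name jsonl_to_csv is hypothetical.

```python
import csv
import json

# Column names taken from the list above. Whether the raw JSONL records use
# these same keys is an assumption; missing keys become empty strings.
COLUMNS = [
    "essay_title",
    "essay_text",
    "human_feedback_text",
    "AI-augmented_feedback_text",
    "split",
]


def jsonl_to_csv(jsonl_path, csv_path, columns=COLUMNS):
    """Convert a JSONL file to CSV, keeping only the given columns.

    Returns the number of records written.
    """
    count = 0
    with open(jsonl_path, encoding="utf-8") as src, \
            open(csv_path, "w", encoding="utf-8", newline="") as dst:
        writer = csv.DictWriter(dst, fieldnames=columns)
        writer.writeheader()
        for line in src:
            line = line.strip()
            if not line:  # skip blank lines between records
                continue
            record = json.loads(line)
            # Keep only the requested columns; fill gaps with "".
            writer.writerow({col: record.get(col, "") for col in columns})
            count += 1
    return count
```

With a local copy of one split, a call might look like jsonl_to_csv("leaf_train.jsonl", "leaf_train.csv"), where both file names are placeholders.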

Analyzing the Corpus

LEAF is well suited to computational studies of student writing and feedback, particularly studies of feedback effectiveness, writing development, and automated feedback generation.

Basic NLP with Voyant Tools (no coding required): Useful for exploring word frequency, key terms, collocations, and lexical patterns in essays and feedback. Researchers can compare human and AI-augmented feedback, identify common instructional phrases, and examine patterns such as references to grammar, organization, or argument. https://voyant-tools.org
Corpus Exploration and Feedback Analysis with Python: Enables loading the JSONL corpus, converting it to CSV, and analyzing the length, structure, and distribution of essays and feedback. Useful for comparing human and AI-augmented feedback and preparing datasets for NLP analysis. https://github.com/mkane968/exploring_leaf
Semantic Similarity and Retrieval with Python: Supports embedding essays and feedback to retrieve similar examples based on semantic similarity. Useful for studying feedback consistency and building retrieval-augmented feedback systems. https://www.sbert.net/examples/sentence_transformer/applications/semantic-search/README.html
Corpus Analysis with R: Supports tokenization, keyword analysis, n-grams, and comparison of linguistic patterns across essays and feedback types. Useful for identifying common feedback themes and analyzing instructional language. https://tutorials.quanteda.io
Text Mining and Feedback Pattern Analysis with R: Enables analysis of word frequency, bigrams, and distinctive vocabulary in human and AI-augmented feedback. Useful for studying differences in specificity, tone, and feedback focus. https://www.tidytextmining.com
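
The kind of length and word-frequency comparison described above can be sketched in a few lines of standard-library Python. The example below is a minimal starting point rather than a method from any of the linked tutorials. It assumes a CSV with the human_feedback_text and AI-augmented_feedback_text columns listed under Accessing the Corpus. The helper names and the simple lowercase regex tokenizer are illustrative choices.

```python
import csv
import re
from collections import Counter
from statistics import mean

# Very simple tokenizer: lowercase runs of letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return TOKEN_RE.findall(text.lower())


def load_rows(csv_path):
    """Read a CSV file into a list of dicts, one per essay-feedback pair."""
    with open(csv_path, encoding="utf-8", newline="") as f:
        return list(csv.DictReader(f))


def summarize_feedback(rows, column, top_n=10):
    """Return text count, mean token count, and top tokens for one column."""
    counts = Counter()
    lengths = []
    for row in rows:
        tokens = tokenize(row.get(column) or "")
        if not tokens:  # skip rows with no feedback in this column
            continue
        lengths.append(len(tokens))
        counts.update(tokens)
    return {
        "n_texts": len(lengths),
        "mean_tokens": mean(lengths) if lengths else 0.0,
        "top_tokens": counts.most_common(top_n),
    }
```

Calling summarize_feedback(rows, "human_feedback_text") and summarize_feedback(rows, "AI-augmented_feedback_text") on the same rows gives two summaries that can be compared side by side. For anything beyond a first look, a stopword list or one of the dedicated libraries above would likely be a better fit.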

Selected Research

Behzad, S., Kashefi, O., & Somasundaran, S. (2024). LEAF: Language Learners’ English Essays and Feedback Corpus. Proceedings of NAACL 2024, pages 433–442. https://aclanthology.org/2024.naacl-short.36.pdf