As a part of the ORAAL project, we have developed the first public corpus of AAL data, the Corpus of Regional African American Language (CORAAL). CORAAL features recorded speech from regional varieties of AAL and includes the audio recordings along with time-aligned orthographic transcription.
CORAAL is a long-term corpus-building project organized into several components. The first two components of CORAAL focus on AAL in Washington, DC, the nation’s capital, a city with a long-standing African American majority, and the site of much early research on AAL (e.g. Fasold 1972). In April 2018, the first additional component, CORAAL:PRV, was released, making available data for 16 speakers from a rural community in central North Carolina. In October 2018, we are pleased to release the newest component, CORAAL:ROC, a subset of sociolinguistic interviews conducted by Sharese King as a part of her dissertation project in Rochester, NY (King 2018). CORAAL v.2018.10.06 includes corrections (mostly minor edits for consistency) to transcripts throughout the entire dataset, and all users are urged to update to this newest version.
Together, the CORAAL components include data from over 140 sociolinguistic interviews with speakers born between 1891 and 2005, and over a million words of time-aligned transcription of conversational speech.
All interviews have been anonymized and orthographically transcribed with time-alignment at the utterance level. Audio is available in high-quality uncompressed (.wav) format, and transcripts are available in three formats: Praat TextGrid (.TextGrid) files, ELAN (.eaf) files, and plain text (.txt) files with tab-delimited fields.
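As a rough illustration of working with the plain text format, the following Python sketch reads a tab-delimited transcript and counts words per speaker. It assumes that each file begins with a header row, and the column names used here (Spkr, Content) are assumptions for the sake of the example; the function names are hypothetical, and the User Guide should be consulted for the actual field layout.

```python
import csv
from collections import Counter


def read_transcript(path):
    """Read a tab-delimited transcript into a list of dicts.

    Assumes the first row of the file is a header naming the fields.
    """
    with open(path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps quotation marks in the transcribed speech as
        # ordinary characters instead of treating them as CSV quoting.
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)


def words_per_speaker(rows, speaker_col="Spkr", text_col="Content"):
    """Count whitespace-separated tokens per speaker.

    The default column names are assumptions; pass the names that
    actually appear in the header row of your files.
    """
    counts = Counter()
    for row in rows:
        text = row[text_col] or ""  # short rows yield None for missing fields
        counts[row[speaker_col]] += len(text.split())
    return counts
```

For example, calling words_per_speaker(read_transcript("interview.txt")) on a transcript file (the file name is a placeholder) would return a mapping from each speaker label to a word count. Splitting on whitespace is only a crude approximation of a word count, since any transcription conventions in the text are counted as ordinary tokens.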
CORAAL is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike (4.0) International license (https://creativecommons.org/licenses/by-nc-sa/4.0/). It is available for free and is downloadable from the above link.
More information is available in the User Guide, and we suggest you read that document for full information about the corpus. As a part of their work on their forthcoming paper in American Speech, "Contextualizing the Corpus of Regional African American Language DC: AAL in the Nation's Capital", Charlie Farrington and Natalie Schilling prepared an extensive reference list of publications related to AAL in Washington, DC; you can access the reference list here.
Updates to CORAAL are planned approximately quarterly, with the next update scheduled for early 2019. As with the ORAAL project, CORAAL was developed with support from the National Science Foundation (Grant No. BCS-1358724) and the University of Oregon.
How to cite CORAAL
Kendall, Tyler and Charlie Farrington. 2018. The Corpus of Regional African American Language. Version 2018.10.06. Eugene, OR: The Online Resources for African American Language Project. http://oraal.uoregon.edu/coraal
See the CORAAL User Guide for information about citing CORAAL’s individual components.
Contact the CORAAL team
Contact the CORAAL development team with any questions or comments about CORAAL.