This page provides an index to the TalkBank conversation analysis (CA) data. These files include both traditional CA ("Heritage") transcripts and transcripts in the newer CA/CHAT format. All of the materials referenced here are naturalistic conversations amenable to CA. To download a corpus, click on the link in the Corpus column. To download its media, click on the link in the Media column.

We use stars to rate the extent to which the transcripts capture detailed conversational phenomena. Transcripts with only one star have not yet been transcribed in CA format, but they include interesting materials that could eventually be transcribed that way.

Corpus | Description | Media | Rating
SamtaleBank | Danish CA corpora from the DK-CLARIN Project. Register here for access to protected data. | SamtaleBank | *****
CallFriend | Phone calls in English, Spanish, French, and Japanese contributed by LDC. | CallFriend | **
Jefferson | NB and Watergate transcriptions by Gail Jefferson. Also MS-Word versions of the Watergate transcripts. | Jefferson | *****
SBCSAE | The Santa Barbara Corpus of Spoken American English. A wide variety of interactional types. | SBCSAE | ****
GulfWar | Radio call-in show discussions during the first day of the first Gulf War. | GulfWar | ***
Sakura | Videotaped conversations of groups of 4 Japanese college students, not yet in CA format. | Sakura | ***
SCoSE | The Saarbrücken Corpus of Spoken (American) English. | SCoSe | *
JOC | Eight conversations from a special issue of the Journal of Communication. | JOC | *
MOVIN | Conversations in Danish, German, French, English, and Italian. | MOVIN | ***
ClassBank | A wide variety of videotaped classroom lessons and work group interactions. | ClassBank | *
CMU | Conversations collected by students at CMU. These can only be used for teaching purposes. | CMU | *
Grimshaw | An hour-long dissertation defense. | Grimshaw | *
LIDES | Six corpora from multilingual groups engaged in code-switching. | LIDES | **