Tools for Analyzing Talk
Part 1: The CHAT Transcription Format
Brian MacWhinney
Carnegie Mellon University
October 10, 2024
https://doi.org/10.21415/3mhn-0z89
When citing the use of TalkBank and CHILDES facilities, please use this reference to the last printed version of the CHILDES manual:
MacWhinney, B. (2000). The CHILDES Project: Tools for Analyzing Talk. 3rd Edition. Mahwah, NJ: Lawrence Erlbaum Associates.
This allows us to track usage of the programs and data systematically through scholar.google.com.
1 Introduction
2 The CHILDES Project
2.1 Impressionistic Observation
2.2 Baby Biographies
2.3 Transcripts
2.4 Computers
2.5 Connectivity
3 From CHILDES to TalkBank
3.1 Three Tools
3.2 Shaping CHAT
3.3 Building CLAN
3.4 Constructing the Database
3.5 Dissemination
3.6 Funding
3.7 How to Use These Manuals
3.8 Changes
4 Principles
4.1 Computerization
4.2 Words of Caution
4.2.1 The Dominance of the Written Word
4.2.2 The Misuse of Standard Punctuation
4.2.3 Working With Video
4.3 Problems With Forced Decisions
4.4 Transcription and Coding
4.5 Three Goals
5 minCHAT
5.1 minCHAT – the Form of Files
5.2 minCHAT – Words and Utterances
5.3 Analyzing One Small File
5.4 Next Steps
5.5 Checking Syntactic Accuracy
6 Corpus Organization
6.1 File Naming
6.2 Metadata
6.3 The Documentation File
7 File Headers
7.1 Hidden Headers
7.2 Initial Headers
7.3 Participant-Specific Headers
7.4 Constant Headers
7.5 Changeable Headers
8 Words
8.1 The Main Line
8.2 Basic Words
8.3 Special Form Markers
8.4 Unidentifiable Material
8.5 Fragments, Fillers, and Nonwords
8.6 Incomplete and Omitted Words
8.7 De-Identification, Anonymization, and Pseudonyms
8.8 Standardized Spellings
8.8.1 Letters
8.8.2 Compounds and Linkages
8.8.3 Capitalization and Acronyms
8.8.4 Numbers and Titles
8.8.5 Kinship Forms
8.8.6 Shortenings
8.8.7 Assimilations and Cliticizations
8.8.8 Communicators and Interjections
8.8.9 Spelling Variants
8.8.10 Colloquial Forms
8.8.11 Dialectal Variations
8.8.12 Baby Talk
8.8.13 Word separation in Japanese
8.8.14 Abbreviations in Dutch
9 Utterances
9.1 Children’s Utterances
9.2 Adult Utterances
9.3 Satellite Markers
9.4 Discourse Repetition
9.5 C-Units, sentences, utterances, and run-ons
9.6 Basic Utterance Terminators
9.7 Separators
9.8 Tone Direction
9.9 Prosody Within Words
9.10 Local Events
9.10.1 Simple Events
9.10.2 Interposed Word &*
9.10.3 Complex Local Events
9.10.4 Pauses
9.10.5 Long Events
9.11 Special Utterance Terminators
9.12 Utterance Linkers
10 Scoped Symbols
10.1 Audio and Video Time Marks
10.2 Paralinguistic and Duration Scoping
10.3 Explanations and Alternatives
10.4 Retracing, Overlap, Exclusions, and Clauses
10.5 Error Marking
10.6 Precodes and Postcodes
11 Dependent Tiers
11.1 Standard Dependent Tiers
11.2 Synchrony Relations
12 CHAT-CA Transcription
13 Disfluency Transcription
14 Transcribing Aphasic Language
15 Arabic and Hebrew Transcription
16 Specific Applications
16.1 Code-Switching
16.2 Elicited Narratives and Picture Descriptions
16.3 Written Language