Tools for Analyzing Talk
Part 1: The CHAT Transcription Format
Brian MacWhinney
Carnegie Mellon University
January 13, 2023
https://doi.org/10.21415/3mhn-0z89
When citing the use of TalkBank and CHILDES facilities, please use this reference to the last printed version of the CHILDES manual:
MacWhinney, B. (2000). The CHILDES Project: Tools for Analyzing Talk. 3rd Edition. Mahwah, NJ: Lawrence Erlbaum Associates.
This allows us to track usage of the programs and data systematically through scholar.google.com.
1 Introduction
2 The CHILDES Project
2.1 Impressionistic Observation
2.2 Baby Biographies
2.3 Transcripts
2.4 Computers
2.5 Connectivity
3 From CHILDES to TalkBank
3.1 Three Tools
3.2 Shaping CHAT
3.3 Building CLAN
3.4 Constructing the Database
3.5 Dissemination
3.6 Funding
3.7 How to Use These Manuals
3.8 Changes
4 Principles
4.1 Computerization
4.2 Words of Caution
4.2.1 The Dominance of the Written Word
4.2.2 The Misuse of Standard Punctuation
4.2.3 Working With Video
4.3 Problems With Forced Decisions
4.4 Transcription and Coding
4.5 Three Goals
5 minCHAT
5.1 minCHAT – the Form of Files
5.2 minCHAT – Words and Utterances
5.3 Analyzing One Small File
5.4 Next Steps
5.5 Checking Syntactic Accuracy
6 Corpus Organization
6.1 File Naming
6.2 Metadata
6.3 The Documentation File
7 File Headers
7.1 Hidden Headers
7.2 Initial Headers
7.3 Participant-Specific Headers
7.4 Constant Headers
7.5 Changeable Headers
8 Words
8.1 The Main Line
8.2 Basic Words
8.3 Special Form Markers
8.4 Unidentifiable Material
8.5 Fragments and Fillers
8.6 Incomplete and Omitted Words
8.7 De-Identification, Anonymization, and Pseudonyms
8.8 Standardized Spellings
8.8.1 Letters
8.8.2 Compounds and Linkages
8.8.3 Capitalization and Acronyms
8.8.4 Numbers and Titles
8.8.5 Kinship Forms
8.8.6 Shortenings
8.8.7 Assimilations and Cliticizations
8.8.8 Communicators and Interjections
8.8.9 Spelling Variants
8.8.10 Colloquial Forms
8.8.11 Dialectal Variations
8.8.12 Baby Talk
8.8.13 Word separation in Japanese
8.8.14 Abbreviations in Dutch
9 Utterances
9.1 Children’s Utterances
9.2 Adult Utterances
9.3 Satellite Markers
9.4 Discourse Repetition
9.5 C-Units, sentences, utterances, and run-ons
9.6 Retracing
9.7 Basic Utterance Terminators
9.8 Separators
9.9 Tone Direction
9.10 Prosody Within Words
9.11 Local Events
9.11.1 Simple Events
9.11.2 Interposed Word &*
9.11.3 Complex Local Events
9.11.4 Pauses
9.11.5 Long Events
9.12 Special Utterance Terminators
9.13 Utterance Linkers
10 Scoped Symbols
10.1 Audio and Video Time Marks
10.2 Paralinguistic and Duration Scoping
10.3 Explanations and Alternatives
10.4 Retracing, Overlap, and Clauses
10.5 Error Marking
10.6 Precodes and Postcodes
11 Dependent Tiers
11.1 Standard Dependent Tiers
11.2 Synchrony Relations
12 CHAT-CA Transcription
13 Disfluency Transcription
14 Transcribing Aphasic Language
15 Arabic and Hebrew Transcription
16 Specific Applications
16.1 Code-Switching
16.2 Elicited Narratives and Picture Descriptions
16.3 Written Language
16.4 Sign and Speech
17 Speech Act Codes
17.1 Interchange Types
17.2 Illocutionary Force Codes
18 Error Coding
18.1 Word level error codes
18.1.1 Phonological errors [* p]
18.1.2 Semantic errors [* s]
18.1.3 Neologisms [* n]
18.1.4 Morphological errors [* m:a]