A full definition of the CHAT format. Developed by Romeo Anghelache, from the CHAT specifications, released under the GNU Public License, 2001. Continuing development by Franklin Chen.
A single CHAT transcript.
List of the participants in the transcript along with their individual attributes. Every utterance in the transcript must be identified by a unique listed participant.
Begin a gem (requires matching end of gem).
Label for a begin/end gem.
End a gem (requires earlier begin of gem).
Label for a begin/end gem.
Begin a lazy gem; does not require a matching end gem, but its scope is up to the next lazy gem header, or the end of the transcript if there is no further lazy gem header.
Label for a lazy gem.
Id for transcript, usually derived from the file name.
Version of the XML Schema this transcript was created for.
Date of transcription. Note that there can only be one date. A session spread out over multiple dates must be split into multiple CHAT transcripts.
Every transcript must be part of a corpus.
The transcript may be associated with at most one media file.
The transcript may be associated with at most one media file.
The main languages used in the transcript. (Other languages used only in specific words do not need to be listed.)
Information about text color mappings for use by the CLAN editor.
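The gem rules above (a begin gem needs a matching end gem, an end gem needs an earlier begin gem, and a lazy gem extends to the next lazy gem header or the end of the transcript) can be sketched as follows. This is only an illustration under stated assumptions: the event tuples and the names `check_gems` and `lazy_gem_scopes` are hypothetical and not part of the schema, and matching begin/end gems by identical label is an assumption rather than something the descriptions above confirm.

```python
from typing import List, Tuple

# Hypothetical in-memory form of a transcript: (kind, label_or_text) pairs,
# where kind is "begin_gem", "end_gem", "lazy_gem", or "utterance".
Event = Tuple[str, str]


def check_gems(events: List[Event]) -> List[str]:
    """Report begin/end gems that have no partner (matched by label: an assumption)."""
    open_labels: List[str] = []
    errors: List[str] = []
    for index, (kind, label) in enumerate(events):
        if kind == "begin_gem":
            open_labels.append(label)
        elif kind == "end_gem":
            if label in open_labels:
                open_labels.remove(label)
            else:
                errors.append(f"event {index}: end gem '{label}' has no earlier begin gem")
    # Any label still open was begun but never ended.
    errors.extend(f"begin gem '{label}' has no matching end gem" for label in open_labels)
    return errors


def lazy_gem_scopes(events: List[Event]) -> List[Tuple[str, int, int]]:
    """Return (label, start, stop) for each lazy gem; events[start:stop] is its scope."""
    scopes: List[Tuple[str, int, int]] = []
    current = None  # (label, index of first event inside the scope)
    for index, (kind, label) in enumerate(events):
        if kind == "lazy_gem":
            if current is not None:
                # The previous lazy gem ends where the next lazy gem header begins.
                scopes.append((current[0], current[1], index))
            current = (label, index + 1)
    if current is not None:
        # No further lazy gem header: the scope runs to the end of the transcript.
        scopes.append((current[0], current[1], len(events)))
    return scopes
```

For example, `lazy_gem_scopes` applied to a lazy gem "play", one utterance, a lazy gem "lunch", and two utterances yields one scope covering the first utterance and one covering the last two.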
The font to be used for display in the CLAN editor.
Key to ensure uniqueness among u elements. These IDs are used externally for double-blind transcription.
Key to ensure all participants have unique ids.
KeyRef to ensure that utterances refer to an actual participant.
An AIF document; see http://morph.ldc.upenn.edu/AG/doc/xml/
Administrative descriptions, reused from Dublin Core.
Unspoken segment in a word.
Unscoped complex local events in the middle of an utterance.
Code that can only occur at the end of an utterance. Currently arbitrary information, although there are some conventions.
Scoped annotation that applies to a whole utterance.
Open-ended user-specifiable annotation subtype.
Allows for identification of a user who made this annotation. (Not currently supported in CHAT.)
Scoped annotation that applies to a group.
A comment header.
The allowable types of comment headers.
Activities.
Bck.
Date of transcription.
Exceptions.
Interaction type.
Number.
Recording quality.
Transcription.
Blank.
Thumbnail.
Comment.
Mark the beginning of a transcript-wide scope in which all material is in the specified language (which must be a proper ISO code listed in the languages header).
Location.
New episode.
Room layout.
Situation.
Tape location.
Time duration.
Time start.
Transcriber for this transcript.
Warning.
Page.
Retracing and other markers.
[!]
[!!]
[?] in CHAT, ( text ) in CA
[/] in CHAT
[//] in CHAT, - in CA
[///] in CHAT
[/?]
[/-]
CA-style overlap
Integer label to distinguish among different overlaps over the same text.
Start or end of overlap
Start
End
The first of a set of overlaps.
The second (or third, etc.) of a set of overlaps.
Mark a scope for overlaps.
Integer label to distinguish among different overlaps over the same text.
[>]
[<]
[*] or [* text]
Tag marker, used in both main line and %mor.
A tag marker on %mor line.
Nonverbal event.
0
&=; happening, such as sneeze
A nonempty string. Allowable media name.
A list of languages, using the official ISO codes.
The unit of a %mor line corresponding to a single non-compound word on the main line.
%mor part of speech
%mor POS subcategory
GRASP data for a single word
A group of words in %mor or %trn.
%mor unit of one-to-one correspondence with main line. A single word or a compound word or a terminator.
mor preclitic
mor postclitic
a compound word in mor using +
A group of material that is annotated. May be nested, i.e., a group may contain groups as well as words and other material.
What language(s) a word is in (if not the one in default scope).
Word is to be interpreted in a single language.
Word is a combination of many languages.
Word can be interpreted as one of many languages.
A word. Note that there are lexical restrictions on what characters are allowed in the text of a word.
word# indicates the word is a separated prefix
@z:code user-specified code
-s and similar after a form marker
Form marker: an attribute for a word.
@b
@c
@d
@f
@fp
@fs
@g
@i
@k
@l
@n
@nv
@o
@p
@pm
@q
@sas
@si
@sl
@t
@u
@x
@wp
Optional attribute for a word.
xx
xxx
yy
yyy
www
0
0word
00word
&; phonological fragment
Utterance initiators or linkers; they indicate the way to fit the current utterance with an earlier one.
+"
+^
+<
+,
++
+≋
+≈
Used before a terminator. Media bullet is allowed at the end of an utterance only if after a terminator.
A pointer to a selection in the single video/audio file associated with the transcript.
The start time for the selection.
The end time for the selection.
The unit of time used for the selection.
Whether the CLAN editor should skip upon playback.
Type of external media referenced.
The time unit: frame, second, millisecond, byte, character.
word#
=word (English translation)
morphemes
suffix marker, CHAT equivalent is -suffix
suffix fusion marker, CHAT equivalent is &suffix
morphological category, CHAT equivalent is :suffix
Information about a participant
Speaker's id.
Speaker's role.
Speaker's name.
Speaker's age, start of range during transcript.
Speaker's age, end of range during transcript.
Speaker's group.
Speaker's sex.
Speaker's SES.
Speaker's education.
Custom field for additional information about speaker.
Speaker's birth date.
Speaker's list of languages. Actually redundant because duplicated from transcript's list of languages.
Speaker's first language (note that this does not need to be listed in the languages header).
Speaker's birthplace.
Prosody: stress, blocking etc.
:
^ internal
^ at beginning
Pause at a point in an utterance.
[x number] in CHAT
Separator or tone direction marker.
,
;
:
[c] clause-delimiter
Terminator for an utterance.
Period.
Question mark.
Exclamation point.
+.
+...
+..?
+!?
+/.
+/?
+//.
+//?
+"/.
+".
For heritage only
Main line terminator with optional %mor information.
Terminator on %mor line, important for %gra.
Group purely for sign annotation purposes. Note that there are no main line annotations allowed for the group. Grouping brackets 〔 and 〕 may be used.
Atomic unit on the %sin tier.
Group purely for phonetic annotation purposes. Note that there are no main line annotations allowed for the group.
Syllable stress. Either primary or secondary.
Primary stress, Unicode 0x02c8
Secondary stress, Unicode 0x02cc
Phonetic transcriptions of orthographic forms.
Phonetic transcription for a word.
A phone in the IPA transcription. A phone may consist of one or more Unicode characters.
Specifies a syllable constituent. The type is one of constituentTypeType. Each constituent can consist of one or more phones identified by zero-based index of the parent phonetic rep.
The syllable constituent type for this phone.
Each phone is required to have a locally unique id, i.e., sibling ph elements cannot have the same id.
Used when two ph elements with sctype of 'N' are adjacent. If hiatus is true, each nucleus is the root of its own syllable. If hiatus is false, the pair of nuclei are considered a diphthong.
Valid syllable constituent labels.
Syllable boundary marker ('.')
Syllable stress (i.e., primary or secondary). Deprecated: use stress attribute of syll_start instead.
Left appendix
Onset
Nucleus
Coda
Right appendix
Onset of an empty headed syllable
Ambisyllabic
Unknown
This type represents the alignment of two phonetic representations.
Clitic or compound marker inside a word.
compound, CHAT +
clitic, CHAT ~
A group of utterances having something in common, usually the speaker.
A single utterance, along with all dependent information.
The language the entire utterance is in (unless individual words' languages are overridden explicitly).
The speaker of the utterance.
A unique ID is provided for each utterance in a transcript, for use by tools. Note that the text format of CHAT does not currently support this, and CLAN does not know about it.
Utterance annotation type.
%add
%act
%alt
%cod; general purpose coding
%coh; cohesion tier
%com; comments by investigator
%eng
%err; error coding
%exp; [= text]
%flo
%fac
%gls
%gpx
%int
%lan
%ort
%par:
%def; on the main line, not recommended
%sit
%spa
%tim
Arbitrary annotation of the form %xfoo, intended as an extension mechanism for the user.
Type of group annotation on main line.
[%act: text]
[=? text]
[% text]
[= text]
[=! text]
[%sdi: text]
[%sch: text]
[%sxx: text]
Length of a pause, in nonnumeric terms.
(.)
(..)
(...)
Pause length, in numeric terms.
For use for delimited material. A workaround for lack of overlapping elements in XML.
Begin delimited material
End delimited material
For use for delimited material. A workaround for lack of overlapping elements in XML.
Begin delimited material
End delimited material
Begin and end delimited material (degenerate case)
Mark as underlined arbitrary content, for presentation purposes in CLAN.
Mark as italicized arbitrary content, for presentation purposes in CLAN.
Long event.
Nonvocal material.
Either begin, end, or simple.
CA delimited material with begin/end.
CA subwords that must occur inside a word.
ˈ
ˌ
§
°
Ϋ
[: word1 ...]; indicate replacement of a word by one or more words instead. [:: word1 ...] to indicate that the word is a real word
[:: word1 ...] indicates that the word was real and MOR should analyze it
["]; quoted material.
Legal speaker ID for identifying utterances.
Transcript-scoped option that affects the interpretation of the transcript.
Allows CA features and restriction relaxations.
Turns off checking of time sequence of bullets.
Allows a transcript to be accepted even though it is not properly parsable yet.
Purely for the display purposes of CLAN, in order to have CLAN handle multiple media bullets in a single utterance.
IPA.
Sign language.
Allow the use of capital letters inside words, relaxing the usual restrictions (which have their own exceptions depending on the language in scope).
A reference to a graphics file.
&*WHO=word; word spoken by someone else during an utterance.
Morphological category
Morphological stem, alphanumeric
Hack for CA heritage; unparsed text.
Language code restricted to only three instead of one to eight characters.
Allowable roles.
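Several of the constraints described in this definition are referential: participant ids must be unique, every utterance must refer to a listed participant, and language codes are restricted to three characters. The following is a minimal sketch of those checks over a hypothetical dictionary form of a transcript. The keys `languages`, `participants`, `utterances`, `id`, and `who`, and the function name `check_transcript`, are assumptions made for illustration, as is the reading of "three characters" as three ASCII letters.

```python
import re
from typing import Dict, List

# Assumption: a valid language code is exactly three ASCII letters.
LANG_CODE = re.compile(r"[A-Za-z]{3}")


def check_transcript(transcript: Dict) -> List[str]:
    """Return human-readable problems found in a hypothetical transcript dict."""
    errors: List[str] = []

    # Language codes: three characters rather than one to eight.
    for code in transcript.get("languages", []):
        if not LANG_CODE.fullmatch(code):
            errors.append(f"language code '{code}' is not three letters")

    # Key: all participants must have unique ids.
    seen = set()
    for participant in transcript.get("participants", []):
        pid = participant["id"]
        if pid in seen:
            errors.append(f"duplicate participant id '{pid}'")
        seen.add(pid)

    # KeyRef: every utterance must refer to an actual participant.
    for index, utterance in enumerate(transcript.get("utterances", [])):
        if utterance["who"] not in seen:
            errors.append(f"utterance {index}: unknown speaker '{utterance['who']}'")

    return errors
```

In a real pipeline these checks would normally be performed by validating the XML against the schema itself; the sketch only makes the intent of the key, keyref, and language code restrictions concrete.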