The current version defines two types of documents: the global elements below... The global types are available for reuse through schema type extension/restriction. The most up-to-date document definition is CHAT; it is also the richest in structure. Ideally, each group should develop a schema module defining the structure of its specific (class of) annotations; this schema should be an assembly of their definitions.
Developed by Romeo Anghelache, from the CHAT specifications, released under the GNU Public License, 2001. Continuing development by Franklin Chen.
structure of a CHAT document
@Participants; a structure enumerating the beings participating
31 March 1999 is formatted as 1999-03-31
an AIF document, see http://morph.ldc.upenn.edu/AG/doc/xml/
administrative descriptions, reused from Dublin Core
() in a word
unscoped code in the middle of an utterance; CHAT {...}
postcode at the end of an utterance; CHAT [+ ...]
allows semi-structured extensions to the current set of annotations
allows for identification of a user who made this annotation
inlined annotations, the conventional CHAT symbols are listed too
[!] [!!] [?] in CHAT, ( text ) in CA
[/] in CHAT
[//] in CHAT, - in CA
[///] in CHAT
[/?] [/-]
quicker tempo, no CHAT equivalent, used in CA
slower tempo, no CHAT equivalent, used in CA
larger volume, louder, no CHAT equivalent, used in CA
lower volume, no CHAT equivalent, used in CA
CA-style overlap
fmc fmc fmc fmc
mark overlap scoping
[>] [<]
[*] or [* text]
,, for %mor
For %mor
non verbal happenings
0 0word 0*word 00word
&; phonological fragment
&=; happening, such as sneeze
&*WHO=word; word spoken by someone else
a reference to a point/portion of a mute/action signal, e.g.
0 intended as a feature of a word, see also the CHAT conventional notations
@ap @b @c @cue @d @f @fp @fs @g @i @inf @ins @k @l @m @n @nv @o @p @pm @pr @q @sc @sas @si @sl @t @u @x @wp
a nonempty string
list of languages
syntactic structure
the unit of a %mor line corresponding to a word (this element belongs to a word element, but, if the precise correspondence is not yet established, these elements will be present at the utterance level (contained in an utterance)); %mor
part of speech
omitted, CHAT equivalent is 0
subcategory
a group of words in %mor or %trn; can be empty if associated with separator or terminator
a single word or a compound word
a compound word
structure used to let annotations belong to more than one word; can be recursive, although this is unnecessary: one can attach more than one annotation to a word, group of words, or whole utterances
a word
xx yy xxx yyy www
0 0word 0*word 00word
&; phonological fragment
&=; happening, such as sneeze
utterance initiators or linkers; they indicate the way to fit the current utterance with an earlier one, the CHAT conventional symbols are listed too
+" +"" +^ +< +, ++ +≋ +≈
a pointer to a selection in the single video/audio file associated with the transcript
frame second millisecond byte character
+ for mor
word#
=word (English translation)
morphemes
suffix marker, CHAT equivalent is -
suffix fusion marker, CHAT equivalent is &; morphological category, CHAT equivalent is :, when used after the stem
the beings along with their characteristics (age, sex...)
stress, blocking etc.
/ // /// : ^ internal ^ at beginning #, pause between words
[x number] in CHAT
, ,, ; : [c] clause-delimiter; period, question, exclamation; basic utterance terminator; tone terminator
+. +... +..? +!? +/. +/? +//. +//? +"/. +".
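The terminator symbols listed above can be separated from the utterance text with a longest-match check. The following is only a sketch under stated assumptions: the function name split_terminator is hypothetical, and it assumes the terminator is the last token of a plain-text CHAT main line. Neither assumption comes from the schema itself.

```python
from typing import Optional, Tuple

# Symbols copied from the terminator list above.
SPECIAL_TERMINATORS = ("+.", "+...", "+..?", "+!?", "+/.", "+/?",
                       "+//.", "+//?", '+"/.', '+".')
# Period, question, exclamation: the basic utterance terminators.
BASIC_TERMINATORS = (".", "?", "!")


def split_terminator(utterance: str) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: return (text, terminator) for one main line.

    The terminator is None when the line ends with none of the known symbols.
    """
    text = utterance.rstrip()
    # Try longer symbols first so that "+..." is not mistaken for a bare ".".
    candidates = sorted(SPECIAL_TERMINATORS + BASIC_TERMINATORS,
                        key=len, reverse=True)
    for symbol in candidates:
        if text.endswith(symbol):
            return text[: -len(symbol)].rstrip(), symbol
    return text, None


trailing_off = split_terminator("and then I +...")
question = split_terminator("what is that ?")
unterminated = split_terminator("mhm")
```

Sorting by length before matching is the only design decision here: every special terminator ends in a character that is also a basic terminator, so a shortest-first match would always misclassify them.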
For heritage only
structure used to let annotations belong to more than one word; can be recursive, although this is unnecessary: one can attach more than one annotation to a word, group of words, or whole utterances
Phonetic transcriptions of orthographic forms.
Collection of syllable constituents.
Specifies a syllable constituent. The type is one of constituentTypeType. Each constituent can consist of one or more phones identified by zero-based index of the parent phonetic representation. If two adjacent nuclei exist, diphthongMember controls the parsing of a hiatus.
Valid syllable constituent labels.
Syllable boundary marker (e.g., space, '.')
Syllable stress (i.e., primary or secondary)
Left appendix
Onset
Nucleus
Coda
Right appendix
Onset of an empty headed syllable
Ambisyllabic
Unknown
This type represents the alignment of two phonetic representations. The number -1 represents an indel (insertion-deletion point). Any number >= 0 is the index of a phone identified by the referenced syllabification element.
clitic or compound or reduplication markers in wordnet
compound, CHAT +
clitic, CHAT ~
hyphen, CHAT -
clitic separators in morphemics
preclitic, CHAT $
postclitic, CHAT ~
a group of utterances having something in common, usually the speaker
these are the (legacy) dependent tiers; the %mor line is now the <morphemics> element
%add %act %alt %cod; general purpose coding %coh; cohesion tier %com;[% text]; comments by investigator %eng %err; error coding [%exc ...] %exp; [= text] %flo %fac %gls %gpx %int %lan %ort %par: %: %pho: %pht: %mod: %def; on the main line, not recommended %sit %ssy %spa %spe %tim
arbitrary annotations of the form %xfoo, intended as an extension mechanism
%ton %rom %sdi %sch %sxx
# ## ###
fmc should change to xs:duration
For use with delimited material. A workaround for lack of overlapping elements in XML.
Begin delimited material
End delimited material
For use with delimited material. A workaround for lack of overlapping elements in XML.
Begin delimited material
End delimited material
Begin and end delimited material (degenerate case)
Underline arbitrary content
Long feature <TAG material TAG> for Santa Barbara; other begin/end features
Nonvocal <<TAG material TAG>> for Santa Barbara
CA delimited material
CA subword element
§ ° Ϋ
[: word1 ...] ["]
scoped symbols
plural after form marker
a pointer to a graphics file
a word spoken by some other speaker
equivalent of CHAT symbol @; category
Hack for CA heritage
the place to add research content
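The begin/end markers for delimited material are described above as a workaround for the lack of overlapping elements in XML. A minimal sketch of how a consumer might rebuild possibly overlapping spans from such a flat sequence follows. The token representation and the function name delimited_spans are assumptions made for illustration and are not defined by the schema; the labels "underline" and "long" are borrowed from the feature names listed above.

```python
from typing import Dict, List, Tuple

# Hypothetical flat token stream, in document order:
#   ("begin", label), ("end", label) or ("word", text)
Token = Tuple[str, str]


def delimited_spans(tokens: List[Token]) -> List[Tuple[str, List[str]]]:
    """Return (label, words) for every begin/end pair, in closing order.

    Spans may overlap, which nested XML elements alone cannot express.
    """
    open_spans: Dict[str, List[str]] = {}  # label -> words seen so far
    finished: List[Tuple[str, List[str]]] = []
    for kind, value in tokens:
        if kind == "begin":
            open_spans[value] = []
        elif kind == "end":
            finished.append((value, open_spans.pop(value)))
        else:
            # A word belongs to every span that is currently open.
            for words in open_spans.values():
                words.append(value)
    return finished


tokens = [
    ("begin", "underline"), ("word", "a"),
    ("begin", "long"), ("word", "b"),
    ("end", "underline"), ("word", "c"),
    ("end", "long"),
]
spans = delimited_spans(tokens)
```

In this example the "underline" span covers the words a and b while the "long" span covers b and c, an overlap that could not be written as properly nested elements.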