CHILDES Clinical Spanish ChromoLang Corpus


Maite Fernández-Urquiza
Linguistics
University of Oviedo


Antonio Benítez-Burraco
Linguistics
University of Sevilla


Maria Salud Jiménez Romero
Education
University of Córdoba


Participants: 8
Type of Study: toy play
Location: Spain
Media type: video
DOI: doi:10.21415/p29g-4e87

Browsable transcripts

Download transcripts

Link to media folder

Because ChromoLang uses special codes and because the transcripts are not linked to the video, the full collection of original transcripts is available in this zip file. The files in the browsable database were created through ASR and are mainly useful for playing back the video, with an emphasis on the parents' speech. The correct, detailed encoding of the speech is in the ChromoTrans files. For privacy reasons, access to the video requires permission from the data contributors.

Citation information

In addition, some bibliographical references must be cited when using the different transcripts or video files of the CHROMOLANG corpus:

Project Description

The acronym CHROMOLANG stands for CHROMOsomopathies and LANGuage. The corpus collects conversations among 8 Spanish children (age range 3;4 – 11;6) with rare chromosomal disorders and their close relatives and/or speech therapists in naturalistic environments. It consists of 21 transcripts with their corresponding videos, distributed as follows:

The clinical history of each informant, along with the results of extensive molecular and cytogenetic analyses, and of multiple linguistic, cognitive and behavioral standardized assessments can be found in the papers linked to each of the transcripts in the previous section. We cannot provide the same information for DAR, GVA, and ECC, as these studies are still pending publication.

The total duration of the conversations is about two hours, with an approximate average of 15 minutes per informant. The data were collected as part of the research project led by Antonio Benítez-Burraco with funding from the Spanish Ministry with competences in Science (MINECO) (REF: FFI2016-78034-C2-2-P) within the State Plan for the Promotion of Scientific and Technical Research of Excellence. We studied children with various low-prevalence chromosomal disorders from a genetic, neuropsychological, and communicative perspective, with the aim of contributing to the description of the cognitive, linguistic, and communicative impairments associated with their genetic abnormalities.

Data collection protocol

Dr. Mª Salud Jiménez Romero, a speech therapist with the CHROMOLANG project, contacted the participants. Some of the speech samples were collected at the clinic, always in the presence of the participants' parents and siblings, and once the children had become accustomed to the speech therapist's presence. In other cases, the children's parents were asked to video record them during conversations with family members in familiar settings. A detailed description of the methodological protocol used to carry out these case studies can be found in the publications cited above.

Codes

The codes used in the main and dependent tiers of the transcripts are summarized and exemplified in this table. The first four codes in the table refer to phonetic phenomena characteristic of the dialectal varieties of Andalusian Spanish. The majority of the informants in the ChromoLang corpus are children from different regions of Andalusia, making it essential to consider their dialectal characteristics. To this end, it is crucial to pay attention to how their parents and close relatives produce certain phonemes (mainly /s/, /l/, /ɾ/, /r/) that tend to disappear, be aspirated, or assimilate in the post-nuclear syllable margin, depending on the phonetic context, and whose omission may or may not lead to an increase in the opening of the preceding vowel. When the informant or their family members produce any of these phonetic phenomena [ħ, pħ, tħ, cħ, pp, tt, l-l, A, E, O], we additionally code the word with the symbol @d.
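As an illustration of how the @d code might be used in analysis, the following minimal Python sketch counts @d-coded words per speaker. It rests on assumptions not stated above: that the transcripts follow the usual CHAT layout (main tiers start with `*`, dependent tiers with `%`, headers with `@`) and that @d is attached directly to the end of the coded word. The function name and the sample lines are hypothetical and are not taken from the corpus.

```python
import re
from collections import Counter

# Assumption: the dialect code is a suffix on the word itself (e.g. word@d).
DIALECT_WORD = re.compile(r"(\S+?)@d(?=$|[\s.,;:!?])")


def dialect_words_by_speaker(lines):
    """Count @d-coded words per speaker, looking only at main tiers."""
    counts = Counter()
    for line in lines:
        if not line.startswith("*"):
            continue  # skip headers (@...) and dependent tiers (%...)
        speaker, _, utterance = line.partition(":")
        counts[speaker.lstrip("*")] += len(DIALECT_WORD.findall(utterance))
    return counts


# Hypothetical sample lines, invented for this sketch.
sample = [
    "@Begin",
    "*MOT:\tmira loh@d perroh@d .",
    "%com:\texample dependent tier",
    "*CHI:\tdoh@d .",
    "*CHI:\tsí .",
]
counts = dialect_words_by_speaker(sample)
```

Comparing the counts for a child with those for the parents and relatives in the same session is one way to check whether a given realization reflects the family's dialect rather than an impairment.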

Ethics approval for this research was granted by the Ethics Committee of the “Reina Sofía” Hospital (Córdoba, Andalucía, Spain). All participants' legal guardians gave their written informed consent for the collection and use of their data for research purposes, provided their anonymity was preserved. Therefore, pseudonyms have been used, and we request the utmost discretion in the use of the video files (approved access level).

Acknowledgements

When using the data for research purposes, the following citation must be included: “The data used in this research/paper are part of the ChromoLang corpus, and have been contributed to TalkBank by Maite Fernández-Urquiza, Mª Salud Jiménez Romero and Antonio Benítez-Burraco thanks to the funding provided by the Spanish Ministry with competences in Science (MINECO) (REF: FFI2016-78034-C2-2-P) within the State Plan for the Promotion of Scientific and Technical Research of Excellence.” We also ask authors to send a copy of their work to fernandezmaite@uniovi.es.