CHILDES English Fernald-Marchman-Bang FMB_book Corpus


Anne Fernald
Department of Psychology
Stanford University

Virginia Marchman
Department of Psychology
Stanford University

Janet Bang
Child Development
San Jose State University

Participants: 45
Type of Study: daylong audio
Location: California
Media type: not available
DOI: xxx

Citation information

Publications using this data should cite:

Tan, A., Read, K., *Gamboa, S., Bang, J.Y., & Marchman, V. (under review). The power of the page: Comparing richness in text and talk during book sharing with two-year old children.

Code and pre-processed data for this publication are available on the Open Science Framework (osf.io/q26jx).

Additional publications presenting data from this sample:

Bang, J.Y., Mora, A., Munévar, M., Fernald, A., & Marchman, V. (under revision). Time to talk: Multiple sources of variability in caregiver verbal engagement during everyday activities in English- and Spanish-speaking families in the U.S. https://psyarxiv.com/6jzwg/

Bang, J.Y., Kachergis, G., Weisleder, A., & Marchman, V. (2023). An automated classifier for periods of sleep and target-child-directed speech from LENA recordings. Language Development Research, 3(1), 211-248. https://doi.org/10.34842/xmrq-er43

Marchman, V. A., Bermúdez, V. N., Bang, J. Y., & Fernald, A. (2020). Off to a good start: Early Spanish‐language processing efficiency supports Spanish‐ and English‐language outcomes at 4½ years in sequential bilinguals. Developmental Science, 23(6), e12973. https://doi.org/10.1111/desc.12973

Fernald, A., Marchman, V. A., & Weisleder, A. (2013). SES differences in language processing skill and vocabulary are evident at 18 months. Developmental Science, 16(2), 234–248. https://doi.org/10.1111/desc.12019

Project Description

The present corpus includes 15 English- and 13 Spanish-speaking families with book sharing interactions that occurred naturally at home. The corpus is a subset of the sample used in Tan et al. (22 English- and 20 Spanish-speaking families).

These book sharing transcripts come from the same daylong recordings as the transcripts labeled ‘book’ in the Fernald home activities corpus, with some minor additions. The Fernald home activities corpus sampled 10-min periods of the day with the most adult talk directed to the child wearing the recorder. For the present book sharing corpus, research assistants focused specifically on the book sharing activities and listened to the daylong recording to capture, to the best of their ability, the ‘true’ start and end of each activity. Thus, some portions of the transcripts duplicate those in the Fernald home activities corpus but include additional information. Although caregivers at times switched to other activities, each transcript includes only the interactions that occurred during book sharing (marked by the “@ch_books” gem). If a participant has more than one transcript, the book sharing interactions occurred at different times of the day or on different days. Additional recruitment, sampling, and participant details are reported elsewhere (Bang et al., under revision; Tan et al., under review).

The present transcripts were used to assess the “Read aloud” speech and the “Spontaneous - book” speech in Tan et al. “Read aloud” speech is denoted using the [+ recit] and the [+ rmix] postcodes. “Spontaneous - book” speech excludes [+ recit], [+ rmix], and any overheard speech [- ohs]. Additional details regarding transcript information, including postcodes, can be found in the documentation for the Fernald home activities corpus.
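The speech categories above can be illustrated with a minimal Python sketch. It is not the procedure used in Tan et al.; it only shows one way to sort main-tier utterances by the codes named in this description. It assumes that each utterance sits on a single line beginning with "*" and that the codes appear literally as written here. The function name and the sample lines are hypothetical and were made up for illustration.

```python
from collections import Counter

# Codes as written in this corpus description.
READ_ALOUD_CODES = ("[+ recit]", "[+ rmix]")
OVERHEARD_CODE = "[- ohs]"


def classify_utterance(line):
    """Classify one transcript line by its codes.

    Returns None for lines that are not main-tier utterances
    (headers starting with "@", dependent tiers starting with "%").
    Assumes one utterance per line, which is a simplification.
    """
    if not line.startswith("*"):
        return None
    if any(code in line for code in READ_ALOUD_CODES):
        return "read_aloud"
    if OVERHEARD_CODE in line:
        return "overheard"
    return "spontaneous_book"


# Hypothetical sample lines, not taken from the corpus.
sample_lines = [
    "@Begin",
    "*MOT:\tonce there was a little dog . [+ recit]",
    "*MOT:\tlook at the doggie !",
    "*MOT:\tthe dog ran home, see ? [+ rmix]",
    "*FAT:\t[- ohs] I will call you later .",
    "%com:\tpage turning",
]

labels = [classify_utterance(line) for line in sample_lines]
counts = Counter(label for label in labels if label is not None)
print(counts)
```

In this sketch, “Spontaneous - book” is whatever remains after read-aloud and overheard utterances are set aside, which mirrors the exclusion rule described above. Utterances that wrap across several lines would need to be joined before classification.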

Please note: Due to potential copyright issues, transcripts of the “Book texts” used in Tan et al. were not included.

Transcripts of the “Spontaneous - Other” speech used in Tan et al. can be found in the Fernald home activities corpus.

Acknowledgements

We are grateful to the children and parents who participated in this research. This work would not have been possible without the collaboration of the undergraduate students and staff of the Language Learning Lab, directed by Dr. Anne Fernald. This work was part of an undergraduate honors thesis by Sophia Gamboa.

We are grateful to the research assistants who helped code and transcribe the data: Jessica Magallón, Nadia Segura, Shriya Anand, Sophia Gamboa, Marisol Rodriguez, Maria Lopez, Stephen Lopez, Jesús Esquivel-Barrientos, Laura Jonsson, Kalpana Gopalkrishnan, Maribel Mercardo, Tami Alade, Jaqueline De Paz-Romero, Lesly Leon, Alice Articia, Julia Briones-Avila, and Elizabeth Sanchez. We greatly appreciate the patience, positivity, and dedication that all of them showed in capturing natural and spontaneous language in everyday interactions with young children.

This work was supported by grants from the National Institutes of Health (R01 HD42235, R01 DC008838, R01 HD092343, 2R01 HD069150), the Schusterman Foundation, the David and Lucile Packard Foundation, the Bezos Family Foundation, and the Stanford Maternal and Child Health Research Institute.

Usage Restrictions

Please only use these transcripts for research/educational purposes.