CHILDES Tagalog GarciaKidd Corpus


Rowena Garcia
Language Development and Multilingualism
ZAS Berlin

Evan Kidd
Languages and Linguistics
The Australian National University

Participants: 20
Type of Study: cross-sectional
Location: Manila
Media type: audio
DOI: doi:10.21415/94EK-BS22

Browsable transcripts

Download transcripts

Link to media folder

Citation information

Articles using this corpus should cite:

Garcia, R., & Kidd, E. (2022). Acquiring verb-argument structure in Tagalog: A multivariate corpus analysis of caregiver and child speech. Linguistics, 60(6), 1855-1906. https://doi.org/10.1515/ling-2021-0107

Additional references and publications which used the MPI Tagalog corpus

Garcia, R., Valdez, M. C., & Boll-Avetisyan, N. (2025). Infants discriminate subtle nasal contrasts late: Evidence from field psycholinguistic experiments on Tagalog-learning infants in the Philippines. Developmental Psychology. Advance online publication. https://doi.org/10.1037/dev0002053

Kidd, E., Hellwig, B., Garcia, R., Defina, R., Davidson, L., & Allen, S. (2025). A comparative study of child-directed language across five cultures based on data from the Acquisition Sketch Project. Australian Journal of Linguistics, 1–25. https://doi.org/10.1080/07268602.2025.2514825

Pizarro-Guevara, J. S., & Garcia, R. (2024). Philippine psycholinguistics. Annual Review of Linguistics, 10, 145-167. https://doi.org/10.1146/annurev-linguistics-031522-102844

Barrios, A., & Garcia, R. (2023). Filipino children’s acquisition of nominal and verbal markers in L1 and L2 Tagalog. Languages, 8(3), 188. https://doi.org/10.3390/languages8030188

Garcia, R., Garrido, G., & Kidd, E. (2021). Developmental effects in the online use of morphosyntactic cues in sentence processing: Evidence from Tagalog. Cognition, 216, 104859. https://doi.org/10.1016/j.cognition.2021.104859

Project information

This corpus contains one-hour recordings of one-on-one interactions between 20 Tagalog-speaking children (2;0–4;0) and their caregivers (total: 20 hours). Families were visited at their homes in Metro Manila in June 2019 and were provided with a set of toys, picture books, and a wordless picture story book, Frog, Where Are You? (Mayer, 1969). The toys included dolls, animal figures, a kitchen set, a doctor set, blocks, cars, miniature furniture, and a magic slate. The picture books contained images of actions between entities of varying animacy, as the original project focused on children’s acquisition of verb marking and thematic role assignment. The researcher encouraged the caregivers to use these materials to engage with the children. Other family members were asked not to join the interactions. No other instructions were given.

Each one-hour session was recorded using a video camera with a microphone. The recordings were transcribed by two native speakers of Tagalog. The transcriptions were originally created in ELAN (archived at the Max Planck Institute for Psycholinguistics’ The Language Archive: https://hdl.handle.net/1839/43d11ac1-c666-4249-92cb-afd88387caaf). These were converted to .cha files via the ELAN2CHAT CLAN command. Other formatting changes were made automatically in R (i.e., removal of “@e”, which indicated echoing or repetition of a word in the original transcripts) and manually, so that the transcripts would pass CLAN’s CHECK function. Personal details mentioned in the transcripts were replaced by “Firstname” and “Lastname” following the CHAT guidelines.
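The automated removal of “@e” described above was done in R, and the original script is not reproduced here. As a rough illustration only, the same kind of clean-up could be sketched in Python as follows. The function name, the regular expression, and the sample utterance are all hypothetical; the sketch assumes the marker is attached directly to the end of a word.

```python
import re

# Assumption: "@e" is suffixed directly to a word (e.g. "aso@e") and is
# followed by whitespace, punctuation, or the end of the line.
# The trailing \b keeps longer suffixes such as "@eng" untouched.
ECHO_MARKER = re.compile(r"(?<=\w)@e\b")


def strip_echo_markers(line: str) -> str:
    """Return the transcript line with word-final "@e" markers removed."""
    return ECHO_MARKER.sub("", line)


# Hypothetical transcript line, not taken from the corpus.
sample = "*CHI:\taso@e aso ."
print(strip_echo_markers(sample))
```

A pattern anchored to the end of a word, rather than a plain string replacement, avoids altering other “@”-suffixed codes that merely begin with the same letter.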

This table provides ages, MLU from 100 utterances, and caregiver information.

Acknowledgements

This work was supported by the Max Planck Society. We are grateful to the families in Bagong Barrio, Caloocan City, who welcomed us into their homes and allowed us to collect and share this corpus data set. We thank Brian MacWhinney and Andrew Jessop for helping us convert the transcripts into a CHILDES-ready format.