Michael Brent, Washington University
Sharon Goldwater, The University of Edinburgh
Nan Bernstein Ratner, University of Maryland
If using these corpora in published materials, please use the following citations.
CHILDES database:
MacWhinney, B., & Snow, C. (1985). The child language data exchange system. Journal of Child Language, 12, 271-296.
Bernstein-Ratner corpus (original source of data):
Bernstein-Ratner, N. (1987). The phonology of parent-child speech. In K. Nelson & A. van Kleeck (Eds.), Children's Language (Vol. 6, pp. 159-174). Hillsdale, NJ: Erlbaum.
Brent version of BR corpus:
Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition, 61, 93-125.
This version of the Brown corpus has been parsed and labelled for semantic roles. These files were contributed by Sharon Goldwater in 2008 and are the corpora used in the word segmentation papers by Goldwater et al.
Three files are included here. They were originally obtained from Michael Brent and are redistributed with his permission. This .zip file contains all three files.