Michael Brent, Washington University
Sharon Goldwater, The University of Edinburgh
Nan Bernstein Ratner, University of Maryland
If using these corpora in published materials, please use the following citations.
CHILDES database:
MacWhinney, B., & Snow, C. (1985). The child language data
exchange system. Journal of Child Language, 12, 271-296.
Bernstein-Ratner corpus (original source of data):
Bernstein-Ratner, N. (1987). The phonology of parent-child
speech. In K. Nelson and A. van Kleeck (Eds.), Children's
Language (Vol. 6, 159-174). Erlbaum, Hillsdale, NJ.
Brent version of BR corpus:
Brent, M. R., & Cartwright, T. A. (1996). Distributional regularity and
phonotactic constraints are useful for segmentation. Cognition, 61, 93-125.
This version of the Brown corpus has been parsed and labelled for semantic roles. These files, contributed by Sharon Goldwater in 2008, contain corpora used in the Goldwater et al. word segmentation papers.
Three files are included here, originally obtained from Michael Brent and redistributed with his permission. This .zip file contains all three files.