| Richard Weist, Department of Psychology, SUNY Fredonia, weist@a12t.cc.fredonia.edu |
| Gaja Jarosz, Department of Linguistics, UMass Amherst, jarosz@linguist.umass.edu |
| Participants: | 4 |
| Type of Study: | naturalistic, longitudinal |
| Location: | Poland |
| Media type: | audio |
| DOI: | doi:10.21415/7TYG-KF32 |
Weist, Richard, & Witkowska-Stadnik, Katarzyna. (1986). Basic relations in child language and the word order myth. International Journal of Psychology, 21, 363–381.
Weist, Richard, Wysocka, Hanna, Witkowska-Stadnik, Katarzyna, Buczowska, Ewa, & Konieczna, Emilia (1984). The defective tense hypothesis: On the emergence of tense and aspect in child Polish. Journal of Child Language, 11, 347–374.
Jarosz, Gaja. (2010). Implicational markedness and frequency in constraint-based computational models of phonological learning. Journal of Child Language, 37(3) (Special Issue on Computational Models of Child Language Learning), 565–606. Cambridge University Press.
Jarosz, Gaja, Calamaro, Shira, & Zentz, Jason. (2017). Input frequency and inductive bias in the acquisition of syllable structure in Polish. Manuscript, Linguistics Department, Yale University.
In accordance with TalkBank rules, any use of data from this corpus must be accompanied by at least one of the above references. For this corpus, it would be good to cite one reference from Weist and one from Jarosz.
| Participant Name | Age Range | Sessions | Sex |
| Bartosz | 1;7-1;11 | 6 | M |
| Kubuś | 2;1-2;6 | 7 | M |
| Marta | 1;7-1;10 | 6 (3 audio) | F |
| Wawrzon | 2;2-3;2 | 20 (19 audio) | M |
All of the children came from middle-class families and were raised in the urban environment of Poznań, Poland. In general, their parents were highly educated. The children were recorded in their homes (typically an apartment) by two experimenters. One experimenter carried a small bag containing the tape recorder, and the other took context notes, which were integrated during transcription.
The children’s productions were transcribed using broad phonetic transcription with the help of the open-source Phon software (Rose et al. 2006). The orthographic transcripts were used as the basis for creating phonetic transcriptions of the children’s target pronunciations, and the audio recordings were used to phonetically transcribe the children’s actual productions and align them with the target transcriptions word by word. The transcription of all child productions was first performed independently by two transcribers trained in phonetic transcription, at least one of whom was a native speaker of Polish. Then, two Polish speakers trained in phonetic transcription worked together to create a consensus transcription of all productions, relying on a third phonetically trained native speaker of Polish to adjudicate in cases when agreement could not be reached. The resulting corpus includes phonetic transcriptions of the children’s productions in all the available audio files, providing word-by-word alignment of target pronunciations and actual pronunciations.
Boundaries: We use word groups to delineate phonological word boundaries. In all cases except one, orthographic word boundaries correspond to phonological word boundaries. The only exception involves the proclitics 'w' [v]/[f] and 'z' [z]/[s], which attach to the following word and cannot be pronounced independently. In this case, the orthography tier encodes the orthographic word boundaries, putting the proclitic in its own word group, while the IPA Target and IPA Actual tiers leave that group empty and encode the proclitic together with the next word. So, for example, '[z][kotem]' would be '[][skotɛm]' on the Target tier and potentially something like '[][sotɛm]' on the Actual tier.
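The proclitic convention can be illustrated with a short Python sketch. It assumes the bracketed notation from the example above as input, which is only the notation used in this description and not Phon's actual file format. The function names `parse_groups` and `phonological_words` are hypothetical helpers introduced for illustration.

```python
import re

# Proclitics that get their own orthographic word group but are transcribed
# together with the following word on the IPA tiers (per the convention above).
PROCLITICS = {"w", "z"}


def parse_groups(tier: str) -> list[str]:
    """Split a bracketed tier string such as '[z][kotem]' into word groups."""
    return re.findall(r"\[([^\]]*)\]", tier)


def phonological_words(orthography: str, target: str, actual: str):
    """Pair orthographic words with their Target and Actual transcriptions.

    A proclitic whose Target group is empty is folded into the next word,
    so each returned triple corresponds to one phonological word.
    """
    ortho = parse_groups(orthography)
    tgt = parse_groups(target)
    act = parse_groups(actual)
    if not (len(ortho) == len(tgt) == len(act)):
        raise ValueError("tiers must contain the same number of word groups")

    words = []
    pending = []  # proclitics waiting for their host word
    for o, t, a in zip(ortho, tgt, act):
        if o.lower() in PROCLITICS and t == "":
            pending.append(o)
            continue
        words.append((" ".join(pending + [o]), t, a))
        pending = []
    if pending:
        raise ValueError("proclitic without a following host word")
    return words


result = phonological_words("[z][kotem]", "[][skotɛm]", "[][sotɛm]")
print(result)
```

Treating the empty IPA group as the signal for a proclitic, rather than the orthography alone, is a design choice of this sketch: it keeps the function from merging a 'w' or 'z' group that happens to carry its own transcription.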
Tier Conventions: We have maintained many of the conventions from the original CHAT transcripts and introduced several codes to denote special situations regarding phonetic transcription.
The following codes were used on the orthography tier:
IPA Conventions