Sakura Corpus

Susanne Miyata
Department of Medical Sciences
Aichi Shukutoku University


Participants: 31
Type of Study: xxx
Location: Japan
Media type: audio
DOI: doi:10.21415/T5M90R
Citation information

Project Description

This corpus of 18 conversations is the product of six graduation theses on gender differences in students' group talk. Each conversation lasted between 12 and 35 minutes (average: 25 minutes), for a total recording time of 7 hours and 30 minutes. Thirty-one students (19 female, 12 male) participated in the study (Table 1). The participants met in groups of four, either same-sex or mixed-sex (6 conversations among 4 female students, 6 among 4 male students, and 6 among 2 male and 2 female students), and were grouped according to age (first- and third-year students) and affiliation (two academic departments). In addition, the participants in each conversation came from the same small class and were well acquainted with one another.

When recruited, the participants were informed that their conversations might be video-recorded and transcribed for possible publication. Permission was requested again after transcription in cases where private information had been disclosed, or where a misunderstanding about the nature and extent of the publication had become apparent during the conversation.

The recordings took place in a small conference room at the university between or after lectures. The participants were given a card with a conversation topic to start with, but were free to move on to other topics (topic 1 "What do you expect from an opposite sex friend?" [isee ni motomeru koto]; topic 2 "Are you a dog lover or a cat lover?" [inuha ka nekoha ka]; topic 3 "About part-time work" [arubaito ni tsuite]). The investigator was not present during the recording. The combination of participants, the topic, and the duration of each of the 18 conversations are given in Table 2.

The participants produced 15,449 utterances overall (female: 8,027 utterances, male: 7,422 utterances). All utterances were linked to video and transcribed in regular Japanese orthography and Latin script (Wakachi2002), and provided with morphological tags (JMOR04.1). Proper names were replaced by pseudonyms.

Table 1: List of participants, sex, age, and their quantity of appearances
ID Age Sex # ID Age Sex #

Table 2: Specifications of the 18 conversations
File      Participants     Sex  Topic  Duration
sakura01  G3M H3M K3F L3F  MF   1      26'00"
sakura02  A1F B1F C1M B1M  MF   1      35'30"
sakura03  H3F I3F J3M K3M  MF   2      11'45"
sakura04  H1F B1F C1M E1M  MF   2      26'25"
sakura05  I3M G3M L3F E3F  MF   3      27'20"
sakura06  G1F F1F E1M D1M  MF   3      26'00"
sakura07  D3F F3F L3F E3F  FF   1      25'00"
sakura08  E1F B1F C1F D1F  FF   1      28'25"
sakura09  E1F B1F A1F D1F  FF   2      27'00"
sakura10  A3F B3F C3F G3F  FF   2      25'15"
sakura11  A1F B1F C1F D1F  FF   3      25'25"
sakura12  A3F B3F C3F G3F  FF   3      23'55"
sakura13  G3M H3M I3M J3M  MM   1      21'20"
sakura14  B1M A1M E1M C1M  MM   1      30'00"
sakura15  I3M H3M G3M M3M  MM   2      25'50"
sakura16  E1M A1M C1M D1M  MM   2      23'30"
sakura17  L3M G3M I3M H3M  MM   3      26'45"
sakura18  A1M B1M C1M D1M  MM   3      16'50"
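
The overall recording time and average duration quoted in the project description can be checked against the Duration column of Table 2. The following is a minimal Python sketch, not part of the corpus tooling: the durations are copied by hand from Table 2 in file order, and the names `DURATIONS` and `to_seconds` are hypothetical helpers introduced only for this illustration.

```python
import re

# Durations from Table 2 in file order (sakura01 .. sakura18),
# written as minutes'seconds" exactly as in the table.
DURATIONS = '''
26'00" 35'30" 11'45" 26'25" 27'20" 26'00"
25'00" 28'25" 27'00" 25'15" 25'25" 23'55"
21'20" 30'00" 25'50" 23'30" 26'45" 16'50"
'''.split()


def to_seconds(duration: str) -> int:
    """Convert a minutes'seconds" string (hypothetical helper) to seconds."""
    match = re.fullmatch(r"(\d+)'(\d{2})\"", duration)
    if match is None:
        raise ValueError(f"unrecognized duration: {duration!r}")
    minutes, seconds = map(int, match.groups())
    return minutes * 60 + seconds


total = sum(to_seconds(d) for d in DURATIONS)
mean = total / len(DURATIONS)

hours, remainder = divmod(total, 3600)
minutes, seconds = divmod(remainder, 60)
print(f"total: {hours} h {minutes} min {seconds} s")
print(f"mean:  {mean / 60:.1f} min per conversation")
```

Under these assumptions the table sums to 7 h 32 min 15 s, with a mean of about 25.1 minutes per conversation, which agrees with the rounded figures given in the project description.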

Additional contributors: Banno, Kyoko; Konishi, Saya; Matsui, Ayumi; Matsumoto, Shiori; Oogi, Rie; Takahashi, Akane; Muraki, Kyoko.