MotorBank Portuguese AVFAD Corpus


Luis Jesus
Institute of Electronics and Informatics
University of Aveiro

Participants: 709
Type of Study: voice assessment
Location: Portugal
Media type: audio
DOI:

To facilitate downloading, the database is split into five .zip files. Each .zip file contains audio data for the 11 sample types from about 150 participants, as described in the AVFAD.xlsx Excel file, which also contains acoustic parameter information.

Media folders

A-C

D-L

M

N-Z

silence

Citation information

In accordance with TalkBank rules, any use of data from this corpus must cite this reference:

Publications resulting from this project:

Project Description

A new open-access resource, the Advanced Voice Function Assessment Databases (AVFAD), was developed from a sample of 709 individuals (346 clinically diagnosed with vocal pathology and 363 with no vocal alterations) recruited in Portugal. All clinical conditions were registered according to the Classification Manual of Voice Disorders-I. Participants were audio-recorded while producing the following vocal tasks: sustaining the vowels /a, i, u/; reading six CAPE-V sentences; reading a phonetically balanced text; and spontaneous speech.

The AVFAD comprise 8648 uncompressed audio files and an additional database file with 19 Praat Voice Report parameter values and 16 clinical data entries per participant. Praat annotated files are also distributed with the database; they mark the segment of the vowel /a/ that was used to run an automatic acoustic analysis with a Praat script (ProcessVoiceReport_01_00_00.psc). Radial graphs were generated using the Excel file RadialGraphs.xlsx, assuming that all variables had an approximately normal distribution and using previously calculated average and standard deviation values for all parameters.
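The description above does not say how RadialGraphs.xlsx combines the average and standard deviation values. One common approach under a normality assumption is to standardize each parameter before plotting, and the following minimal Python sketch illustrates that idea only. The function name and the numbers in the usage lines are hypothetical and are not taken from AVFAD.xlsx or RadialGraphs.xlsx.

```python
def z_score(value: float, mean: float, sd: float) -> float:
    """Express a parameter value as a number of standard deviations from a mean.

    Assumption: the parameter is approximately normally distributed,
    so the standardized value is comparable across parameters.
    """
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / sd


# Hypothetical values for illustration only (not AVFAD data).
example = z_score(value=1.5, mean=1.0, sd=0.25)
print(example)
```

Standardized values of this kind put parameters with different units on a common scale, which is what makes it possible to draw them on the shared axes of a radial graph.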

The AVFAD will allow future cooperative work and testing of non-invasive methods that aid voice pathology diagnosis. Each speaker directory includes (at least) the following files:

  1. ZZZ001.wav [i] 3 repetitions (3-5 seconds duration each)
  2. ZZZ002.wav [a] 3 repetitions (3-5 seconds duration each)
  3. ZZZ003.wav [u] 3 repetitions (3-5 seconds duration each)
  4. ZZZ004.wav A Marta e o avô vivem naquele casarão rosa velho (CAPE-V) 3 repetitions
  5. ZZZ005.wav Sofia saiu cedo da sala (CAPE-V) 3 repetitions
  6. ZZZ006.wav A asa do avião andava avariada (CAPE-V) 3 repetitions
  7. ZZZ007.wav Agora é hora de acabar (CAPE-V) 3 repetitions
  8. ZZZ008.wav A minha mãe mandou-me embora (CAPE-V) 3 repetitions
  9. ZZZ009.wav O Tiago comeu quatro peras (CAPE-V) 3 repetitions
  10. ZZZ010.wav O Vento Norte e o Sol (The North Wind and the Sun)
  11. ZZZ011.wav Spontaneous speech (at least 20 seconds)
  12. ZZZ002.prt Praat annotated binary [audio+annotation] file
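The naming scheme in the list above can be turned into a small lookup for scripted processing. The following Python sketch assumes that "ZZZ" stands for a per-speaker prefix and that the last three characters of the file stem identify the task; both points are inferred from the list, not stated in the corpus documentation. The function name and the example file names are hypothetical.

```python
from pathlib import Path

# Task labels keyed by the three-digit suffix shown in the file list.
TASKS = {
    "001": "sustained vowel [i]",
    "002": "sustained vowel [a]",
    "003": "sustained vowel [u]",
    "004": "CAPE-V sentence 1",
    "005": "CAPE-V sentence 2",
    "006": "CAPE-V sentence 3",
    "007": "CAPE-V sentence 4",
    "008": "CAPE-V sentence 5",
    "009": "CAPE-V sentence 6",
    "010": "read text (O Vento Norte e o Sol)",
    "011": "spontaneous speech",
}


def describe_recording(filename: str) -> str:
    """Return a task label for a file name, based on its three-digit suffix."""
    path = Path(filename)
    task = TASKS.get(path.stem[-3:])
    if task is None:
        # Some directories contain additional files outside this scheme.
        return "unknown or additional file"
    if path.suffix.lower() == ".prt":
        return "Praat annotation for " + task
    return task


# Hypothetical file names for illustration only.
print(describe_recording("ABC002.wav"))
print(describe_recording("ABC002.prt"))
```

Files that do not match the scheme, such as the additional files present in some directories, are reported as unknown instead of raising an error.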

Some directories include additional files produced for the work reported in Jesus, L., Castilho, S., & Hall, A. (2015).

We also include 60-70 s of "silence" (background noise) recorded in the same room, just after the other audio recordings. These files are named using the following format: Visit_Date_Visit_Place_silence_60s.wav

University of Aveiro's Health Assessment Tools are distributed using a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.