TalkBank Database Versioning

To make analyses reproducible, researchers need access to the specific versions of the corpora they used as input. For maximal accuracy, published articles should therefore report the date on which the data were retrieved from the web.

To support version retrieval, we have tracked all TalkBank databases with Git since March 2017. Each Git update is indexed by date. If you need a particular version of a corpus or of a complete database to reproduce a result, please send an email to macw@cmu.edu giving the date of the version you need, and we will send you the relevant corpus as a .zip file.
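
As an illustration of what date-indexed retrieval means in Git terms, the Python sketch below finds the last commit made on or before a given date in a local Git repository. It is not TalkBank's own tooling: the function names, the end-of-day cutoff, and the premise that you have a local Git clone of your data are assumptions made for the example. The only Git feature it relies on is the standard `git rev-list` command with its `--before` option.

```python
import subprocess
from datetime import date
from typing import List


def snapshot_command(day: str, ref: str = "HEAD") -> List[str]:
    """Build the git command that finds the last commit on or before `day`.

    `day` must be an ISO date such as "2019-06-30" (hypothetical example).
    """
    parsed = date.fromisoformat(day)  # raises ValueError on a malformed date
    # Assumption: "on that date" means up to the end of the day, in the
    # local time zone of the machine running git.
    cutoff = f"{parsed.isoformat()} 23:59:59"
    return ["git", "rev-list", "-1", f"--before={cutoff}", ref]


def snapshot_commit(repo_dir: str, day: str) -> str:
    """Return the matching commit hash, or "" if no commit is that old."""
    result = subprocess.run(
        snapshot_command(day),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git itself fails
    )
    return result.stdout.strip()
```

Given a hash returned by `snapshot_commit`, running `git checkout <hash>` in the same repository restores the files as they stood at that commit. Note that `--before` filters on the commit date, so the result reflects when a change was recorded, not when the underlying recording or transcript was made.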

TalkBankDB also now lets you specify an earlier date for a corpus or a search and then retrieve the data as they stood on that date.