TalkBank MOR and UD Grammars

We are currently transitioning the TalkBank system for morphosyntactic analysis from the MOR/POST/MEGRASP system to the UD (Universal Dependencies) system, which is described in detail here. We apply UD taggers to TalkBank files using Chris Manning's Stanza system, which has been built into the Batchalign program created by Houjun Liu.

The great advantage of UD over MOR is that it is available for many more languages. It also seems to perform as well as or better than MOR for computing dependency relations on the %gra line. However, its morphological analysis on the %mor line is not yet as good as MOR's. So, for English and a few other languages, we will continue to use MOR for this purpose until we have harmonized its codes with UD.
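For readers who want to inspect the dependency relations on the %gra line, the following is a minimal Python sketch, not part of MOR, Batchalign, or CLAN. It assumes the CHAT convention that each %gra item is an index|head|relation triple separated by whitespace. The function name parse_gra_line, the GraRelation type, and the example tier (including its relation labels and the use of head 0 for the root) are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class GraRelation(NamedTuple):
    """One dependency relation from a %gra tier (illustrative type)."""
    index: int      # position of the word in the utterance
    head: int       # position of the word it depends on
    relation: str   # relation label, e.g. a UD relation name


def parse_gra_line(line: str) -> List[GraRelation]:
    """Split a %gra tier into (index, head, relation) triples.

    Assumes each item has the form index|head|relation.
    """
    # Drop the tier label if it is present.
    if line.startswith("%gra:"):
        line = line[len("%gra:"):]
    relations = []
    for item in line.split():
        index, head, relation = item.split("|")
        relations.append(GraRelation(int(index), int(head), relation))
    return relations


# Hypothetical example tier, invented for this sketch.
example = "%gra:\t1|2|NSUBJ 2|0|ROOT 3|2|PUNCT"
for rel in parse_gra_line(example):
    print(rel.index, rel.head, rel.relation)
```

Because both MOR/MEGRASP and UD output are written to the same %gra tier, a parser of this kind can be used to compare the relations that each system assigns to the same utterance.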

As of September 2023, we have tagged these languages in CHILDES using UD: Afrikaans, Dutch, German, Icelandic, Italian, Norwegian, Portuguese, Romanian, and Swedish. We hope eventually to apply UD through Batchalign to all languages in CHILDES. Until that work is completed, users may wish to continue using MOR through these grammars:

  • Cantonese (yue): This grammar was built by Brian MacWhinney, Sam Po Law, and Anthony Kong with additional help from a Cantonese-English lexicon provided by K. K. Luke.
  • Chinese (zho): This grammar was built by Brian MacWhinney and Twila Tardif. Thanks to K. J. Chen and the CKIP Group of the Academia Sinica for the input lexicon.
  • English (eng): This grammar was built by Brian MacWhinney and Mitzi Morris.
  • Hebrew (heb): This grammar was developed by Aviad Albert, Bracha Nir, Shuly Wintner, Brian MacWhinney, and Ruth Berman.
  • Japanese (jpn): This grammar was constructed by Norio Naka and Susanne Miyata. The distribution includes the Wakachi system from Susanne Miyata for grammatical reference.
  • Spanish (spa): This grammar was built by Brian MacWhinney.