Showing 1–2 of 2 results for author: Kovačić, B

  1. arXiv:2403.10293  [pdf, other]

    cs.CL

    MaiBaam: A Multi-Dialectal Bavarian Universal Dependency Treebank

    Authors: Verena Blaschke, Barbara Kovačić, Siyao Peng, Hinrich Schütze, Barbara Plank

    Abstract: Despite the success of the Universal Dependencies (UD) project exemplified by its impressive language breadth, there is still a lack in 'within-language breadth': most treebanks focus on standard languages. Even for German, the language with the most annotations in UD, so far no treebank exists for one of its language varieties spoken by over 10M people: Bavarian. To contribute to closing this gap…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024

  2. arXiv:2403.05902  [pdf, other]

    cs.CL

    MaiBaam Annotation Guidelines

    Authors: Verena Blaschke, Barbara Kovačić, Siyao Peng, Barbara Plank

    Abstract: This document provides the annotation guidelines for MaiBaam, a Bavarian corpus annotated with part-of-speech (POS) tags and syntactic dependencies. MaiBaam belongs to the Universal Dependencies (UD) project, and our annotations elaborate on the general and German UD version 2 guidelines. In this document, we detail how to preprocess and tokenize Bavarian data, provide an overview of the POS tags…

    Submitted 9 March, 2024; originally announced March 2024.