Showing 1–3 of 3 results for author: Paonessa, C

  1. arXiv:2310.09088

    cs.CL cs.AI

    Dialect Transfer for Swiss German Speech Translation

    Authors: Claudio Paonessa, Yanick Schraner, Jan Deriu, Manuela Hürlimann, Manfred Vogel, Mark Cieliebak

    Abstract: This paper investigates the challenges in building Swiss German speech translation systems, specifically focusing on the impact of dialect diversity and differences between Swiss German and Standard German. Swiss German is a spoken language with no formal writing system; it comprises many diverse dialects and is a low-resource language with only around 5 million speakers. The study is guided by tw…

    Submitted 13 October, 2023; originally announced October 2023.

  2. arXiv:2305.18855

    cs.CL cs.AI

    STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions

    Authors: Michel Plüss, Jan Deriu, Yanick Schraner, Claudio Paonessa, Julia Hartmann, Larissa Schmidt, Christian Scheller, Manuela Hürlimann, Tanja Samardžić, Manfred Vogel, Mark Cieliebak

    Abstract: We present STT4SG-350 (Speech-to-Text for Swiss German), a corpus of Swiss German speech, annotated with Standard German text at the sentence level. The data is collected using a web app in which the speakers are shown Standard German sentences, which they translate to Swiss German and record. We make the corpus publicly available. It contains 343 hours of speech from all dialect regions and is th…

    Submitted 30 May, 2023; originally announced May 2023.

  3. arXiv:2305.12918

    cs.CL cs.AI

    Improving Metrics for Speech Translation

    Authors: Claudio Paonessa, Dominik Frefel, Manfred Vogel

    Abstract: We introduce Parallel Paraphrasing ($\text{Para}_\text{both}$), an augmentation method for translation metrics making use of automatic paraphrasing of both the reference and hypothesis. This method counteracts the typically misleading results of speech translation metrics such as WER, CER, and BLEU if only a single reference is available. We introduce two new datasets explicitly created to measure…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Preprint, SwissText 2023