Showing 1–2 of 2 results for author: Virkkunen, A

  1. arXiv:2203.14876

    cs.CL cs.SD eess.AS

    Finnish Parliament ASR corpus - Analysis, benchmarks and statistics

    Authors: Anja Virkkunen, Aku Rouhe, Nhan Phan, Mikko Kurimo

    Abstract: Public sources like parliament meeting recordings and transcripts provide ever-growing material for the training and evaluation of automatic speech recognition (ASR) systems. In this paper, we publish and analyse the Finnish parliament ASR corpus, the largest publicly available collection of manually transcribed speech data for Finnish with over 3000 hours of speech and 449 speakers for which it p…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: Submitted to Language Resources and Evaluation

  2. arXiv:2203.12906

    cs.CL eess.AS

    Lahjoita puhetta -- a large-scale corpus of spoken Finnish with some benchmarks

    Authors: Anssi Moisio, Dejan Porjazovski, Aku Rouhe, Yaroslav Getman, Anja Virkkunen, Tamás Grósz, Krister Lindén, Mikko Kurimo

    Abstract: The Donate Speech campaign has so far succeeded in gathering approximately 3600 hours of ordinary, colloquial Finnish speech into the Lahjoita puhetta (Donate Speech) corpus. The corpus includes over twenty thousand speakers from all the regions of Finland and from all age brackets. The primary goals of the collection were to create a representative, large-scale resource to study spontaneous spoke…

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: Submitted to Language Resources and Evaluation