Showing 1–2 of 2 results for author: Moëll, B

  1. arXiv:2405.13379  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    You don't understand me!: Comparing ASR results for L1 and L2 speakers of Swedish

    Authors: Ronald Cumbal, Birger Moell, Jose Lopes, Olof Engwall

    Abstract: The performance of Automatic Speech Recognition (ASR) systems has constantly increased in state-of-the-art development. However, performance tends to decrease considerably in more challenging conditions (e.g., background noise, multiple speaker social conversations) and with more atypical speakers (e.g., children, non-native speakers or people with speech disorders), which signifies that general i…

    Submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2404.19622  [pdf, other]

    cs.HC cs.CV cs.GR cs.SD eess.AS

    Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis

    Authors: Shivam Mehta, Anna Deichler, Jim O'Regan, Birger Moëll, Jonas Beskow, Gustav Eje Henter, Simon Alexanderson

    Abstract: Although humans engaged in face-to-face conversation simultaneously communicate both verbally and non-verbally, methods for joint and unified synthesis of speech audio and co-speech 3D gesture motion from text are a new and emerging field. These technologies hold great promise for more human-like, efficient, expressive, and robust synthetic communication, but are currently held back by the lack of…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 13+1 pages, 2 figures, accepted at the Human Motion Generation workshop (HuMoGen) at CVPR 2024

    MSC Class: 68T07 (Primary); 68T42 (Secondary)

    ACM Class: I.2.7; I.2.6; H.5