Showing 1–5 of 5 results for author: Vieting, P

  1. arXiv:2309.08454  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Mixture Encoder Supporting Continuous Speech Separation for Meeting Recognition

    Authors: Peter Vieting, Simon Berger, Thilo von Neumann, Christoph Boeddeker, Ralf Schlüter, Reinhold Haeb-Umbach

    Abstract: Many real-life applications of automatic speech recognition (ASR) require processing of overlapped speech. A common method involves first separating the speech into overlap-free streams and then performing ASR on the resulting signals. Recently, the inclusion of a mixture encoder in the ASR model has been proposed. This mixture encoder leverages the original overlapped speech to mitigate the effect…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  2. arXiv:2308.04286  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Comparative Analysis of the wav2vec 2.0 Feature Extractor

    Authors: Peter Vieting, Ralf Schlüter, Hermann Ney

    Abstract: Automatic speech recognition (ASR) systems typically use handcrafted feature extraction pipelines. To avoid their inherent information loss and to achieve more consistent modeling from speech to transcribed text, neural raw waveform feature extractors (FEs) are an appealing approach. Also the wav2vec 2.0 model, which has recently gained large popularity, uses a convolutional FE which operates dire…

    Submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted at ITG 2023

  3. arXiv:2210.15445  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Efficient Utilization of Large Pre-Trained Models for Low Resource ASR

    Authors: Peter Vieting, Christoph Lüscher, Julian Dierkes, Ralf Schlüter, Hermann Ney

    Abstract: Unsupervised representation learning has recently helped automatic speech recognition (ASR) to tackle tasks with limited labeled data. Following this, hardware limitations and applications give rise to the question how to take advantage of large pre-trained models efficiently and reduce their complexity. In this work, we study a challenging low resource conversational telephony speech corpus from…

    Submitted 17 August, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted at ICASSP SASB 2023

  4. arXiv:2210.13397  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech

    Authors: Christoph Lüscher, Mohammad Zeineldeen, Zijian Yang, Tina Raissi, Peter Vieting, Khai Le-Duc, Weiyue Wang, Ralf Schlüter, Hermann Ney

    Abstract: Language barriers present a great challenge in our increasingly connected and global world. Especially within the medical domain, e.g. hospital or emergency room, communication difficulties and delays may lead to malpractice and non-optimal patient care. In the HYKIST project, we consider patient-physician communication, more specifically between a German-speaking physician and an Arabic- or Vietn…

    Submitted 22 September, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: ASR System Paper for HYKIST project

  5. arXiv:2104.04298  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    On Architectures and Training for Raw Waveform Feature Extraction in ASR

    Authors: Peter Vieting, Christoph Lüscher, Wilfried Michel, Ralf Schlüter, Hermann Ney

    Abstract: With the success of neural network based modeling in automatic speech recognition (ASR), many studies investigated acoustic modeling and learning of feature extractors directly based on the raw waveform. Recently, one line of research has focused on unsupervised pre-training of feature extractors on audio-only data to improve downstream ASR performance. In this work, we investigate the usefulness…

    Submitted 5 October, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: Accepted for ASRU 2021