Showing 1–4 of 4 results for author: Watzel, T

  1. arXiv:2104.01471  [pdf, other]

    eess.AS

    Adversarial Joint Training with Self-Attention Mechanism for Robust End-to-End Speech Recognition

    Authors: Lujun Li, Yikai Kang, Yuchen Shi, Ludwig Kürzinger, Tobias Watzel, Gerhard Rigoll

    Abstract: Lately, the self-attention mechanism has marked a new milestone in the field of automatic speech recognition (ASR). Nevertheless, its performance is susceptible to environmental intrusions as the system predicts the next output symbol depending on the full input sequence and the previous predictions. Inspired by the extensive applications of the generative adversarial networks (GANs) in speech enh…

    Submitted 3 April, 2021; originally announced April 2021.

  2. arXiv:2007.10723  [pdf, ps, other]

    eess.AS cs.SD

    Audio Adversarial Examples for Robust Hybrid CTC/Attention Speech Recognition

    Authors: Ludwig Kürzinger, Edgar Ricardo Chavez Rosas, Lujun Li, Tobias Watzel, Gerhard Rigoll

    Abstract: Recent advances in Automatic Speech Recognition (ASR) demonstrated how end-to-end systems are able to achieve state-of-the-art performance. There is a trend towards deeper neural networks, however those ASR models are also more complex and prone against specially crafted noisy data. Those Audio Adversarial Examples (AAE) were previously demonstrated on ASR systems that use Connectionist Temporal C…

    Submitted 21 July, 2020; originally announced July 2020.

    Comments: To be published at SPECOM 2020

  3. CTC-Segmentation of Large Corpora for German End-to-end Speech Recognition

    Authors: Ludwig Kürzinger, Dominik Winkelbauer, Lujun Li, Tobias Watzel, Gerhard Rigoll

    Abstract: Recent end-to-end Automatic Speech Recognition (ASR) systems demonstrated the ability to outperform conventional hybrid DNN/HMM ASR. Aside from architectural improvements in those systems, those models grew in terms of depth, parameters and model capacity. However, these models also require more training data to achieve comparable performance. In this work, we combine freely available corpora f…

    Submitted 5 October, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

    Comments: Published at SPECOM 2020

    Journal ref: Speech and Computer (2020)

  4. arXiv:2006.08506  [pdf, ps, other]

    eess.AS cs.CL

    Regularized Forward-Backward Decoder for Attention Models

    Authors: Tobias Watzel, Ludwig Kürzinger, Lujun Li, Gerhard Rigoll

    Abstract: Nowadays, attention models are one of the popular candidates for speech recognition. So far, many studies mainly focus on the encoder structure or the attention module to enhance the performance of these models. However, mostly ignore the decoder. In this paper, we propose a novel regularization technique incorporating a second decoder during the training phase. This decoder is optimized on time-r…

    Submitted 28 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.