Skip to main content

Showing 1–1 of 1 results for author: Shabalin, E

Searching in archive eess. Search in all archives.
.
  1. arXiv:2302.14036  [pdf, other

    eess.AS cs.CL cs.LG cs.SD

    Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator

    Authors: Vladimir Bataev, Roman Korostik, Evgeny Shabalin, Vitaly Lavrukhin, Boris Ginsburg

    Abstract: We propose an end-to-end Automatic Speech Recognition (ASR) system that can be trained on transcribed speech data, text-only data, or a mixture of both. The proposed model uses an integrated auxiliary block for text-based training. This block combines a non-autoregressive multi-speaker text-to-mel-spectrogram generator with a GAN-based enhancer to improve the spectrogram quality. The proposed syst… ▽ More

    Submitted 16 August, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: Accepted to INTERSPEECH 2023