Showing 1–2 of 2 results for author: Ilin, A

Searching in archive eess.
  1. arXiv:2104.02526  [pdf, ps, other]

    eess.AS cs.CL cs.LG

    LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring

    Authors: Anton Mitrofanov, Mariya Korenevskaya, Ivan Podluzhny, Yuri Khokhlov, Aleksandr Laptev, Andrei Andrusenko, Aleksei Ilin, Maxim Korenevsky, Ivan Medennikov, Aleksei Romanenko

    Abstract: Neural network-based language models are commonly used in rescoring approaches to improve the quality of modern automatic speech recognition (ASR) systems. Most of the existing methods are computationally expensive since they use autoregressive language models. We propose a novel rescoring approach, which processes the entire lattice in a single call to the model. The key feature of our rescoring…

    Submitted 6 April, 2021; originally announced April 2021.

    Comments: Submitted to InterSpeech 2021

  2. arXiv:2004.13764  [pdf, other]

    eess.AS cs.SD

    Conditional Spoken Digit Generation with StyleGAN

    Authors: Kasperi Palkama, Lauri Juvela, Alexander Ilin

    Abstract: This paper adapts a StyleGAN model for speech generation with minimal or no conditioning on text. StyleGAN is a multi-scale convolutional GAN capable of hierarchically capturing data structure and latent variation on multiple spatial (or temporal) levels. The model has previously achieved impressive results on facial image generation, and it is appealing to audio applications due to similar multi-…

    Submitted 15 September, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

    Comments: Interspeech2020 accepted version