Showing 1–1 of 1 results for author: Tatanov, O

  1. arXiv:2110.03584  [pdf, other]

    eess.AS

    Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings

    Authors: Oktai Tatanov, Stanislav Beliaev, Boris Ginsburg

    Abstract: This paper describes Mixer-TTS, a non-autoregressive model for mel-spectrogram generation. The model is based on the MLP-Mixer architecture adapted for speech synthesis. The basic Mixer-TTS contains pitch and duration predictors, with the latter being trained with an unsupervised TTS alignment framework. Alongside the basic model, we propose the extended version which additionally uses token embed…

    Submitted 22 October, 2021; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: Preprint. Submitted to ICASSP-22