Showing 1–4 of 4 results for author: Büthe, J

  1. arXiv:2309.14521  [pdf, other]

    eess.AS cs.SD

    NoLACE: Improving Low-Complexity Speech Codec Enhancement Through Adaptive Temporal Shaping

    Authors: Jan Büthe, Ahmed Mustafa, Jean-Marc Valin, Karim Helwani, Michael M. Goodwin

    Abstract: Speech codec enhancement methods are designed to remove distortions added by speech codecs. While classical methods are very low in complexity and add zero delay, their effectiveness is rather limited. Compared to that, DNN-based methods deliver higher quality but they are typically high in complexity and/or require delay. The recently proposed Linear Adaptive Coding Enhancer (LACE) addresses this…

    Submitted 12 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: final version, accepted at ICASSP 2024

  2. arXiv:2309.14507  [pdf, other]

    eess.AS cs.SD

    Noise-Robust DSP-Assisted Neural Pitch Estimation with Very Low Complexity

    Authors: Krishna Subramani, Jean-Marc Valin, Jan Buethe, Paris Smaragdis, Mike Goodwin

    Abstract: Pitch estimation is an essential step of many speech processing algorithms, including speech coding, synthesis, and enhancement. Recently, pitch estimators based on deep neural networks (DNNs) have been outperforming well-established DSP-based techniques. Unfortunately, these new estimators can be impractical to deploy in real-time systems, both because of their relatively high complexity, an…

    Submitted 16 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024, 5 pages

  3. arXiv:2212.04532  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Framewise WaveGAN: High Speed Adversarial Vocoder in Time Domain with Very Low Computational Complexity

    Authors: Ahmed Mustafa, Jean-Marc Valin, Jan Büthe, Paris Smaragdis, Mike Goodwin

    Abstract: GAN vocoders are currently one of the state-of-the-art methods for building high-quality neural waveform generative models. However, most of their architectures require dozens of billion floating-point operations per second (GFLOPS) to generate speech waveforms in samplewise manner. This makes GAN vocoders still challenging to run on normal CPUs without accelerators or parallel computers. In this…

    Submitted 1 March, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: Accepted to ICASSP 2023, demo: https://ahmed-fau.github.io/fwgan_demo/

  4. arXiv:2108.04051  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate

    Authors: Ahmed Mustafa, Jan Büthe, Srikanth Korse, Kishan Gupta, Guillaume Fuchs, Nicola Pia

    Abstract: Recently, GAN vocoders have seen rapid progress in speech synthesis, starting to outperform autoregressive models in perceptual quality with much higher generation speed. However, autoregressive vocoders are still the common choice for neural generation of speech signals coded at very low bit rates. In this paper, we present a GAN vocoder which is able to generate wideband speech waveforms from pa…

    Submitted 9 August, 2021; originally announced August 2021.

    Comments: Accepted to the 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2021)