Showing 1–5 of 5 results for author: Burchi, M

  1. arXiv:2405.12983  [pdf, other]

    eess.AS cs.AI cs.CV cs.MM cs.SD

    Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer

    Authors: Maxime Burchi, Krishna C. Puvvada, Jagadeesh Balam, Boris Ginsburg, Radu Timofte

    Abstract: Humans are adept at leveraging visual cues from lip movements for recognizing speech in adverse listening conditions. Audio-Visual Speech Recognition (AVSR) models follow similar approach to achieve robust speech recognition in noisy conditions. In this work, we present a multilingual AVSR model incorporating several enhancements to improve performance and audio noise robustness. Notably, we adapt…

    Submitted 13 March, 2024; originally announced May 2024.

  2. arXiv:2301.01456  [pdf, other]

    cs.CV cs.CL cs.SD eess.AS

    Audio-Visual Efficient Conformer for Robust Speech Recognition

    Authors: Maxime Burchi, Radu Timofte

    Abstract: End-to-end Automatic Speech Recognition (ASR) systems based on neural networks have seen large improvements in recent years. The availability of large scale hand-labeled datasets and sufficient computing resources made it possible to train powerful deep neural networks, reaching very low Word Error Rate (WER) on academic benchmarks. However, despite impressive performance on clean audio samples, a…

    Submitted 4 January, 2023; originally announced January 2023.

  3. arXiv:2209.11345  [pdf, other]

    cs.CV eess.IV

    Swin2SR: SwinV2 Transformer for Compressed Image Super-Resolution and Restoration

    Authors: Marcos V. Conde, Ui-Jin Choi, Maxime Burchi, Radu Timofte

    Abstract: Compression plays an important role on the efficient transmission and storage of images and videos through band-limited systems such as streaming services, virtual reality or videogames. However, compression unavoidably leads to artifacts and the loss of the original information, which may severely degrade the visual quality. For these reasons, quality enhancement of compressed images has become a…

    Submitted 22 September, 2022; originally announced September 2022.

    Comments: European Conference on Computer Vision (ECCV 2022) Workshops

  4. arXiv:2204.12819  [pdf, other]

    eess.IV cs.CV

    Conformer and Blind Noisy Students for Improved Image Quality Assessment

    Authors: Marcos V. Conde, Maxime Burchi, Radu Timofte

    Abstract: Generative models for image restoration, enhancement, and generation have significantly improved the quality of the generated images. Surprisingly, these models produce more pleasant images to the human eye than other methods, yet, they may get a lower perceptual quality score using traditional perceptual quality metrics such as PSNR or SSIM. Therefore, it is necessary to develop a quantitative me…

    Submitted 27 April, 2022; originally announced April 2022.

    Comments: CVPR NTIRE 2022

  5. arXiv:2109.01163  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    Efficient conformer: Progressive downsampling and grouped attention for automatic speech recognition

    Authors: Maxime Burchi, Valentin Vielzeuf

    Abstract: The recently proposed Conformer architecture has shown state-of-the-art performances in Automatic Speech Recognition by combining convolution with attention to model both local and global dependencies. In this paper, we study how to reduce the Conformer architecture complexity with a limited computing budget, leading to a more efficient architecture design that we call Efficient Conformer. We intr…

    Submitted 8 September, 2021; v1 submitted 31 August, 2021; originally announced September 2021.

    Journal ref: ASRU 2021, Dec 2021, Cartagena, Colombia