Showing 1–3 of 3 results for author: Gimeno, P

Searching in archive cs.
  1. arXiv:2309.07478  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Direct Text to Speech Translation System using Acoustic Units

    Authors: Victoria Mingote, Pablo Gimeno, Luis Vicente, Sameer Khurana, Antoine Laurent, Jarod Duret

    Abstract: This paper proposes a direct text to speech translation system using discrete acoustic units. This framework employs text in different source languages as input to generate speech in the target language without the need for text transcriptions in this language. Motivated by the success of acoustic units in previous works for direct speech to speech translation systems, we use the same pipeline to…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: 5 pages, 4 figures

  2. arXiv:2306.00789  [pdf, other]

    cs.CL cs.AI eess.AS eess.SP

    Improved Cross-Lingual Transfer Learning For Automatic Speech Translation

    Authors: Sameer Khurana, Nauman Dawalatabad, Antoine Laurent, Luis Vicente, Pablo Gimeno, Victoria Mingote, James Glass

    Abstract: Research in multilingual speech-to-text translation is topical. Having a single model that supports multiple translation tasks is desirable. The goal of this work is to improve cross-lingual transfer learning in multilingual speech-to-text translation via semantic knowledge distillation. We show that by initializing the encoder of the encoder-decoder sequence-to-sequence translation model with SAM…

    Submitted 25 January, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

  3. arXiv:2110.14425  [pdf, other]

    cs.SD cs.LG eess.AS

    Generalizing AUC Optimization to Multiclass Classification for Audio Segmentation With Limited Training Data

    Authors: Pablo Gimeno, Victoria Mingote, Alfonso Ortega, Antonio Miguel, Eduardo Lleida

    Abstract: Area under the ROC curve (AUC) optimisation techniques developed for neural networks have recently demonstrated their capabilities in different audio and speech related tasks. However, due to its intrinsic nature, AUC optimisation has focused only on binary tasks so far. In this paper, we introduce an extension to the AUC optimisation framework so that it can be easily applied to an arbitrary numb…

    Submitted 27 October, 2021; originally announced October 2021.

    Journal ref: IEEE Signal Processing Letters, vol. 28, pp. 1135-1139, 2021