Showing 1–2 of 2 results for author: McCarthy, A D

  1. arXiv:2002.12231  [pdf, other]

    eess.AS cs.CL cs.SD

    SkinAugment: Auto-Encoding Speaker Conversions for Automatic Speech Translation

    Authors: Arya D. McCarthy, Liezl Puzon, Juan Pino

    Abstract: We propose autoencoding speaker conversion for training data augmentation in automatic speech translation. This technique directly transforms an audio sequence, resulting in audio synthesized to resemble another speaker's voice. Our method compares favorably to SpecAugment on English$\to$French and English$\to$Romanian automatic speech translation (AST) tasks as well as on a low-resource English a…

    Submitted 27 February, 2020; originally announced February 2020.

    Comments: Accepted to ICASSP 2020

  2. arXiv:1909.06515  [pdf, other]

    cs.CL cs.SD eess.AS

    Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade

    Authors: Juan Pino, Liezl Puzon, Jiatao Gu, Xutai Ma, Arya D. McCarthy, Deepak Gopinath

    Abstract: For automatic speech translation (AST), end-to-end approaches are outperformed by cascaded models that transcribe with automatic speech recognition (ASR), then translate with machine translation (MT). A major cause of the performance gap is that, while existing AST corpora are small, massive datasets exist for both the ASR and MT subsystems. In this work, we evaluate several data augmentation and…

    Submitted 22 October, 2019; v1 submitted 13 September, 2019; originally announced September 2019.

    Comments: IWSLT 2019