Showing 1–1 of 1 results for author: de Falco, E

  1. arXiv:2302.04975  [pdf, other]

    cs.CL

    Leveraging supplementary text data to kick-start automatic speech recognition system development with limited transcriptions

    Authors: Nay San, Martijn Bartelds, Blaine Billings, Ella de Falco, Hendi Feriza, Johan Safri, Wawan Sahrozi, Ben Foley, Bradley McDonnell, Dan Jurafsky

    Abstract: Recent research using pre-trained transformer models suggests that just 10 minutes of transcribed speech may be enough to fine-tune such a model for automatic speech recognition (ASR) -- at least if we can also leverage vast amounts of text data (803 million tokens). But is that much text data necessary? We study the use of different amounts of text data, both for creating a lexicon that constrain…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: Accepted for ComputEL-6