Showing 1–4 of 4 results for author: Delworth, N

  1. arXiv:2309.15013  [pdf, other]

    cs.CL cs.SD eess.AS

    Updated Corpora and Benchmarks for Long-Form Speech Recognition

    Authors: Jennifer Drexler Fox, Desh Raj, Natalie Delworth, Quinn McNamara, Corey Miller, Migüel Jetté

    Abstract: The vast majority of ASR research uses corpora in which both the training and test data have been pre-segmented into utterances. In most real-world ASR use cases, however, test audio is not segmented, leading to a mismatch between inference-time conditions and models trained on segmented utterances. In this paper, we re-release three standard ASR corpora - TED-LIUM 3, GigaSpeech, and VoxPopuli-en -…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Submitted to ICASSP 2024

  2. arXiv:2209.01250  [pdf, other]

    cs.CL cs.SD eess.AS

    Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model

    Authors: Jennifer Drexler Fox, Natalie Delworth

    Abstract: Contextual ASR, which takes a list of bias terms as input along with audio, has drawn recent interest as ASR use becomes more widespread. We are releasing contextual biasing lists to accompany the Earnings21 dataset, creating a public benchmark for this task. We present baseline results on this benchmark using a pretrained end-to-end ASR model from the WeNet toolkit. We show results for shallow fu…

    Submitted 2 September, 2022; originally announced September 2022.

  3. Earnings-21: A Practical Benchmark for ASR in the Wild

    Authors: Miguel Del Rio, Natalie Delworth, Ryan Westerman, Michelle Huang, Nishchal Bhandari, Joseph Palakapilly, Quinten McNamara, Joshua Dong, Piotr Zelasko, Miguel Jette

    Abstract: Commonly used speech corpora inadequately challenge academic and commercial ASR systems. In particular, speech corpora lack metadata needed for detailed analysis and WER measurement. In response, we present Earnings-21, a 39-hour corpus of earnings calls containing entity-dense speech from nine different financial sectors. This corpus is intended to benchmark ASR systems in the wild with special a…

    Submitted 15 June, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

    Comments: Accepted to INTERSPEECH 2021. June 15 2021: Addressing the comments of reviewers and updating the results of our internal ESPNet model. The results do not change our conclusions. April 28th, 2021: We found and resolved an issue in our experimental evaluation that scored the LibriSpeech model at ~20% worse relative WER than the actual WER. The updated results do not affect our conclusions

  4. arXiv:2104.10747  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Accented Speech Recognition: A Survey

    Authors: Arthur Hinsvark, Natalie Delworth, Miguel Del Rio, Quinten McNamara, Joshua Dong, Ryan Westerman, Michelle Huang, Joseph Palakapilly, Jennifer Drexler, Ilya Pirkin, Nishchal Bhandari, Miguel Jette

    Abstract: Automatic Speech Recognition (ASR) systems generalize poorly on accented speech. The phonetic and linguistic variability of accents presents hard challenges for ASR systems today in both data collection and modeling strategies. The resulting bias in ASR performance across accents comes at a cost to both users and providers of ASR. We present a survey of current promising approaches to accented sp…

    Submitted 2 June, 2021; v1 submitted 21 April, 2021; originally announced April 2021.