Showing 1–1 of 1 results for author: Dovzhenko, Y

  1. arXiv:2104.02014  [pdf, other]

    cs.CL eess.AS

    SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

    Authors: Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko

    Abstract: In the English speech-to-text (STT) machine learning task, acoustic models are conventionally trained on uncased Latin characters, and any necessary orthography (such as capitalization, punctuation, and denormalization of non-standard words) is imputed by separate post-processing models. This adds complexity and limits performance, as many formatting tasks benefit from semantic information present…

    Submitted 6 April, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

    Comments: 5 pages, 1 figure. Submitted to INTERSPEECH 2021