Showing 1–3 of 3 results for author: Sarawagi, S

  1. arXiv:2307.05006

    cs.CL cs.LG eess.AS

    Improving RNN-Transducers with Acoustic LookAhead

    Authors: Vinit S. Unni, Ashish Mittal, Preethi Jyothi, Sunita Sarawagi

    Abstract: RNN-Transducers (RNN-Ts) have gained widespread acceptance as an end-to-end model for speech to text conversion because of their high accuracy and streaming capabilities. A typical RNN-T independently encodes the input audio and the text context, and combines the two encodings by a thin joint network. While this architecture provides SOTA streaming accuracy, it also makes the model vulnerable to s…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: 5 pages, 1 fig, 7 tables, Proceedings of Interspeech 2023

  2. arXiv:2103.03142

    cs.SD cs.CL eess.AS

    Error-driven Fixed-Budget ASR Personalization for Accented Speakers

    Authors: Abhijeet Awasthi, Aman Kansal, Sunita Sarawagi, Preethi Jyothi

    Abstract: We consider the task of personalizing ASR models while being constrained by a fixed budget on recording speaker-specific utterances. Given a speaker and an ASR model, we propose a method of identifying sentences for which the speaker's utterances are likely to be harder for the given ASR model to recognize. We assume a tiny amount of speaker-specific data to learn phoneme-level error models which…

    Submitted 2 June, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

    Comments: In ICASSP 2021

  3. arXiv:2006.13519

    eess.AS cs.CL cs.SD

    Black-box Adaptation of ASR for Accented Speech

    Authors: Kartik Khandelwal, Preethi Jyothi, Abhijeet Awasthi, Sunita Sarawagi

    Abstract: We introduce the problem of adapting a black-box, cloud-based ASR system to speech from a target accent. While leading online ASR services obtain impressive performance on mainstream accents, they perform poorly on sub-populations: we observed that the word error rate (WER) achieved by Google's ASR API on Indian accents is almost twice the WER on US accents. Existing adaptation methods either re…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: A slightly different version submitted to INTERSPEECH 2020 (currently under review)