Showing 1–8 of 8 results for author: Swaminathan, R V

  1. Accelerator-Aware Training for Transducer-Based Speech Recognition

    Authors: Suhaila M. Shakiah, Rupak Vignesh Swaminathan, Hieu Duy Nguyen, Raviteja Chinta, Tariq Afzal, Nathan Susanj, Athanasios Mouchtaris, Grant P. Strimel, Ariya Rastrow

    Abstract: Machine learning model weights and activations are represented in full-precision during training. This leads to performance degradation in runtime when deployed on neural network accelerator (NNA) chips, which leverage highly parallelized fixed-point arithmetic to improve runtime memory and latency. In this work, we replicate the NNA operators during the training phase, accounting for the degradat…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: Accepted to SLT 2022

    Journal ref: IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 100-107

  2. arXiv:2303.00692  [pdf, other]

    eess.AS

    Leveraging Redundancy in Multiple Audio Signals for Far-Field Speech Recognition

    Authors: Feng-Ju Chang, Anastasios Alexandridis, Rupak Vignesh Swaminathan, Martin Radfar, Harish Mallidi, Maurizio Omologo, Athanasios Mouchtaris, Brian King, Roland Maas

    Abstract: To achieve robust far-field automatic speech recognition (ASR), existing techniques typically employ an acoustic front end (AFE) cascaded with a neural transducer (NT) ASR model. The AFE output, however, could be unreliable, as the beamforming output in AFE is steered to a wrong direction. A promising way to address this issue is to exploit the microphone signals before the beamforming stage and a…

    Submitted 1 March, 2023; originally announced March 2023.

  3. arXiv:2210.16238  [pdf, ps, other]

    eess.AS cs.LG cs.SD eess.SP

    Contextual-Utterance Training for Automatic Speech Recognition

    Authors: Alejandro Gomez-Alanis, Lukas Drude, Andreas Schwarz, Rupak Vignesh Swaminathan, Simon Wiesler

    Abstract: Recent studies of streaming automatic speech recognition (ASR) recurrent neural network transducer (RNN-T)-based systems have fed the encoder with past contextual information in order to improve its word error rate (WER) performance. In this paper, we first propose a contextual-utterance training technique which makes use of the previous and future contextual utterances in order to do an implicit…

    Submitted 27 October, 2022; originally announced October 2022.

  4. arXiv:2209.14868  [pdf, other]

    cs.SD cs.CL eess.AS

    ConvRNN-T: Convolutional Augmented Recurrent Neural Network Transducers for Streaming Speech Recognition

    Authors: Martin Radfar, Rohit Barnwal, Rupak Vignesh Swaminathan, Feng-Ju Chang, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris

    Abstract: The recurrent neural network transducer (RNN-T) is a prominent streaming end-to-end (E2E) ASR technology. In RNN-T, the acoustic encoder commonly consists of stacks of LSTMs. Very recently, as an alternative to LSTM layers, the Conformer architecture was introduced where the encoder of RNN-T is replaced with a modified Transformer encoder composed of convolutional layers at the frontend and betwee…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: This paper was presented in Interspeech 2022

  5. arXiv:2106.07734  [pdf, other]

    cs.CL cs.LG eess.AS

    CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition

    Authors: Rupak Vignesh Swaminathan, Brian King, Grant P. Strimel, Jasha Droppo, Athanasios Mouchtaris

    Abstract: We propose a simple yet effective method to compress an RNN-Transducer (RNN-T) through the well-known knowledge distillation paradigm. We show that the transducer's encoder outputs naturally have a high entropy and contain rich information about acoustically similar word-piece confusions. This rich information is suppressed when combined with the lower entropy decoder outputs to produce the joint…

    Submitted 14 June, 2021; originally announced June 2021.

    Comments: Accepted at InterSpeech 2021

  6. arXiv:2106.06126  [pdf, other]

    cs.SD cs.LG eess.AS

    Exploiting Large-scale Teacher-Student Training for On-device Acoustic Models

    Authors: Jing Liu, Rupak Vignesh Swaminathan, Sree Hari Krishnan Parthasarathi, Chunchuan Lyu, Athanasios Mouchtaris, Siegfried Kunzmann

    Abstract: We present results from Alexa speech teams on semi-supervised learning (SSL) of acoustic models (AM) with experiments spanning over 3000 hours of GPU time, making our study one of the largest of its kind. We discuss SSL for AMs in a small footprint setting, showing that a smaller capacity model trained with 1 million hours of unsupervised data can outperform a baseline supervised system by 14.3% w…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: TSD2021

  7. arXiv:1809.08671  [pdf, other]

    physics.ao-ph physics.comp-ph physics.flu-dyn

    Jovian vortices and jets

    Authors: Glenn R. Flierl, Philip J. Morrison, Rohith Vilasur Swaminathan

    Abstract: We explore the conditions required for isolated vortices to exist in sheared zonal flows and the stability of the underlying zonal winds. This is done using the standard 2-layer quasigeostrophic model with the lower layer depth becoming infinite; however, this model differs from the usual layer model because the lower layer is not assumed to be motionless but has a steady configuration of alternat…

    Submitted 23 September, 2018; originally announced September 2018.

  8. Dynamics of circular arrangements of vorticity in two dimensions

    Authors: Rohith V. Swaminathan, S. Ravichandran, Prasad Perlekar, Rama Govindarajan

    Abstract: The merger of two like-signed vortices is a well-studied problem, but in a turbulent flow, we may often have more than two like-signed vortices interacting. We study the merger of three or more identical co-rotating vortices initially arranged on the vertices of a regular polygon. At low to moderate Reynolds numbers, we find an additional stage in the merger process, absent in the merger of two vo…

    Submitted 21 June, 2016; v1 submitted 17 June, 2015; originally announced June 2015.

    Comments: Abstract truncated. Paper to appear in Physical Review E

    Journal ref: Phys. Rev. E 94, 013105 (2016)