Showing 1–9 of 9 results for author: Parthasarathi, S H K

  1. arXiv:2303.02284  [pdf, other]

    eess.AS cs.AI cs.LG eess.SP

    Fixed-point quantization aware training for on-device keyword-spotting

    Authors: Sashank Macha, Om Oza, Alex Escott, Francesco Caliva, Robbie Armitano, Santosh Kumar Cheekatmalla, Sree Hari Krishnan Parthasarathi, Yuzong Liu

    Abstract: Fixed-point (FXP) inference has proven suitable for embedded devices with limited computational resources, and yet model training is continually performed in floating-point (FLP). FXP training has not been fully explored and the non-trivial conversion from FLP to FXP presents unavoidable performance drop. We propose a novel method to train and obtain FXP convolutional keyword-spotting (KWS) models…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: 5 pages, 3 figures, 4 tables

    Journal ref: ICASSP 2023

  2. arXiv:2302.11054  [pdf, other]

    cs.CL

    Conversational Text-to-SQL: An Odyssey into State-of-the-Art and Challenges Ahead

    Authors: Sree Hari Krishnan Parthasarathi, Lu Zeng, Dilek Hakkani-Tur

    Abstract: Conversational, multi-turn, text-to-SQL (CoSQL) tasks map natural language utterances in a dialogue to SQL queries. State-of-the-art (SOTA) systems use large, pre-trained and finetuned language models, such as the T5-family, in conjunction with constrained decoding. With multi-tasking (MT) over coherent tasks with discrete prompts during training, we improve over specialized text-to-SQL T5-family…

    Submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted for publication at ICASSP 2023

  3. arXiv:2210.10668  [pdf, other]

    cs.CL cs.AI

    N-Best Hypotheses Reranking for Text-To-SQL Systems

    Authors: Lu Zeng, Sree Hari Krishnan Parthasarathi, Dilek Hakkani-Tur

    Abstract: Text-to-SQL task maps natural language utterances to structured queries that can be issued to a database. State-of-the-art (SOTA) systems rely on finetuning large, pre-trained language models in conjunction with constrained decoding applying a SQL parser. On the well established Spider dataset, we begin with Oracle studies: specifically, choosing an Oracle hypothesis from a SOTA model's 10-best li…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: Accepted for publication at IEEE SLT'22

  4. arXiv:2207.06920  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    Sub 8-Bit Quantization of Streaming Keyword Spotting Models for Embedded Chipsets

    Authors: Lu Zeng, Sree Hari Krishnan Parthasarathi, Yuzong Liu, Alex Escott, Santosh Kumar Cheekatmalla, Nikko Strom, Shiv Vitaladevuni

    Abstract: We propose a novel 2-stage sub 8-bit quantization aware training algorithm for all components of a 250K parameter feedforward, streaming, state-free keyword spotting model. For the 1st-stage, we adapt a recently proposed quantization technique using a non-linear transformation with tanh(.) on dense layer weights. In the 2nd-stage, we use linear quantization methods on the rest of the network, incl…

    Submitted 8 September, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

  5. arXiv:2207.06423  [pdf, ps, other]

    cs.SD cs.AI cs.LG eess.AS

    Wakeword Detection under Distribution Shifts

    Authors: Sree Hari Krishnan Parthasarathi, Lu Zeng, Christin Jose, Joseph Wang

    Abstract: We propose a novel approach for semi-supervised learning (SSL) designed to overcome distribution shifts between training and real-world data arising in the keyword spotting (KWS) task. Shifts from training data distribution are a key challenge for real-world KWS tasks: when a new model is deployed on device, the gating of the accepted data undergoes a shift in distribution, making the problem of t…

    Submitted 13 July, 2022; originally announced July 2022.

  6. arXiv:2106.06126  [pdf, other]

    cs.SD cs.LG eess.AS

    Exploiting Large-scale Teacher-Student Training for On-device Acoustic Models

    Authors: Jing Liu, Rupak Vignesh Swaminathan, Sree Hari Krishnan Parthasarathi, Chunchuan Lyu, Athanasios Mouchtaris, Siegfried Kunzmann

    Abstract: We present results from Alexa speech teams on semi-supervised learning (SSL) of acoustic models (AM) with experiments spanning over 3000 hours of GPU time, making our study one of the largest of its kind. We discuss SSL for AMs in a small footprint setting, showing that a smaller capacity model trained with 1 million hours of unsupervised data can outperform a baseline supervised system by 14.3% w…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: TSD2021

  7. arXiv:1904.10584  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Realizing Petabyte Scale Acoustic Modeling

    Authors: Sree Hari Krishnan Parthasarathi, Nitin Sivakrishnan, Pranav Ladkat, Nikko Strom

    Abstract: Large scale machine learning (ML) systems such as the Alexa automatic speech recognition (ASR) system continue to improve with increasing amounts of manually transcribed training data. Instead of scaling manual transcription to impractical levels, we utilize semi-supervised learning (SSL) to learn acoustic models (AM) from the vast firehose of untranscribed audio data. Learning an AM from 1 Millio…

    Submitted 23 April, 2019; originally announced April 2019.

    Comments: 2156-3357 ©2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information

  8. arXiv:1904.01624  [pdf, ps, other]

    cs.LG cs.SD eess.AS stat.ML

    Lessons from Building Acoustic Models with a Million Hours of Speech

    Authors: Sree Hari Krishnan Parthasarathi, Nikko Strom

    Abstract: This is a report of our lessons learned building acoustic models from 1 Million hours of unlabeled speech, while labeled speech is restricted to 7,000 hours. We employ student/teacher training on unlabeled data, helping scale out target generation in comparison to confidence model based methods, which require a decoder and a confidence model. To optimize storage and to parallelize target generatio…

    Submitted 2 April, 2019; originally announced April 2019.

    Comments: "Copyright 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works."

  9. arXiv:1901.02348  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Improving noise robustness of automatic speech recognition via parallel data and teacher-student learning

    Authors: Ladislav Mošner, Minhua Wu, Anirudh Raju, Sree Hari Krishnan Parthasarathi, Kenichi Kumatani, Shiva Sundaram, Roland Maas, Björn Hoffmeister

    Abstract: For real-world speech recognition applications, noise robustness is still a challenge. In this work, we adopt the teacher-student (T/S) learning technique using a parallel clean and noisy corpus for improving automatic speech recognition (ASR) performance under multimedia noise. On top of that, we apply a logits selection method which only preserves the k highest values to prevent wrong emphasis o…

    Submitted 15 March, 2019; v1 submitted 5 January, 2019; originally announced January 2019.

    Comments: To Appear in ICASSP 2019