Showing 1–5 of 5 results for author: Gruenstein, A

  1. arXiv:2205.03481  [pdf, other]

    eess.AS cs.SD eess.SP

    A Conformer-based Waveform-domain Neural Acoustic Echo Canceller Optimized for ASR Accuracy

    Authors: Sankaran Panchapagesan, Arun Narayanan, Turaj Zakizadeh Shabestary, Shuai Shao, Nathan Howard, Alex Park, James Walker, Alexander Gruenstein

    Abstract: Acoustic Echo Cancellation (AEC) is essential for accurate recognition of queries spoken to a smart speaker that is playing out audio. Previous work has shown that a neural AEC model operating on log-mel spectral features (denoted "logmel" hereafter) can greatly improve Automatic Speech Recognition (ASR) accuracy when optimized with an auxiliary loss utilizing a pre-trained ASR model encoder. In t…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: Submitted to Interspeech 2022

  2. arXiv:2106.00856  [pdf, other]

    eess.AS cs.SD

    A Neural Acoustic Echo Canceller Optimized Using An Automatic Speech Recognizer And Large Scale Synthetic Data

    Authors: Nathan Howard, Alex Park, Turaj Zakizadeh Shabestary, Alexander Gruenstein, Rohit Prabhavalkar

    Abstract: We consider the problem of recognizing speech utterances spoken to a device which is generating a known sound waveform; for example, recognizing queries issued to a digital assistant which is generating responses to previous user inputs. Previous work has proposed building acoustic echo cancellation (AEC) models for this task that optimize speech enhancement metrics using both neural network as we…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: To appear in ICASSP 2021

  3. arXiv:2011.06110  [pdf, other]

    eess.AS cs.SD

    Efficient Knowledge Distillation for RNN-Transducer Models

    Authors: Sankaran Panchapagesan, Daniel S. Park, Chung-Cheng Chiu, Yuan Shangguan, Qiao Liang, Alexander Gruenstein

    Abstract: Knowledge Distillation is an effective method of transferring knowledge from a large model to a smaller model. Distillation can be viewed as a type of model compression, and has played an important role for on-device ASR applications. In this paper, we develop a distillation method for RNN-Transducer (RNN-T) models, a popular end-to-end neural network architecture for streaming speech recognition.…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: 5 pages, 1 figure, 2 tables; submitted to ICASSP 2021

  4. arXiv:2009.04323  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP stat.ML

    VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition

    Authors: Quan Wang, Ignacio Lopez Moreno, Mert Saglam, Kevin Wilson, Alan Chiao, Renjie Liu, Yanzhang He, Wei Li, Jason Pelecanos, Marily Nika, Alexander Gruenstein

    Abstract: We introduce VoiceFilter-Lite, a single-channel source separation model that runs on the device to preserve only the speech signals from a target user, as part of a streaming speech recognition system. Delivering such a model presents numerous challenges: It should improve the performance when the input signal consists of overlapped speech, and must not hurt the speech recognition performance unde…

    Submitted 9 September, 2020; originally announced September 2020.

  5. arXiv:1712.03603  [pdf, other]

    cs.SD eess.AS

    A Cascade Architecture for Keyword Spotting on Mobile Devices

    Authors: Alexander Gruenstein, Raziel Alvarez, Chris Thornton, Mohammadali Ghodrat

    Abstract: We present a cascade architecture for keyword spotting with speaker verification on mobile devices. By pairing a small computational footprint with specialized digital signal processing (DSP) chips, we are able to achieve low power consumption while continuously listening for a keyword.

    Submitted 10 December, 2017; originally announced December 2017.

    Comments: 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA