Showing 1–4 of 4 results for author: Macoskey, J

  1. arXiv:2207.02393  [pdf, other]

    cs.CL cs.SD eess.AS

    Compute Cost Amortized Transformer for Streaming ASR

    Authors: Yi Xie, Jonathan Macoskey, Martin Radfar, Feng-Ju Chang, Brian King, Ariya Rastrow, Athanasios Mouchtaris, Grant P. Strimel

    Abstract: We present a streaming, Transformer-based end-to-end automatic speech recognition (ASR) architecture which achieves efficient neural inference through compute cost amortization. Our architecture creates sparse computation pathways dynamically at inference time, resulting in selective use of compute resources throughout decoding, enabling significant reductions in compute with minimal impact on acc…

    Submitted 4 July, 2022; originally announced July 2022.

  2. arXiv:2108.01704  [pdf, other]

    eess.AS cs.SD

    Bifocal Neural ASR: Exploiting Keyword Spotting for Inference Optimization

    Authors: Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

    Abstract: We present Bifocal RNN-T, a new variant of the Recurrent Neural Network Transducer (RNN-T) architecture designed for improved inference time latency on speech recognition tasks. The architecture enables a dynamic pivot for its runtime compute pathway, namely taking advantage of keyword spotting to select which component of the network to execute for a given audio frame. To accomplish this, we leve…

    Submitted 3 August, 2021; originally announced August 2021.

    Comments: Accepted at ICASSP 2021

  3. arXiv:2108.01561  [pdf, other]

    eess.AS cs.SD

    Learning a Neural Diff for Speech Models

    Authors: Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

    Abstract: As more speech processing applications execute locally on edge devices, a set of resource constraints must be considered. In this work we address one of these constraints, namely over-the-network data budgets for transferring models from server to device. We present neural update approaches for release of subsequent speech model generations abiding by a data budget. We detail two architecture-agno…

    Submitted 17 August, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: Accepted at Interspeech 2021

  4. arXiv:2108.01553  [pdf, other]

    eess.AS cs.SD

    Amortized Neural Networks for Low-Latency Speech Recognition

    Authors: Jonathan Macoskey, Grant P. Strimel, Jinru Su, Ariya Rastrow

    Abstract: We introduce Amortized Neural Networks (AmNets), a compute cost- and latency-aware network architecture particularly well-suited for sequence modeling tasks. We apply AmNets to the Recurrent Neural Network Transducer (RNN-T) to reduce compute cost and latency for an automatic speech recognition (ASR) task. The AmNets RNN-T architecture enables the network to dynamically switch between encoder bran…

    Submitted 3 August, 2021; originally announced August 2021.

    Comments: Accepted at Interspeech 2021