Showing 1–12 of 12 results for author: Tomanek, K

Searching in archive cs.
  1. arXiv:2403.08904  [pdf, other]

    cs.CL

    Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics

    Authors: Tyler A. Chang, Katrin Tomanek, Jessica Hoffmann, Nithum Thain, Erin van Liemt, Kathleen Meier-Hellstern, Lucas Dixon

    Abstract: We explore a strategy to handle controversial topics in LLM-based chatbots based on Wikipedia's Neutral Point of View (NPOV) principle: acknowledge the absence of a single true answer and surface multiple perspectives. We frame this as retrieval augmented generation, where perspectives are retrieved from a knowledge base and the LLM is tasked with generating a fluent and faithful response from the…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  2. arXiv:2312.14327  [pdf, other]

    cs.CL

    Parameter Efficient Tuning Allows Scalable Personalization of LLMs for Text Entry: A Case Study on Abbreviation Expansion

    Authors: Katrin Tomanek, Shanqing Cai, Subhashini Venugopalan

    Abstract: Abbreviation expansion is a strategy used to speed up communication by limiting the amount of typing and using a language model to suggest expansions. Here we look at personalizing a Large Language Model's (LLM) suggestions based on prior conversations to enhance the relevance of predictions, particularly when the user data is small (~1000 samples). Specifically, we compare fine-tuning, prompt-tun…

    Submitted 21 December, 2023; originally announced December 2023.

  3. arXiv:2312.01532  [pdf, other]

    cs.HC cs.CL

    Using Large Language Models to Accelerate Communication for Users with Severe Motor Impairments

    Authors: Shanqing Cai, Subhashini Venugopalan, Katie Seaver, Xiang Xiao, Katrin Tomanek, Sri Jalasutram, Meredith Ringel Morris, Shaun Kane, Ajit Narayanan, Robert L. MacDonald, Emily Kornman, Daniel Vance, Blair Casey, Steve M. Gleason, Philip Q. Nelson, Michael P. Brenner

    Abstract: Finding ways to accelerate text input for individuals with profound motor impairments has been a long-standing area of research. Closing the speed gap for augmentative and alternative communication (AAC) devices such as eye-tracking keyboards is important for improving the quality of life for such individuals. Recent advances in neural networks of natural language pose new opportunities for re-thi…

    Submitted 3 December, 2023; originally announced December 2023.

  4. arXiv:2302.06541  [pdf, other]

    cs.CL

    Towards Agile Text Classifiers for Everyone

    Authors: Maximilian Mozes, Jessica Hoffmann, Katrin Tomanek, Muhamed Kouate, Nithum Thain, Ann Yuan, Tolga Bolukbasi, Lucas Dixon

    Abstract: Text-based safety classifiers are widely used for content moderation and increasingly to tune generative language model behavior - a topic of growing concern for the safety of digital assistants and chatbots. However, different policies require different classifiers, and safety policies themselves improve from iteration and adaptation. This paper introduces and evaluates methods for agile text cla…

    Submitted 21 October, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: Findings of EMNLP 2023

  5. arXiv:2211.00089  [pdf, other]

    eess.AS cs.CL cs.SD

    An analysis of degenerating speech due to progressive dysarthria on ASR performance

    Authors: Katrin Tomanek, Katie Seaver, Pan-Pan Jiang, Richard Cave, Lauren Harrel, Jordan R. Green

    Abstract: Although personalized automatic speech recognition (ASR) models have recently been designed to recognize even severely impaired speech, model performance may degrade over time for persons with degenerating speech. The aims of this study were to (1) analyze the change of performance of ASR over time in individuals with degrading speech, and (2) explore mitigation strategies to optimize recognition…

    Submitted 31 October, 2022; originally announced November 2022.

    Comments: Submitted to ICASSP 2023

  6. arXiv:2209.10591  [pdf, other]

    eess.AS cs.CL cs.LG

    Assessing ASR Model Quality on Disordered Speech using BERTScore

    Authors: Jimmy Tobin, Qisheng Li, Subhashini Venugopalan, Katie Seaver, Richard Cave, Katrin Tomanek

    Abstract: Word Error Rate (WER) is the primary metric used to assess automatic speech recognition (ASR) model quality. It has been shown that ASR models tend to have much higher WER on speakers with speech impairments than typical English speakers. It is hard to determine if models can be useful at such high error rates. This study investigates the use of BERTScore, an evaluation metric for text generati…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: Accepted to Interspeech 2022 Workshop on Speech for Social Good

  7. arXiv:2205.03767  [pdf, other]

    cs.CL

    Context-Aware Abbreviation Expansion Using Large Language Models

    Authors: Shanqing Cai, Subhashini Venugopalan, Katrin Tomanek, Ajit Narayanan, Meredith Ringel Morris, Michael P. Brenner

    Abstract: Motivated by the need for accelerating text entry in augmentative and alternative communication (AAC) for people with severe motor impairments, we propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters. Our approach is to expand the abbreviations into full-phrase options by leveraging conversation context with the power of pretrained large language model…

    Submitted 10 May, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: 15 pages, 7 figures, 8 tables. Accepted as a long paper at NAACL 2022

  8. arXiv:2110.04612  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Personalized Automatic Speech Recognition Trained on Small Disordered Speech Datasets

    Authors: Jimmy Tobin, Katrin Tomanek

    Abstract: This study investigates the performance of personalized automatic speech recognition (ASR) for recognizing disordered speech using small amounts of per-speaker adaptation data. We trained personalized models for 195 individuals with different types and severities of speech impairment with training sets ranging in size from <1 minute to 18-20 minutes of speech data. Word error rate (WER) thresholds…

    Submitted 9 October, 2021; originally announced October 2021.

    Comments: Submitted to ICASSP 2022

  9. arXiv:2109.06952  [pdf, other]

    cs.CL cs.SD eess.AS

    Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech

    Authors: Katrin Tomanek, Vicky Zayats, Dirk Padfield, Kara Vaillancourt, Fadi Biadsy

    Abstract: Automatic Speech Recognition (ASR) systems are often optimized to work best for speakers with canonical speech patterns. Unfortunately, these systems perform poorly when tested on atypical speech and heavily accented speech. It has previously been shown that personalization through model fine-tuning substantially improves performance. However, maintaining such large models per speaker is costly an…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP 2021

  10. arXiv:2107.03985  [pdf, other]

    eess.AS cs.LG cs.SD

    Comparing Supervised Models And Learned Speech Representations For Classifying Intelligibility Of Disordered Speech On Selected Phrases

    Authors: Subhashini Venugopalan, Joel Shor, Manoj Plakal, Jimmy Tobin, Katrin Tomanek, Jordan R. Green, Michael P. Brenner

    Abstract: Automatic classification of disordered speech can provide an objective tool for identifying the presence and severity of speech impairment. Classification approaches can also help identify hard-to-recognize speech samples to teach ASR systems about the variable manifestations of impaired speech. Here, we develop and compare different deep learning techniques to classify the intelligibility of diso…

    Submitted 8 July, 2021; originally announced July 2021.

    Comments: Accepted at INTERSPEECH 2021

  11. arXiv:2106.10259  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    On-Device Personalization of Automatic Speech Recognition Models for Disordered Speech

    Authors: Katrin Tomanek, Françoise Beaufays, Julie Cattiau, Angad Chandorkar, Khe Chai Sim

    Abstract: While current state-of-the-art Automatic Speech Recognition (ASR) systems achieve high accuracy on typical speech, they suffer from significant performance degradation on disordered speech and other atypical speech patterns. Personalization of ASR models, a commonly applied solution to this problem, is usually performed in a server-based training environment posing problems around data privacy, de…

    Submitted 18 June, 2021; originally announced June 2021.

  12. arXiv:1902.08295  [pdf, other]

    cs.LG stat.ML

    Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

    Authors: Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, et al. (66 additional authors not shown)

    Abstract: Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus towards sequence-to-sequence models. Lingvo models are composed of modular building blocks that are flexible and easily extensible, and experiment configurations are centralized and highly customizable. Distributed training and quantized inference are supported directly w…

    Submitted 21 February, 2019; originally announced February 2019.