Showing 1–26 of 26 results for author: Fosler-Lussier, E

  1. arXiv:2402.11676

    cs.CL cs.AI

    A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models

    Authors: Jaylen Jones, Lingbo Mo, Eric Fosler-Lussier, Huan Sun

    Abstract: Counter narratives - informed responses to hate speech contexts designed to refute hateful claims and de-escalate encounters - have emerged as an effective hate speech intervention strategy. While previous work has proposed automatic counter narrative generation methods to aid manual interventions, the evaluation of these approaches remains underdeveloped. Previous automatic metrics for counter na…

    Submitted 29 March, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: 22 pages, camera-ready version; references added, typos corrected, methodology section expanded, additional table

  2. arXiv:2310.11486

    eess.AS cs.AI cs.LG

    End-to-End real time tracking of children's reading with pointer network

    Authors: Vishal Sunder, Beulah Karrolla, Eric Fosler-Lussier

    Abstract: In this work, we explore how a real time reading tracker can be built efficiently for children's voices. While previously proposed reading trackers focused on ASR-based cascaded approaches, we propose a fully end-to-end model making it less prone to lags in voice tracking. We employ a pointer network that directly learns to predict positions in the ground truth text conditioned on the streaming sp…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: 5 pages, 3 figures

  3. arXiv:2310.06302

    cs.CL

    Selective Demonstrations for Cross-domain Text-to-SQL

    Authors: Shuaichen Chang, Eric Fosler-Lussier

    Abstract: Large language models (LLMs) with in-context learning have demonstrated impressive generalization capabilities in the cross-domain text-to-SQL task, without the use of in-domain annotations. However, incorporating in-domain demonstration examples has been found to greatly enhance LLMs' performance. In this paper, we delve into the key factors within in-domain examples that contribute to the improv…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023

  4. arXiv:2305.11853

    cs.CL

    How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings

    Authors: Shuaichen Chang, Eric Fosler-Lussier

    Abstract: Large language models (LLMs) with in-context learning have demonstrated remarkable capability in the text-to-SQL task. Previous research has prompted LLMs with various demonstration-retrieval strategies and intermediate reasoning steps to enhance the performance of LLMs. However, those works often employ varied strategies when constructing the prompt text for text-to-SQL inputs, such as databases…

    Submitted 26 November, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Journal ref: NeurIPS 2023 Table Representation Learning Workshop

  5. arXiv:2211.08545

    cs.CV cs.CL

    MapQA: A Dataset for Question Answering on Choropleth Maps

    Authors: Shuaichen Chang, David Palzer, Jialin Li, Eric Fosler-Lussier, Ningchuan Xiao

    Abstract: Choropleth maps are a common visual representation for region-specific tabular data and are used in a number of different venues (newspapers, articles, etc). These maps are human-readable but are often challenging to deal with when trying to extract data for screen readers, analyses, or other related tasks. Recent research into Visual-Question Answering (VQA) has studied question answering on huma…

    Submitted 15 November, 2022; originally announced November 2022.

  6. arXiv:2204.05188

    cs.CL cs.SD eess.AS

    Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems

    Authors: Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang J. Kuo, Brian Kingsbury

    Abstract: Recent advances in End-to-End (E2E) Spoken Language Understanding (SLU) have been primarily due to effective pretraining of speech representations. One such pretraining paradigm is the distillation of semantic knowledge from state-of-the-art text-based models like BERT to speech encoder neural networks. This work is a step towards doing the same in a much more efficient and fine-grained manner whe…

    Submitted 1 July, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: 5 pages, 2 figures

  7. arXiv:2204.05183

    cs.CL cs.SD eess.AS

    Building an ASR Error Robust Spoken Virtual Patient System in a Highly Class-Imbalanced Scenario Without Speech Data

    Authors: Vishal Sunder, Prashant Serai, Eric Fosler-Lussier

    Abstract: A Virtual Patient (VP) is a powerful tool for training medical students to take patient histories, where responding to a diverse set of spoken questions is essential to simulate natural conversations with a student. The performance of such a Spoken Language Understanding system (SLU) can be adversely affected by both the presence of Automatic Speech Recognition (ASR) errors in the test data and a…

    Submitted 1 July, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: 5 pages, 3 figures

  8. arXiv:2204.05169

    cs.CL cs.AI

    Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding

    Authors: Vishal Sunder, Samuel Thomas, Hong-Kwang J. Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier

    Abstract: Dialog history plays an important role in spoken language understanding (SLU) performance in a dialog system. For end-to-end (E2E) SLU, previous work has used dialog history in text form, which makes the model dependent on a cascaded automatic speech recognizer (ASR). This rescinds the benefits of an E2E system which is intended to be compact and robust to ASR errors. In this paper, we propose a h…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: 5 pages, 1 figure

  9. arXiv:2112.06068

    cs.SD eess.AS

    Perceptual Loss with Recognition Model for Single-Channel Enhancement and Robust ASR

    Authors: Peter Plantinga, Deblin Bagchi, Eric Fosler-Lussier

    Abstract: Single-channel speech enhancement approaches do not always improve automatic recognition rates in the presence of noise, because they can introduce distortions unhelpful for recognition. Following a trend towards end-to-end training of sequential neural network models, several research groups have addressed this problem with joint training of front-end enhancement module with back-end recognition…

    Submitted 11 December, 2021; originally announced December 2021.

  10. arXiv:2103.12258

    cs.CL cs.LG

    Hallucination of speech recognition errors with sequence to sequence learning

    Authors: Prashant Serai, Vishal Sunder, Eric Fosler-Lussier

    Abstract: Automatic Speech Recognition (ASR) is an imperfect process that results in certain mismatches in ASR output text when compared to plain written text or transcriptions. When plain text data is to be used to train systems for spoken language understanding or ASR, a proven strategy to reduce said mismatch and prevent degradations, is to hallucinate what the ASR outputs would be given a gold transcrip…

    Submitted 31 March, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

    Comments: Submitted to IEEE/ACM Transactions on Audio Speech and Language Processing

  11. arXiv:2103.11029

    cs.CL cs.AI cs.HC

    TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between Corpora

    Authors: Denis Newman-Griffis, Venkatesh Sivaraman, Adam Perer, Eric Fosler-Lussier, Harry Hochheiser

    Abstract: Embeddings of words and concepts capture syntactic and semantic regularities of language; however, they have seen limited use as tools to study characteristics of different corpora and how they relate to one another. We introduce TextEssence, an interactive system designed to enable comparative analysis of corpora using embeddings. TextEssence includes visual, neighbor-based, and similarity-based…

    Submitted 19 March, 2021; originally announced March 2021.

    Comments: Accepted as a Systems Demonstration at NAACL-HLT 2021. Video demonstration at https://youtu.be/1xEEfsMwL0k

  12. Automated Coding of Under-Studied Medical Concept Domains: Linking Physical Activity Reports to the International Classification of Functioning, Disability, and Health

    Authors: Denis Newman-Griffis, Eric Fosler-Lussier

    Abstract: Linking clinical narratives to standardized vocabularies and coding systems is a key component of unlocking the information in medical text for analysis. However, many domains of medical concepts lack well-developed terminologies that can support effective coding of medical text. We present a framework for developing natural language processing (NLP) technologies for automated coding of under-stud…

    Submitted 10 March, 2021; v1 submitted 27 November, 2020; originally announced November 2020.

    Comments: Updated final version, published in Frontiers in Digital Health, https://doi.org/10.3389/fdgth.2021.620828. 34 pages (23 text + 11 references); 9 figures, 2 tables

    Journal ref: Frontiers in Digital Health, 3:620828 (2021)

  13. arXiv:2010.15090

    cs.CL cs.LG

    Handling Class Imbalance in Low-Resource Dialogue Systems by Combining Few-Shot Classification and Interpolation

    Authors: Vishal Sunder, Eric Fosler-Lussier

    Abstract: Utterance classification performance in low-resource dialogue systems is constrained by an inevitably high degree of data imbalance in class labels. We present a new end-to-end pairwise learning framework that is designed specifically to tackle this phenomenon by inducing a few-shot classification capability in the utterance representations and augmenting data through an interpolation of utterance…

    Submitted 28 October, 2020; originally announced October 2020.

    Comments: 5 pages, 4 figures, 3 tables

  14. arXiv:2003.01769

    eess.AS cs.CL cs.SD

    Phonetic Feedback for Speech Enhancement With and Without Parallel Speech Data

    Authors: Peter Plantinga, Deblin Bagchi, Eric Fosler-Lussier

    Abstract: While deep learning systems have gained significant ground in speech enhancement research, these systems have yet to make use of the full potential of deep learning systems to provide high-level feedback. In particular, phonetic feedback is rare in speech enhancement research even though it includes valuable top-down information. We use the technique of mimic loss to provide phonetic feedback to a…

    Submitted 3 March, 2020; originally announced March 2020.

    Comments: 4 pages + 1 page for references, accepted to ICASSP 2020

  15. arXiv:2003.01765

    eess.AS cs.CL cs.SD

    Towards Real-time Mispronunciation Detection in Kids' Speech

    Authors: Peter Plantinga, Eric Fosler-Lussier

    Abstract: Modern mispronunciation detection and diagnosis systems have seen significant gains in accuracy due to the introduction of deep learning. However, these systems have not been evaluated for the ability to be run in real-time, an important factor in applications that provide rapid feedback. In particular, the state-of-the-art uses bi-directional recurrent networks, where a uni-directional network ma…

    Submitted 3 March, 2020; originally announced March 2020.

    Comments: 6 pages + 1 page for references, accepted at ASRU 2019

  16. arXiv:1911.04427

    cs.CL cs.IR cs.LG

    Sequence-to-Set Semantic Tagging: End-to-End Multi-label Prediction using Neural Attention for Complex Query Reformulation and Automated Text Categorization

    Authors: Manirupa Das, Juanxi Li, Eric Fosler-Lussier, Simon Lin, Soheil Moosavinasab, Steve Rust, Yungui Huang, Rajiv Ramnath

    Abstract: Novel contexts may often arise in complex querying scenarios such as in evidence-based medicine (EBM) involving biomedical literature, that may not explicitly refer to entities or canonical concept forms occurring in any fact- or rule-based knowledge source such as an ontology like the UMLS. Moreover, hidden associations between candidate concepts meaningful in the current context, may not exist w…

    Submitted 11 November, 2019; originally announced November 2019.

    Comments: 8 pages, 4 figures, 1 table

  17. arXiv:1910.08270

    cs.CL cs.IR cs.LG

    Learning to Answer Subjective, Specific Product-Related Queries using Customer Reviews by Adversarial Domain Adaptation

    Authors: Manirupa Das, Zhen Wang, Evan Jaffe, Madhuja Chattopadhyay, Eric Fosler-Lussier, Rajiv Ramnath

    Abstract: Online customer reviews on large-scale e-commerce websites, represent a rich and varied source of opinion data, often providing subjective qualitative assessments of product usage that can help potential customers to discover features that meet their personal needs and preferences. Thus they have the potential to automatically answer specific queries about products, and to address the problems of…

    Submitted 22 October, 2019; v1 submitted 18 October, 2019; originally announced October 2019.

    Comments: 8 pages, 1 figure, 6 tables, added additional references to end of section 2.1, removed graphics from referenced works, added to argument in section 2.3 corrected typos, results unchanged

  18. arXiv:1910.00192

    cs.CL cs.AI

    Writing habits and telltale neighbors: analyzing clinical concept usage patterns with sublanguage embeddings

    Authors: Denis Newman-Griffis, Eric Fosler-Lussier

    Abstract: Natural language processing techniques are being applied to increasingly diverse types of electronic health records, and can benefit from in-depth understanding of the distinguishing characteristics of medical document types. We present a method for characterizing the usage patterns of clinical concepts among different document types, in order to capture semantic differences beyond the lexical lev…

    Submitted 1 October, 2019; originally announced October 2019.

    Comments: LOUHI 2019 (co-located with EMNLP)

  19. arXiv:1908.11302

    cs.CL cs.AI cs.IR

    HARE: a Flexible Highlighting Annotator for Ranking and Exploration

    Authors: Denis Newman-Griffis, Eric Fosler-Lussier

    Abstract: Exploration and analysis of potential data sources is a significant challenge in the application of NLP techniques to novel information domains. We describe HARE, a system for highlighting relevant information in document collections to support ranking and triage, which provides tools for post-processing and qualitative analysis for model development and tuning. We apply HARE to the use case of na…

    Submitted 29 August, 2019; originally announced August 2019.

    Comments: EMNLP 2019 Systems Demonstration. Online version including supplementary material

  20. arXiv:1904.04866

    cs.CL cs.LG

    Characterizing the impact of geometric properties of word embeddings on task performance

    Authors: Brendan Whitaker, Denis Newman-Griffis, Aparajita Haldar, Hakan Ferhatosmanoglu, Eric Fosler-Lussier

    Abstract: Analysis of word embedding properties to inform their use in downstream NLP tasks has largely been studied by assessing nearest neighbors. However, geometric properties of the continuous feature space contribute directly to the use of embedding features in downstream models, and are largely unexplored. We consider four properties of word embedding geometry, namely: position relative to the origin,…

    Submitted 9 April, 2019; originally announced April 2019.

    Comments: Appearing in the Third Workshop on Evaluating Vector Space Representations for NLP (RepEval 2019). 7 pages + references

  21. arXiv:1809.09756

    cs.SD eess.AS

    An Exploration of Mimic Architectures for Residual Network Based Spectral Mapping

    Authors: Peter Plantinga, Deblin Bagchi, Eric Fosler-Lussier

    Abstract: Spectral mapping uses a deep neural network (DNN) to map directly from noisy speech to clean speech. Our previous study found that the performance of spectral mapping improves greatly when using helpful cues from an acoustic model trained on clean speech. The mapper network learns to mimic the input favored by the spectral classifier and cleans the features accordingly. In this study, we explore t…

    Submitted 25 September, 2018; originally announced September 2018.

    Comments: Published in the IEEE 2018 Workshop on Spoken Language Technology (SLT 2018)

  22. arXiv:1807.03399

    cs.CL cs.AI

    Jointly Embedding Entities and Text with Distant Supervision

    Authors: Denis Newman-Griffis, Albert M. Lai, Eric Fosler-Lussier

    Abstract: Learning representations for knowledge base entities and concepts is becoming increasingly important for NLP applications. However, recent entity embedding methods have relied on structured resources that are expensive to create for new domains and corpora. We present a distantly-supervised method for jointly learning embeddings of entities and text from an unannotated corpus, using only a list of…

    Submitted 9 July, 2018; originally announced July 2018.

    Comments: 12 pages; Accepted to 3rd Workshop on Representation Learning for NLP (Repl4NLP 2018). Code at https://github.com/OSU-slatelab/JET

  23. arXiv:1803.09816

    cs.SD cs.CL eess.AS

    Spectral feature mapping with mimic loss for robust speech recognition

    Authors: Deblin Bagchi, Peter Plantinga, Adam Stiff, Eric Fosler-Lussier

    Abstract: For the task of speech enhancement, local learning objectives are agnostic to phonetic structures helpful for speech recognition. We propose to add a global criterion to ensure de-noised speech is useful for downstream tasks like ASR. We first train a spectral classifier on clean speech to predict senone labels. Then, the spectral classifier is joined with our speech enhancer as a noisy speech rec…

    Submitted 26 March, 2018; originally announced March 2018.

  24. arXiv:1706.02241

    cs.CL

    Insights into Analogy Completion from the Biomedical Domain

    Authors: Denis Newman-Griffis, Albert M Lai, Eric Fosler-Lussier

    Abstract: Analogy completion has been a popular task in recent years for evaluating the semantic properties of word embeddings, but the standard methodology makes a number of assumptions about analogies that do not always hold, either in recent benchmark datasets or when expanding into other domains. Through an analysis of analogies in the biomedical domain, we identify three assumptions: that of a Single A…

    Submitted 7 June, 2017; originally announced June 2017.

    Comments: Accepted to BioNLP 2017. (10 pages)

  25. arXiv:1705.08488

    cs.CL cs.AI

    Second-Order Word Embeddings from Nearest Neighbor Topological Features

    Authors: Denis Newman-Griffis, Eric Fosler-Lussier

    Abstract: We introduce second-order vector representations of words, induced from nearest neighborhood topological features in pre-trained contextual word embeddings. We then analyze the effects of using second-order embeddings as input features in two deep natural language processing models, for named entity recognition and recognizing textual entailment, as well as a linear model for paraphrase recognitio…

    Submitted 23 May, 2017; originally announced May 2017.

    Comments: Submitted to NIPS 2017. (8 pages + 4 reference)

  26. arXiv:1502.04049

    cs.CY cs.AI cs.CL

    How essential are unstructured clinical narratives and information fusion to clinical trial recruitment?

    Authors: Preethi Raghavan, James L. Chen, Eric Fosler-Lussier, Albert M. Lai

    Abstract: Electronic health records capture patient information using structured controlled vocabularies and unstructured narrative text. While structured data typically encodes lab values, encounters and medication lists, unstructured data captures the physician's interpretation of the patient's condition, prognosis, and response to therapeutic intervention. In this paper, we demonstrate that information e…

    Submitted 13 February, 2015; originally announced February 2015.

    Comments: AMIA TBI 2014, 6 pages