
Showing 1–33 of 33 results for author: Heck, L

Searching in archive cs.
  1. arXiv:2406.17168  [pdf, other]

    cs.LG cs.AI cs.RO

    Reinforcement Learning via Auxiliary Task Distillation

    Authors: Abhinav Narayan Harish, Larry Heck, Josiah P. Hanna, Zsolt Kira, Andrew Szot

    Abstract: We present Reinforcement Learning via Auxiliary Task Distillation (AuxDistill), a new method that enables reinforcement learning (RL) to perform long-horizon robot control problems by distilling behaviors from auxiliary RL tasks. AuxDistill achieves this by concurrently carrying out multi-task RL with auxiliary tasks, which are easier to learn and relevant to the main task. A weighted distillation…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2406.08398  [pdf, other]

    cs.CL cs.AI cs.LG

    cPAPERS: A Dataset of Situated and Multimodal Interactive Conversations in Scientific Papers

    Authors: Anirudh Sundar, Jin Xu, William Gay, Christopher Richardson, Larry Heck

    Abstract: An emerging area of research in situated and multimodal interactive conversations (SIMMC) includes interactions in scientific papers. Since scientific papers are primarily composed of text, equations, figures, and tables, SIMMC methods must be developed specifically for each component to support the depth of inquiry and interactions required by research scientists. This work introduces Conversatio…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 14 pages, 1 figure

  3. arXiv:2404.12580  [pdf, other]

    cs.CL cs.AI cs.LG

    iTBLS: A Dataset of Interactive Conversations Over Tabular Information

    Authors: Anirudh Sundar, Christopher Richardson, William Gay, Larry Heck

    Abstract: This paper introduces Interactive Tables (iTBLS), a dataset of interactive conversations situated in tables from scientific articles. This dataset is designed to facilitate human-AI collaborative problem-solving through AI-powered multi-task tabular capabilities. In contrast to prior work that models interactions as factoid QA or procedure synthesis, iTBLS broadens the scope of interactions to inc…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 14 pages, 4 figures

  4. arXiv:2403.14457  [pdf, other]

    cs.CL cs.IR cs.LG

    gTBLS: Generating Tables from Text by Conditional Question Answering

    Authors: Anirudh Sundar, Christopher Richardson, Larry Heck

    Abstract: Distilling large, unstructured text into a structured, condensed form such as tables is an open research problem. One of the primary challenges in automatically generating tables is ensuring their syntactic validity. Prior approaches address this challenge by including additional parameters in the Transformer's attention mechanism to attend to specific rows and column headers. In contrast to this…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: 12 pages, 1 figure

  5. arXiv:2403.05045  [pdf, other]

    cs.CL cs.AI cs.LG

    Are Human Conversations Special? A Large Language Model Perspective

    Authors: Toshish Jawale, Chaitanya Animesh, Sekhar Vallath, Kartik Talamadupula, Larry Heck

    Abstract: This study analyzes changes in the attention mechanisms of large language models (LLMs) when used to understand natural conversations between humans (human-human). We analyze three use cases of LLMs: interactions over web content, code, and mathematical texts. By analyzing attention distance, dispersion, and interdependency across these domains, we highlight the unique challenges posed by conversa…

    Submitted 7 March, 2024; originally announced March 2024.

  6. arXiv:2402.11035  [pdf, other]

    cs.CL cs.IR

    Retrieval-Augmented Generation: Is Dense Passage Retrieval Retrieving?

    Authors: Benjamin Reichman, Larry Heck

    Abstract: Dense passage retrieval (DPR) is the first step in the retrieval augmented generation (RAG) paradigm for improving the performance of large language models (LLM). DPR fine-tunes pre-trained networks to enhance the alignment of the embeddings between queries and relevant textual data. A deeper understanding of DPR fine-tuning will be required to fundamentally unlock the full potential of this appro…

    Submitted 16 April, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  7. arXiv:2309.10015  [pdf, other]

    cs.CL cs.AI cs.LG

    SYNDICOM: Improving Conversational Commonsense with Error-Injection and Natural Language Feedback

    Authors: Christopher Richardson, Anirudh Sundar, Larry Heck

    Abstract: Commonsense reasoning is a critical aspect of human communication. Despite recent advances in conversational AI driven by large language models, commonsense reasoning remains a challenging task. In this work, we introduce SYNDICOM - a method for improving commonsense in dialogue response generation. SYNDICOM consists of two components. The first component is a dataset composed of commonsense dialo…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: Published at SigDial 2023, Number 129

    Report number: 129

  8. arXiv:2303.12024  [pdf, other]

    cs.CL cs.AI

    cTBLS: Augmenting Large Language Models with Conversational Tables

    Authors: Anirudh S Sundar, Larry Heck

    Abstract: Optimizing accuracy and performance while eliminating hallucinations of open-domain conversational large language models (LLMs) is an open research challenge. A particularly promising direction is to augment and ground LLMs with information from structured sources. This paper introduces Conversational Tables (cTBLS), a three-step architecture to retrieve and generate dialogue responses grounded on…

    Submitted 30 May, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

  9. arXiv:2302.07926  [pdf, other]

    cs.CL cs.LG

    Commonsense Reasoning for Conversational AI: A Survey of the State of the Art

    Authors: Christopher Richardson, Larry Heck

    Abstract: Large, transformer-based pretrained language models like BERT, GPT, and T5 have demonstrated a deep understanding of contextual semantics and language syntax. Their success has enabled significant advances in conversational AI, including the development of open-dialogue systems capable of coherent, salient conversations which can answer questions, chat casually, and complete tasks. However, state-…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: Accepted to Workshop on Knowledge Augmented Methods for Natural Language Processing, in conjunction with AAAI 2023

  10. arXiv:2210.16460  [pdf, ps, other]

    math.MG cs.DS

    The Vector Balancing Constant for Zonotopes

    Authors: Laurel Heck, Victor Reis, Thomas Rothvoss

    Abstract: The vector balancing constant $\mathrm{vb}(K,Q)$ of two symmetric convex bodies $K,Q$ is the minimum $r \geq 0$ so that any number of vectors from $K$ can be balanced into an $r$-scaling of $Q$. A question raised by Schechtman is whether for any zonotope $K \subseteq \mathbb{R}^d$ one has $\mathrm{vb}(K,K) \lesssim \sqrt{d}$. Intuitively, this asks whether a natural geometric generalization of Spe…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: 20 pages

  11. arXiv:2205.06907  [pdf, other]

    cs.LG

    Multimodal Conversational AI: A Survey of Datasets and Approaches

    Authors: Anirudh Sundar, Larry Heck

    Abstract: As humans, we experience the world with all our senses or modalities (sound, sight, touch, smell, and taste). We use these modalities, particularly sight and touch, to convey and interpret specific meanings. Multimodal expressions are central to conversations; a rich set of modalities amplify and often compensate for each other. A multimodal conversational AI system answers questions, fulfills tas…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: 17 pages, 1 figure, to be published in the 4th Workshop on NLP for Conversational AI

  12. arXiv:2111.04407  [pdf, other]

    cs.LO

    Gradient-Descent for Randomized Controllers under Partial Observability

    Authors: Linus Heck, Jip Spel, Sebastian Junges, Joshua Moerman, Joost-Pieter Katoen

    Abstract: Randomization is a powerful technique to create robust controllers, in particular in partially observable settings. The degrees of randomization have a significant impact on the system performance, yet they are intricate to get right. The use of synthesis algorithms for parametric Markov chains (pMCs) is a promising direction to support the design process of such controllers. This paper shows how…

    Submitted 8 November, 2021; originally announced November 2021.

    Comments: Technical Report VMCAI22 submission

  13. arXiv:2103.16057  [pdf, other]

    cs.CL cs.LG

    Grounding Open-Domain Instructions to Automate Web Support Tasks

    Authors: Nancy Xu, Sam Masling, Michael Du, Giovanni Campagna, Larry Heck, James Landay, Monica S Lam

    Abstract: Grounding natural language instructions on the web to perform previously unseen tasks enables accessibility and automation. We introduce a task and dataset to train AI agents from open-domain, step-by-step instructions originally written for people. We build RUSS (Rapid Universal Support Service) to tackle this problem. RUSS consists of two models: First, a BERT-LSTM with pointers parses instructi…

    Submitted 4 April, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: To be published in NAACL 2021

  14. arXiv:2011.12340  [pdf, other]

    cs.AI

    mForms: Multimodal Form-Filling with Question Answering

    Authors: Larry Heck, Simon Heck, Anirudh Sundar

    Abstract: This paper presents a new approach to form-filling by reformulating the task as multimodal natural language Question Answering (QA). The reformulation is achieved by first translating the elements on the GUI form (text fields, buttons, icons, etc.) to natural language questions, where these questions capture the element's multimodal semantics. After a match is determined between the form element (…

    Submitted 23 March, 2024; v1 submitted 24 November, 2020; originally announced November 2020.

    Comments: 5 pages, 6 figures, 4 tables

  15. arXiv:1912.11040  [pdf, ps, other]

    eess.AS cs.LG cs.SD eess.SP stat.ML

    End-to-End Training of a Large Vocabulary End-to-End Speech Recognition System

    Authors: Chanwoo Kim, Sungsoo Kim, Kwangyoun Kim, Mehul Kumar, Jiyeon Kim, Kyungmin Lee, Changwoo Han, Abhinav Garg, Eunhyang Kim, Minkyoo Shin, Shatrughan Singh, Larry Heck, Dhananjaya Gowda

    Abstract: In this paper, we present an end-to-end training framework for building state-of-the-art end-to-end speech recognition systems. Our training system utilizes a cluster of Central Processing Units (CPUs) and Graphics Processing Units (GPUs). The entire data reading, large scale data augmentation, neural network parameter updates are all performed "on-the-fly". We use vocal tract length perturbation […

    Submitted 21 December, 2019; originally announced December 2019.

    Comments: Accepted and presented at the ASRU 2019 conference

  16. arXiv:1904.00781  [pdf, other]

    cs.CV cs.AI stat.ML

    RILOD: Near Real-Time Incremental Learning for Object Detection at the Edge

    Authors: Dawei Li, Serafettin Tasci, Shalini Ghosh, Jingwen Zhu, Junting Zhang, Larry Heck

    Abstract: Object detection models shipped with camera-equipped edge devices cannot cover the objects of interest for every user. Therefore, the incremental learning capability is a critical feature for a robust and personalized object detection system that many applications would rely on. In this paper, we present an efficient yet practical system, RILOD, to incrementally train an existing object detection…

    Submitted 23 September, 2019; v1 submitted 26 March, 2019; originally announced April 2019.

    Comments: Camera-ready for ACM/IEEE SEC 2019

  17. arXiv:1903.07864  [pdf, other]

    cs.CV cs.LG

    Class-incremental Learning via Deep Model Consolidation

    Authors: Junting Zhang, Jie Zhang, Shalini Ghosh, Dawei Li, Serafettin Tasci, Larry Heck, Heming Zhang, C. -C. Jay Kuo

    Abstract: Deep neural networks (DNNs) often suffer from "catastrophic forgetting" during incremental learning (IL) --- an abrupt degradation of performance on the original set of classes when the training objective is adapted to a newly added set of classes. Existing IL approaches tend to produce a model that is biased towards either the old classes or new classes, unless with the help of exemplars of the o…

    Submitted 15 January, 2020; v1 submitted 19 March, 2019; originally announced March 2019.

    Comments: WACV 2020 camera-ready

  18. arXiv:1902.09818  [pdf, other]

    cs.CV

    Generative Visual Dialogue System via Adaptive Reasoning and Weighted Likelihood Estimation

    Authors: Heming Zhang, Shalini Ghosh, Larry Heck, Stephen Walsh, Junting Zhang, Jie Zhang, C. -C. Jay Kuo

    Abstract: The key challenge of generative Visual Dialogue (VD) systems is to respond to human queries with informative answers in natural and contiguous conversation flow. Traditional Maximum Likelihood Estimation (MLE)-based methods only learn from positive responses but ignore the negative responses, and consequently tend to yield safe or generic responses. To address this issue, we propose a novel traini…

    Submitted 13 August, 2019; v1 submitted 26 February, 2019; originally announced February 2019.

    Comments: IJCAI 2019

  19. arXiv:1902.03751  [pdf, other]

    cs.CV

    Taking a HINT: Leveraging Explanations to Make Vision and Language Models More Grounded

    Authors: Ramprasaath R. Selvaraju, Stefan Lee, Yilin Shen, Hongxia Jin, Shalini Ghosh, Larry Heck, Dhruv Batra, Devi Parikh

    Abstract: Many vision and language models suffer from poor visual grounding - often falling back on easy-to-learn language priors rather than basing their decisions on visual concepts in the image. In this work, we propose a generic approach called Human Importance-aware Network Tuning (HINT) that effectively leverages human demonstrations to improve visual grounding. HINT encourages deep networks to be sen…

    Submitted 28 October, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

    Comments: Published at ICCV'2019

    Journal ref: The IEEE International Conference on Computer Vision (ICCV) 2019

  20. arXiv:1804.06512  [pdf, other]

    cs.CL

    Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems

    Authors: Bing Liu, Gokhan Tur, Dilek Hakkani-Tur, Pararth Shah, Larry Heck

    Abstract: In this work, we present a hybrid learning method for training task-oriented dialogue systems through online user interactions. Popular methods for learning task-oriented dialogues include applying reinforcement learning with user feedback on supervised pre-training models. The efficiency of such a learning method may suffer from the mismatch of dialogue state distribution between offline training and o…

    Submitted 17 April, 2018; originally announced April 2018.

    Comments: To appear in NAACL 2018 as a long paper

  21. arXiv:1801.04871  [pdf, other]

    cs.AI cs.CL

    Building a Conversational Agent Overnight with Dialogue Self-Play

    Authors: Pararth Shah, Dilek Hakkani-Tür, Gokhan Tür, Abhinav Rastogi, Ankur Bapna, Neha Nayak, Larry Heck

    Abstract: We propose Machines Talking To Machines (M2M), a framework combining automation and crowdsourcing to rapidly bootstrap end-to-end dialogue agents for goal-oriented dialogues in arbitrary domains. M2M scales to new tasks with just a task schema and an API client from the dialogue system developer, but it is also customizable to cater to task-specific interactions. Compared to the Wizard-of-Oz appro…

    Submitted 15 January, 2018; originally announced January 2018.

    Comments: 11 pages, 4 figures

  22. arXiv:1712.10224  [pdf, other]

    cs.CL

    Scalable Multi-Domain Dialogue State Tracking

    Authors: Abhinav Rastogi, Dilek Hakkani-Tur, Larry Heck

    Abstract: Dialogue state tracking (DST) is a key component of task-oriented dialogue systems. DST estimates the user's goal at each user turn given the interaction until then. State of the art approaches for state tracking rely on deep learning methods, and represent dialogue state as a distribution over all possible slot values for each slot present in the ontology. Such a representation is not scalable wh…

    Submitted 2 January, 2018; v1 submitted 29 December, 2017; originally announced December 2017.

    Comments: Published at ASRU-17. New version has updated results in Tables 1, 2 and 3 corresponding to the datasets released on github.com/google-research-datasets/simulated-dialogue

  23. arXiv:1712.08266  [pdf, other]

    cs.AI

    Federated Control with Hierarchical Multi-Agent Deep Reinforcement Learning

    Authors: Saurabh Kumar, Pararth Shah, Dilek Hakkani-Tur, Larry Heck

    Abstract: We present a framework combining hierarchical and multi-agent deep reinforcement learning approaches to solve coordination problems among a multitude of agents using a semi-decentralized model. The framework extends the multi-agent learning setup by introducing a meta-controller that guides the communication between agent pairs, enabling agents to focus on communicating with only one other agent a…

    Submitted 21 December, 2017; originally announced December 2017.

    Comments: Hierarchical Reinforcement Learning Workshop at the 31st Conference on Neural Information Processing Systems

  24. arXiv:1711.10712  [pdf, other]

    cs.CL

    End-to-End Optimization of Task-Oriented Dialogue Model with Deep Reinforcement Learning

    Authors: Bing Liu, Gokhan Tur, Dilek Hakkani-Tur, Pararth Shah, Larry Heck

    Abstract: In this paper, we present a neural network based task-oriented dialogue system that can be optimized end-to-end with deep reinforcement learning (RL). The system is able to track dialogue state, interface with knowledge bases, and incorporate query results into the agent's responses to successfully complete task-oriented dialogues. Dialogue policy learning is conducted with a hybrid supervised and dee…

    Submitted 30 November, 2017; v1 submitted 29 November, 2017; originally announced November 2017.

  25. arXiv:1707.02363  [pdf, other]

    cs.AI cs.CL

    Towards Zero-Shot Frame Semantic Parsing for Domain Scaling

    Authors: Ankur Bapna, Gokhan Tur, Dilek Hakkani-Tur, Larry Heck

    Abstract: State-of-the-art slot filling models for goal-oriented human/machine conversational language understanding systems rely on deep learning methods. While multi-task training of such models alleviates the need for large in-domain annotated datasets, bootstrapping a semantic parsing model for a new domain using only the semantic frame, such as the back-end API or knowledge graph schema, is still one o…

    Submitted 7 July, 2017; originally announced July 2017.

    Comments: 4 pages + 1 references

  26. arXiv:1706.04486  [pdf, other]

    cs.SD cs.AI

    Learning and Evaluating Musical Features with Deep Autoencoders

    Authors: Mason Bretan, Sageev Oore, Doug Eck, Larry Heck

    Abstract: In this work we describe and evaluate methods to learn musical embeddings. Each embedding is a vector that represents four contiguous beats of music and is derived from a symbolic representation. We consider autoencoding-based methods including denoising autoencoders, and context reconstruction, and evaluate the resulting embeddings on a forward prediction and a classification task.

    Submitted 15 June, 2017; v1 submitted 14 June, 2017; originally announced June 2017.

  27. arXiv:1705.03455  [pdf, other]

    cs.CL cs.AI cs.LG

    Sequential Dialogue Context Modeling for Spoken Language Understanding

    Authors: Ankur Bapna, Gokhan Tur, Dilek Hakkani-Tur, Larry Heck

    Abstract: Spoken Language Understanding (SLU) is a key component of goal oriented dialogue systems that would parse user utterances into semantic frame representations. Traditionally SLU does not utilize the dialogue history beyond the previous system turn and contextual ambiguities are resolved by the downstream components. In this paper, we explore novel approaches for modeling dialogue context in a recur…

    Submitted 7 July, 2017; v1 submitted 8 May, 2017; originally announced May 2017.

    Comments: 8 + 2 pages, Updated 10/17: Updated typos in abstract, Updated 07/07: Updated Title, abstract and few minor changes

  28. arXiv:1612.03789  [pdf, other]

    cs.SD cs.AI cs.LG

    A Unit Selection Methodology for Music Generation Using Deep Neural Networks

    Authors: Mason Bretan, Gil Weinberg, Larry Heck

    Abstract: Several methods exist for a computer to generate music based on data including Markov chains, recurrent neural networks, recombinancy, and grammars. We explore the use of unit selection and concatenation as a means of generating music using a procedure based on ranking, where we consider a unit to be a variable-length number of measures of music. We first examine whether a unit selection method,…

    Submitted 12 December, 2016; originally announced December 2016.

  29. arXiv:1606.07967  [pdf, other]

    cs.CL

    Leveraging Semantic Web Search and Browse Sessions for Multi-Turn Spoken Dialog Systems

    Authors: Lu Wang, Larry Heck, Dilek Hakkani-Tur

    Abstract: Training statistical dialog models in spoken dialog systems (SDS) requires large amounts of annotated data. The lack of scalable methods for data mining and annotation poses a significant hurdle for state-of-the-art statistical dialog managers. This paper presents an approach that directly leverages billions of web search and browse sessions to overcome this hurdle. The key insight is that task com…

    Submitted 25 June, 2016; originally announced June 2016.

    Comments: ICASSP 2014

  30. arXiv:1604.00117  [pdf, other]

    cs.CL

    Domain Adaptation of Recurrent Neural Networks for Natural Language Understanding

    Authors: Aaron Jaech, Larry Heck, Mari Ostendorf

    Abstract: The goal of this paper is to use multi-task learning to efficiently scale slot filling models for natural language understanding to handle multiple target tasks or domains. The key to scalability is reducing the amount of training data needed to learn a model for a new task. The proposed multi-task model delivers better performance with less data by leveraging patterns that it learns from the othe…

    Submitted 9 August, 2016; v1 submitted 31 March, 2016; originally announced April 2016.

    Comments: Interspeech 2016

  31. arXiv:1602.06291  [pdf, other]

    cs.CL

    Contextual LSTM (CLSTM) models for Large scale NLP tasks

    Authors: Shalini Ghosh, Oriol Vinyals, Brian Strope, Scott Roy, Tom Dean, Larry Heck

    Abstract: Documents exhibit sequential structure at multiple levels of abstraction (e.g., sentences, paragraphs, sections). These abstractions constitute a natural hierarchy for representing the context in which to infer the meaning of words and larger fragments of text. In this paper, we present CLSTM (Contextual LSTM), an extension of the recurrent neural network LSTM (Long-Short Term Memory) model, where…

    Submitted 31 May, 2016; v1 submitted 19 February, 2016; originally announced February 2016.

  32. arXiv:1504.07678  [pdf, other]

    cs.CL

    Leveraging Deep Neural Networks and Knowledge Graphs for Entity Disambiguation

    Authors: Hongzhao Huang, Larry Heck, Heng Ji

    Abstract: Entity Disambiguation aims to link mentions of ambiguous entities to a knowledge base (e.g., Wikipedia). Modeling topical coherence is crucial for this task based on the assumption that information from the same semantic context tends to belong to the same topic. This paper presents a novel deep semantic relatedness model (DSRM) based on deep neural networks (DNN) and semantic knowledge graphs (KG…

    Submitted 28 April, 2015; originally announced April 2015.

  33. arXiv:1401.0509  [pdf, other]

    cs.CL cs.LG

    Zero-Shot Learning for Semantic Utterance Classification

    Authors: Yann N. Dauphin, Gokhan Tur, Dilek Hakkani-Tur, Larry Heck

    Abstract: We propose a novel zero-shot learning method for semantic utterance classification (SUC). It learns a classifier $f: X \to Y$ for problems where none of the semantic categories $Y$ are present in the training set. The framework uncovers the link between categories and utterances using a semantic space. We show that this semantic space can be learned by deep neural networks trained on large amounts…

    Submitted 7 March, 2014; v1 submitted 20 December, 2013; originally announced January 2014.