-
"What do others think?": Task-Oriented Conversational Modeling with Subjective Knowledge
Authors:
Chao Zhao,
Spandana Gella,
Seokhwan Kim,
Di Jin,
Devamanyu Hazarika,
Alexandros Papangelis,
Behnam Hedayatnia,
Mahdi Namazifar,
Yang Liu,
Dilek Hakkani-Tur
Abstract:
Task-oriented Dialogue (TOD) Systems aim to build dialogue systems that assist users in accomplishing specific goals, such as booking a hotel or a restaurant. Traditional TODs rely on domain-specific APIs/DBs or external factual knowledge to generate responses, which cannot accommodate subjective user requests (e.g., "Is the WIFI reliable?" or "Does the restaurant have a good atmosphere?"). To address this issue, we propose a novel task of subjective-knowledge-based TOD (SK-TOD). We also propose the first corresponding dataset, which contains subjective knowledge-seeking dialogue contexts and manually annotated responses grounded in subjective knowledge sources. When evaluated with existing TOD approaches, we find that this task poses new challenges such as aggregating diverse opinions from multiple knowledge snippets. We hope this task and dataset can promote further research on TOD and subjective content understanding. The code and the dataset are available at https://github.com/alexa/dstc11-track5.
Submitted 2 October, 2023; v1 submitted 20 May, 2023;
originally announced May 2023.
-
Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information
Authors:
Yen-Ting Lin,
Alexandros Papangelis,
Seokhwan Kim,
Sungjin Lee,
Devamanyu Hazarika,
Mahdi Namazifar,
Di Jin,
Yang Liu,
Dilek Hakkani-Tur
Abstract:
This work focuses on in-context data augmentation for intent detection. Having found that augmentation via in-context prompting of large pre-trained language models (PLMs) alone does not improve performance, we introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that can measure the usefulness of a datapoint for training a model. Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints - utterances that correspond to given intents. It then employs intent-aware filtering, based on PVI, to remove datapoints that are not helpful to the downstream intent classifier. Our method is thus able to leverage the expressive power of large language models to produce diverse training data. Empirical results demonstrate that our method can produce synthetic training data that achieve state-of-the-art performance on three challenging intent detection datasets under few-shot settings (1.28% absolute improvement in 5-shot and 1.18% absolute in 10-shot, on average) and perform on par with the state-of-the-art in full-shot settings (within 0.01% absolute, on average).
Submitted 10 February, 2023;
originally announced February 2023.
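The PVI-based filtering step described in this abstract can be illustrated with a minimal sketch. It assumes the commonly cited definition PVI(x -> y) = -log2 g'[null](y) + log2 g[x](y), where g is a model fine-tuned on inputs and g' is a model fine-tuned with a null input. The function names, the tuple layout, and the per-intent thresholds are hypothetical and are not taken from the paper's code.

```python
import math


def pvi(p_with_input: float, p_null: float) -> float:
    """Pointwise V-information in bits.

    p_with_input: probability of the gold intent given the utterance.
    p_null: probability of the gold intent given an empty (null) input.
    """
    return math.log2(p_with_input) - math.log2(p_null)


def filter_by_pvi(candidates, thresholds, default_threshold=0.0):
    """Keep synthetic utterances whose PVI meets the threshold for their intent.

    candidates: iterable of (utterance, intent, p_with_input, p_null).
    thresholds: hypothetical per-intent cutoffs (an assumption of this sketch).
    """
    kept = []
    for utterance, intent, p_x, p_null in candidates:
        cutoff = thresholds.get(intent, default_threshold)
        if pvi(p_x, p_null) >= cutoff:
            kept.append((utterance, intent))
    return kept


# Made-up probabilities, for illustration only.
synthetic = [
    ("book me a table for two", "book_restaurant", 0.80, 0.10),
    ("what is the weather", "book_restaurant", 0.05, 0.10),
]
print(filter_by_pvi(synthetic, {"book_restaurant": 1.0}))
```

A datapoint with low or negative PVI is one the classifier predicts no better with the utterance than without it, which is why such points are dropped here.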
-
PLACES: Prompting Language Models for Social Conversation Synthesis
Authors:
Maximillian Chen,
Alexandros Papangelis,
Chenyang Tao,
Seokhwan Kim,
Andy Rosenbaum,
Yang Liu,
Zhou Yu,
Dilek Hakkani-Tur
Abstract:
Collecting high quality conversational data can be very expensive for most applications and infeasible for others due to privacy, ethical, or similar concerns. A promising direction to tackle this problem is to generate synthetic dialogues by prompting large language models. In this work, we use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting. We perform several thorough evaluations of our synthetic conversations compared to human-collected conversations. This includes various dimensions of conversation quality with human evaluation directly on the synthesized conversations, and interactive human evaluation of chatbots fine-tuned on the synthetically generated dataset. We additionally demonstrate that this prompting approach is generalizable to multi-party conversations, providing potential to create new synthetic data for multi-party tasks. Our synthetic multi-party conversations were rated more favorably across all measured dimensions compared to conversation excerpts sampled from a human-collected multi-party dataset.
Submitted 16 February, 2023; v1 submitted 7 February, 2023;
originally announced February 2023.
-
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding
Authors:
Maximillian Chen,
Alexandros Papangelis,
Chenyang Tao,
Andy Rosenbaum,
Seokhwan Kim,
Yang Liu,
Zhou Yu,
Dilek Hakkani-Tur
Abstract:
Dialogue understanding tasks often necessitate abundant annotated data to achieve good performance and that presents challenges in low-resource settings. To alleviate this barrier, we explore few-shot data augmentation for dialogue understanding by prompting large pre-trained language models and present a novel approach that iterates on augmentation quality by applying weakly-supervised filters. We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue. Models fine-tuned on our augmented data mixed with few-shot ground truth data are able to approach or surpass existing state-of-the-art performance on both datasets. For DailyDialog specifically, using 10% of the ground truth data we outperform the current state-of-the-art model which uses 100% of the data.
Submitted 2 November, 2022; v1 submitted 25 October, 2022;
originally announced October 2022.
-
Knowledge-Grounded Conversational Data Augmentation with Generative Conversational Networks
Authors:
Yen-Ting Lin,
Alexandros Papangelis,
Seokhwan Kim,
Dilek Hakkani-Tur
Abstract:
While rich, open-domain textual data are generally available and may include interesting phenomena (humor, sarcasm, empathy, etc.), most are designed for language processing tasks, and are usually in a non-conversational format. In this work, we take a step towards automatically generating conversational data using Generative Conversational Networks, aiming to benefit from the breadth of available language and knowledge data, and train open domain social conversational agents. We evaluate our approach on conversations with and without knowledge on the Topical Chat dataset using automatic metrics and human evaluators. Our results show that for conversations without knowledge grounding, GCN can generalize from the seed data, producing novel conversations that are less relevant but more engaging, and for knowledge-grounded conversations, it can produce more knowledge-focused, fluent, and engaging conversations. Specifically, we show that for open-domain conversations with 10% of seed data, our approach performs close to the baseline that uses 100% of the data, while for knowledge-grounded conversations, it achieves the same using only 1% of the data, on human ratings of engagingness, fluency, and relevance.
Submitted 22 July, 2022;
originally announced July 2022.
-
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
Authors:
Sebastian Gehrmann,
Abhik Bhattacharjee,
Abinaya Mahendiran,
Alex Wang,
Alexandros Papangelis,
Aman Madaan,
Angelina McMillan-Major,
Anna Shvets,
Ashish Upadhyay,
Bingsheng Yao,
Bryan Wilie,
Chandra Bhagavatula,
Chaobin You,
Craig Thomson,
Cristina Garbacea,
Dakuo Wang,
Daniel Deutsch,
Deyi Xiong,
Di Jin,
Dimitra Gkatzia,
Dragomir Radev,
Elizabeth Clark,
Esin Durmus,
Faisal Ladhak,
Filip Ginter,
et al. (52 additional authors not shown)
Abstract:
Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables the comparison on equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure for dataset, model, and metric developers to benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
Submitted 24 June, 2022; v1 submitted 22 June, 2022;
originally announced June 2022.
-
Understanding How People Rate Their Conversations
Authors:
Alexandros Papangelis,
Nicole Chartier,
Pankaj Rajan,
Julia Hirschberg,
Dilek Hakkani-Tur
Abstract:
User ratings play a significant role in spoken dialogue systems. Typically, such ratings tend to be averaged across all users and then utilized as feedback to improve the system or personalize its behavior. While this method can be useful to understand broad, general issues with the system and its behavior, it does not take into account differences between users that affect their ratings. In this work, we conduct a study to better understand how people rate their interactions with conversational agents. One macro-level characteristic that has been shown to correlate with how people perceive their inter-personal communication is personality. We specifically focus on agreeableness and extraversion as variables that may explain variation in ratings and therefore provide a more meaningful signal for training or personalization. In order to elicit those personality traits during an interaction with a conversational agent, we designed and validated a fictional story, grounded in prior work in psychology. We then implemented the story into an experimental conversational agent that allowed users to opt-in to hearing the story. Our results suggest that for human-conversational agent interactions, extraversion may play a role in user ratings, but more data is needed to determine if the relationship is significant. Agreeableness, on the other hand, plays a statistically significant role in conversation ratings: users who are more agreeable are more likely to provide a higher rating for their interaction. In addition, we found that users who opted to hear the story were, in general, more likely to rate their conversational experience higher than those who did not.
Submitted 31 May, 2022;
originally announced June 2022.
-
What is wrong with you?: Leveraging User Sentiment for Automatic Dialog Evaluation
Authors:
Sarik Ghazarian,
Behnam Hedayatnia,
Alexandros Papangelis,
Yang Liu,
Dilek Hakkani-Tur
Abstract:
Accurate automatic evaluation metrics for open-domain dialogs are in high demand. Existing model-based metrics for system response evaluation are trained on human annotated data, which is cumbersome to collect. In this work, we propose to use information that can be automatically extracted from the next user utterance, such as its sentiment or whether the user explicitly ends the conversation, as a proxy to measure the quality of the previous system response. This allows us to train on a massive set of dialogs with weak supervision, without requiring manual system turn quality annotations. Experiments show that our model is comparable to models trained on human annotated data. Furthermore, our model generalizes across both spoken and written open-domain dialog corpora collected from real and paid users.
Submitted 25 March, 2022;
originally announced March 2022.
-
User Response and Sentiment Prediction for Automatic Dialogue Evaluation
Authors:
Sarik Ghazarian,
Behnam Hedayatnia,
Alexandros Papangelis,
Yang Liu,
Dilek Hakkani-Tur
Abstract:
Automatic evaluation is beneficial for open-domain dialog system development. However, standard word-overlap metrics (BLEU, ROUGE) do not correlate well with human judgements of open-domain dialog systems. In this work we propose to use the sentiment of the next user utterance for turn or dialog level evaluation. Specifically we propose three methods: one that predicts the next sentiment directly, and two others that predict the next user utterance using an utterance or a feedback generator model and then classify its sentiment. Experiments show our model outperforming existing automatic evaluation metrics on both written and spoken open-domain dialogue datasets.
Submitted 16 February, 2022; v1 submitted 16 November, 2021;
originally announced November 2021.
-
Training Conversational Agents with Generative Conversational Networks
Authors:
Yen-Ting Lin,
Alexandros Papangelis,
Seokhwan Kim,
Dilek Hakkani-Tur
Abstract:
Rich, open-domain textual data available on the web resulted in great advancements for language processing. However, while that data may be suitable for language processing tasks, they are mostly non-conversational, lacking many phenomena that appear in human interactions and this is one of the reasons why we still have many unsolved challenges in conversational AI. In this work, we attempt to address this by using Generative Conversational Networks to automatically generate data and train social conversational agents. We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
Submitted 15 October, 2021;
originally announced October 2021.
-
"How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations
Authors:
Seokhwan Kim,
Yang Liu,
Di Jin,
Alexandros Papangelis,
Karthik Gopalakrishnan,
Behnam Hedayatnia,
Dilek Hakkani-Tur
Abstract:
Most prior work in dialogue modeling has been on written conversations mostly because of existing data sets. However, written dialogues are not sufficient to fully capture the nature of spoken conversations as well as the potential speech recognition errors in practical spoken dialogue systems. This work presents a new benchmark on spoken task-oriented conversations, which is intended to study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling. We report that the existing state-of-the-art models trained on written conversations are not performing well on our spoken data, as expected. Furthermore, we observe improvements in task performances when leveraging n-best speech recognition hypotheses such as by combining predictions based on individual hypotheses. Our data set enables speech-based benchmarking of task-oriented dialogue systems.
Submitted 28 September, 2021;
originally announced September 2021.
-
Generative Conversational Networks
Authors:
Alexandros Papangelis,
Karthik Gopalakrishnan,
Aishwarya Padmakumar,
Seokhwan Kim,
Gokhan Tur,
Dilek Hakkani-Tur
Abstract:
Inspired by recent work in meta-learning and generative teaching networks, we propose a framework called Generative Conversational Networks, in which conversational agents learn to generate their own labelled training data (given some seed data) and then train themselves from that data to perform a given task. We use reinforcement learning to optimize the data generation process where the reward signal is the agent's performance on the task. The task can be any language-related task, from intent detection to full task-oriented conversations. In this work, we show that our approach is able to generalise from seed data and performs well in limited data and limited computation settings, with significant gains for intent detection and slot tagging across multiple datasets: ATIS, TOD, SNIPS, and Restaurants8k. We show an average improvement of 35% in intent detection and 21% in slot tagging over a baseline model trained from the seed data. We also conduct an analysis of the novelty of the generated data and provide generated examples for intent detection, slot tagging, and non-goal oriented conversations.
Submitted 16 July, 2021; v1 submitted 15 June, 2021;
originally announced June 2021.
-
Open-domain Topic Identification of Out-of-domain Utterances using Wikipedia
Authors:
A. Augustin,
A. Papangelis,
M. Kotti,
P. Vougiouklis,
J. Hare,
N. Braunschweiler
Abstract:
Users of spoken dialogue systems (SDS) expect high quality interactions across a wide range of diverse topics. However, the implementation of SDS capable of responding to every conceivable user utterance in an informative way is a challenging problem. Multi-domain SDS must necessarily identify and deal with out-of-domain (OOD) utterances to generate appropriate responses as users do not always know in advance what domains the SDS can handle. To address this problem, we extend the current state-of-the-art in multi-domain SDS by estimating the topic of OOD utterances using external knowledge representation from Wikipedia. Experimental results on real human-to-human dialogues showed that our approach does not degrade domain prediction performance when compared to the base model. But more significantly, our joint training achieves more accurate predictions of the nearest Wikipedia article by up to about 30% when compared to the benchmarks.
Submitted 26 January, 2021;
originally announced January 2021.
-
Can You be More Social? Injecting Politeness and Positivity into Task-Oriented Conversational Agents
Authors:
Yi-Chia Wang,
Alexandros Papangelis,
Runze Wang,
Zhaleh Feizollahi,
Gokhan Tur,
Robert Kraut
Abstract:
Goal-oriented conversational agents are becoming prevalent in our daily lives. For these systems to engage users and achieve their goals, they need to exhibit appropriate social behavior as well as provide informative replies that guide users through tasks. The first component of the research in this paper applies statistical modeling techniques to understand conversations between users and human agents for customer service. Analyses show that social language used by human agents is associated with greater users' responsiveness and task completion. The second component of the research is the construction of a conversational agent model capable of injecting social language into an agent's responses while still preserving content. The model uses a sequence-to-sequence deep learning architecture, extended with a social language understanding element. Evaluation in terms of content preservation and social language level using both human judgment and automatic linguistic measures shows that the model can generate responses that enable agents to address users' issues in a more socially appropriate way.
Submitted 29 December, 2020;
originally announced December 2020.
-
Language Model is All You Need: Natural Language Understanding as Question Answering
Authors:
Mahdi Namazifar,
Alexandros Papangelis,
Gokhan Tur,
Dilek Hakkani-Tür
Abstract:
Different flavors of transfer learning have shown tremendous impact in advancing research and applications of machine learning. In this work we study the use of a specific family of transfer learning, where the target domain is mapped to the source domain. Specifically we map Natural Language Understanding (NLU) problems to QuestionAnswering (QA) problems and we show that in low data regimes this approach offers significant improvements compared to other approaches to NLU. Moreover we show that these gains could be increased through sequential transfer learning across NLU problems from different domains. We show that our approach could reduce the amount of required data for the same performance by up to a factor of 10.
Submitted 5 November, 2020;
originally announced November 2020.
-
Controllable Text Generation with Focused Variation
Authors:
Lei Shu,
Alexandros Papangelis,
Yi-Chia Wang,
Gokhan Tur,
Hu Xu,
Zhaleh Feizollahi,
Bing Liu,
Piero Molino
Abstract:
This work introduces Focused-Variation Network (FVN), a novel model to control language generation. The main problems in previous controlled language generation models range from the difficulty of generating text according to the given attributes, to the lack of diversity of the generated texts. FVN addresses these issues by learning disjoint discrete latent spaces for each attribute inside codebooks, which allows for both controllability and diversity, while at the same time generating fluent text. We evaluate FVN on two text generation datasets with annotated content and style, and show state-of-the-art performance as assessed by automatic and human evaluations.
Submitted 25 September, 2020;
originally announced September 2020.
-
Towards meaningful, grounded conversations with intelligent agents
Authors:
Alexandros Papangelis,
Stefan Ultes
Abstract:
As conversational agents become integral parts of many aspects of our lives, current approaches are reaching bottlenecks of performance that require increasing amounts of data or increasingly powerful models. It is also becoming clear that such agents are here to stay and accompany us for long periods of time. If we are, therefore, to design agents that can deeply understand our world and evolve with it, we need to take a step back and revisit the trade-offs we have made in the current state of the art models. This paper argues that a) we need to shift from slot filling into a more realistic conversation paradigm; and b) that, to realize that paradigm, we need models that are able to handle concrete and abstract entities as well as evolving relations between them.
Submitted 28 June, 2020;
originally announced June 2020.
-
Joint Contextual Modeling for ASR Correction and Language Understanding
Authors:
Yue Weng,
Sai Sumanth Miryala,
Chandra Khatri,
Runze Wang,
Huaixiu Zheng,
Piero Molino,
Mahdi Namazifar,
Alexandros Papangelis,
Hugh Williams,
Franziska Bell,
Gokhan Tur
Abstract:
The quality of automatic speech recognition (ASR) is critical to Dialogue Systems as ASR errors propagate to and directly impact downstream tasks such as language understanding (LU). In this paper, we propose multi-task neural approaches to perform contextual language correction on ASR outputs jointly with LU to improve the performance of both tasks simultaneously. To measure the effectiveness of this approach we used a public benchmark, the 2nd Dialogue State Tracking (DSTC2) corpus. As a baseline approach, we trained task-specific Statistical Language Models (SLM) and fine-tuned state-of-the-art Generalized Pre-training (GPT) Language Model to re-rank the n-best ASR hypotheses, followed by a model to identify the dialog act and slots. i) We further trained ranker models using GPT and Hierarchical CNN-RNN models with discriminatory losses to detect the best output given n-best hypotheses. We extended these ranker models to first select the best ASR output and then identify the dialogue act and slots in an end to end fashion. ii) We also proposed a novel joint ASR error correction and LU model, a word confusion pointer network (WCN-Ptr) with multi-head self-attention on top, which consumes the word confusions populated from the n-best. We show that the error rates of off the shelf ASR and following LU systems can be reduced significantly by 14% relative with joint models trained using small amounts of in-domain data.
Submitted 28 January, 2020;
originally announced February 2020.
-
Exploration Based Language Learning for Text-Based Games
Authors:
Andrea Madotto,
Mahdi Namazifar,
Joost Huizinga,
Piero Molino,
Adrien Ecoffet,
Huaixiu Zheng,
Alexandros Papangelis,
Dian Yu,
Chandra Khatri,
Gokhan Tur
Abstract:
This work presents an exploration and imitation-learning-based agent capable of state-of-the-art performance in playing text-based computer games. Text-based computer games describe their world to the player through natural language and expect the player to interact with the game using text. These games are of interest as they can be seen as a testbed for language understanding, problem-solving, and language generation by artificial agents. Moreover, they provide a learning environment in which these skills can be acquired through interactions with an environment rather than using fixed corpora. One aspect that makes these games particularly challenging for learning agents is the combinatorially large action space. Existing methods for solving text-based games are limited to games that are either very simple or have an action space restricted to a predetermined set of admissible actions. In this work, we propose to use the exploration approach of Go-Explore for solving text-based games. More specifically, in an initial exploration phase, we first extract trajectories with high rewards, after which we train a policy to solve the game by imitating these trajectories. Our experiments show that this approach outperforms existing solutions in solving text-based games, and it is more sample efficient in terms of the number of interactions with the environment. Moreover, we show that the learned policy can generalize better than existing solutions to unseen games without using any restriction on the action space.
Submitted 7 June, 2020; v1 submitted 23 January, 2020;
originally announced January 2020.
-
Plato Dialogue System: A Flexible Conversational AI Research Platform
Authors:
Alexandros Papangelis,
Mahdi Namazifar,
Chandra Khatri,
Yi-Chia Wang,
Piero Molino,
Gokhan Tur
Abstract:
As the field of Spoken Dialogue Systems and Conversational AI grows, so does the need for tools and environments that abstract away implementation details in order to expedite the development process, lower the barrier of entry to the field, and offer a common test-bed for new ideas. In this paper, we present Plato, a flexible Conversational AI platform written in Python that supports any kind of conversational agent architecture, from standard architectures to architectures with jointly-trained components, single- or multi-party interactions, and offline or online training of any conversational agent component. Plato has been designed to be easy to understand and debug and is agnostic to the underlying learning frameworks that train each component.
Submitted 17 January, 2020;
originally announced January 2020.
-
Collaborative Multi-Agent Dialogue Model Training Via Reinforcement Learning
Authors:
Alexandros Papangelis,
Yi-Chia Wang,
Piero Molino,
Gokhan Tur
Abstract:
We present the first complete attempt at concurrently training conversational agents that communicate only via self-generated language. Using DSTC2 as seed data, we trained natural language understanding (NLU) and generation (NLG) networks for each agent and let the agents interact online. We model the interaction as a stochastic collaborative game where each agent (player) has a role ("assistant", "tourist", "eater", etc.) and their own objectives, and can only interact via natural language they generate. Each agent, therefore, needs to learn to operate optimally in an environment with multiple sources of uncertainty (its own NLU and NLG, the other agent's NLU, Policy, and NLG). In our evaluation, we show that the stochastic-game agents outperform deep learning based supervised baselines.
Submitted 24 July, 2019; v1 submitted 11 July, 2019;
originally announced July 2019.
-
LD-SDS: Towards an Expressive Spoken Dialogue System based on Linked-Data
Authors:
Alexandros Papangelis,
Panagiotis Papadakos,
Margarita Kotti,
Yannis Stylianou,
Yannis Tzitzikas,
Dimitris Plexousakis
Abstract:
In this work we discuss the related challenges and describe an approach towards the fusion of state-of-the-art technologies from the Spoken Dialogue Systems (SDS) and the Semantic Web and Information Retrieval domains. We envision a dialogue system named LD-SDS that will support advanced, expressive, and engaging user requests, over multiple, complex, rich, and open-domain data sources that will leverage the wealth of the available Linked Data. Specifically, we focus on: a) improving the identification, disambiguation and linking of entities occurring in data sources and user input; b) offering advanced query services for exploiting the semantics of the data, with reasoning and exploratory capabilities; and c) expanding the typical information seeking dialogue model (slot filling) to better reflect real-world conversational search scenarios.
Submitted 9 October, 2017;
originally announced October 2017.