Showing 1–26 of 26 results for author: Boureau, Y

Searching in archive cs.
  1. arXiv:2307.02768  [pdf, other]

    cs.CL

    Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts

    Authors: Mounica Maddela, Megan Ung, Jing Xu, Andrea Madotto, Heather Foran, Y-Lan Boureau

    Abstract: Many cognitive approaches to well-being, such as recognizing and reframing unhelpful thoughts, have received considerable empirical support over the past decades, yet still lack truly widespread adoption in self-help format. A barrier to that adoption is a lack of adequately specific and diverse dedicated practice material. This work examines whether current language models can be leveraged to bot…

    Submitted 6 July, 2023; originally announced July 2023.

    Comments: ACL 2023

  2. arXiv:2306.04765  [pdf, other]

    cs.AI cs.CL

    The HCI Aspects of Public Deployment of Research Chatbots: A User Study, Design Recommendations, and Open Challenges

    Authors: Morteza Behrooz, William Ngan, Joshua Lane, Giuliano Morse, Benjamin Babcock, Kurt Shuster, Mojtaba Komeili, Moya Chen, Melanie Kambadur, Y-Lan Boureau, Jason Weston

    Abstract: Publicly deploying research chatbots is a nuanced topic involving necessary risk-benefit analyses. While there have recently been frequent discussions on whether it is responsible to deploy such models, there has been far less focus on the interaction paradigms and design approaches that the resulting interfaces should adopt, in order to achieve their goals more effectively. We aim to pose, ground…

    Submitted 7 June, 2023; originally announced June 2023.

  3. arXiv:2306.04707  [pdf, other]

    cs.CL cs.AI

    Improving Open Language Models by Learning from Organic Interactions

    Authors: Jing Xu, Da Ju, Joshua Lane, Mojtaba Komeili, Eric Michael Smith, Megan Ung, Morteza Behrooz, William Ngan, Rashel Moritz, Sainbayar Sukhbaatar, Y-Lan Boureau, Jason Weston, Kurt Shuster

    Abstract: We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety. We are publicly releasing the participating de-identified interaction data for use by the research community, in order to spur further progress. Training models with org…

    Submitted 7 June, 2023; originally announced June 2023.

  4. arXiv:2208.03295  [pdf, other]

    cs.CL cs.AI

    Learning from data in the mixed adversarial non-adversarial case: Finding the helpers and ignoring the trolls

    Authors: Da Ju, Jing Xu, Y-Lan Boureau, Jason Weston

    Abstract: The promise of interaction between intelligent conversational agents and humans is that models can learn from such feedback in order to improve. Unfortunately, such exchanges in the wild will not always involve human utterances that are benign or of high quality, and will include a mixture of engaged (helpers) and unengaged or even malicious users (trolls). In this work we study how to perform rob…

    Submitted 5 August, 2022; originally announced August 2022.

  5. arXiv:2208.03270  [pdf, other]

    cs.CL cs.AI

    Learning New Skills after Deployment: Improving open-domain internet-driven dialogue with human feedback

    Authors: Jing Xu, Megan Ung, Mojtaba Komeili, Kushal Arora, Y-Lan Boureau, Jason Weston

    Abstract: Frozen models trained to mimic static datasets can never improve their performance. Models that can employ internet-retrieval for up-to-date information and obtain feedback from humans during deployment provide the promise of both adapting to new information, and improving their performance. In this work we study how to improve internet-driven conversational skills in such a learning framework. We…

    Submitted 16 August, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

  6. arXiv:2208.03188  [pdf, other]

    cs.CL cs.AI

    BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

    Authors: Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston

    Abstract: We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user defined tasks. We release both the model weights and code, and have also deployed the model on a public web page to interact with organic users. This technical report describes how the model was built (arc…

    Submitted 10 August, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

  7. arXiv:2201.04723  [pdf, other]

    cs.CL cs.AI

    Human Evaluation of Conversations is an Open Problem: comparing the sensitivity of various methods for evaluating dialogue agents

    Authors: Eric Michael Smith, Orion Hsu, Rebecca Qian, Stephen Roller, Y-Lan Boureau, Jason Weston

    Abstract: At the heart of improving conversational AI is the open problem of how to evaluate conversations. Issues with automatic metrics are well known (Liu et al., 2016, arXiv:1603.08023), with human evaluations still considered the gold standard. Unfortunately, how to perform human evaluations is also an open problem: differing data collection methods have varying levels of human agreement and statistica…

    Submitted 12 January, 2022; originally announced January 2022.

  8. arXiv:2110.07518  [pdf, other]

    cs.CL cs.AI

    SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures

    Authors: Megan Ung, Jing Xu, Y-Lan Boureau

    Abstract: Current open-domain conversational models can easily be made to talk in inadequate ways. Online learning from conversational feedback given by the conversation partner is a promising avenue for a model to improve and adapt, so as to generate fewer of these safety failures. However, current state-of-the-art models tend to react to feedback with defensive or oblivious responses. This makes for an un…

    Submitted 4 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: Accepted at ACL 2022

  9. arXiv:2109.02734  [pdf, other]

    cs.CL

    Detecting Inspiring Content on Social Media

    Authors: Oana Ignat, Y-Lan Boureau, Jane A. Yu, Alon Halevy

    Abstract: Inspiration moves a person to see new possibilities and transforms the way they perceive their own potential. Inspiration has received little attention in psychology, and has not been researched before in the NLP community. To the best of our knowledge, this work is the first to study inspiration through machine learning methods. We aim to automatically detect inspiring content from social media d…

    Submitted 29 May, 2023; v1 submitted 6 September, 2021; originally announced September 2021.

    Comments: accepted at ACII 2021

  10. arXiv:2107.03451  [pdf, other]

    cs.CL cs.AI

    Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling

    Authors: Emily Dinan, Gavin Abercrombie, A. Stevie Bergman, Shannon Spruit, Dirk Hovy, Y-Lan Boureau, Verena Rieser

    Abstract: Over the last several years, end-to-end neural conversational agents have vastly improved in their ability to carry a chit-chat conversation with humans. However, these models are often trained on large datasets from the internet, and as a result, may learn undesirable behaviors from this data, such as toxic or otherwise harmful language. Researchers must thus wrestle with the issue of how and whe…

    Submitted 23 July, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

  11. arXiv:2012.14983  [pdf, other]

    cs.CL cs.AI cs.LG

    Reducing conversational agents' overconfidence through linguistic calibration

    Authors: Sabrina J. Mielke, Arthur Szlam, Emily Dinan, Y-Lan Boureau

    Abstract: While improving neural dialogue agents' factual accuracy is the object of much research, another important aspect of communication, less studied in the setting of neural dialogue, is transparency about ignorance. In this work, we analyze to what extent state-of-the-art chit-chat models are linguistically calibrated in the sense that their verbalized expression of doubt (or confidence) matches the…

    Submitted 26 June, 2022; v1 submitted 29 December, 2020; originally announced December 2020.

    Comments: Accepted in TACL, to be presented at NAACL 2022

  12. arXiv:2010.07079  [pdf, other]

    cs.CL cs.AI

    Recipes for Safety in Open-domain Chatbots

    Authors: Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, Emily Dinan

    Abstract: Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases. We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. We introduce a new human-and-model-in-the-loop framework for both training safer models and for…

    Submitted 4 August, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

  13. arXiv:2009.10855  [pdf, other]

    cs.CL

    Controlling Style in Generated Dialogue

    Authors: Eric Michael Smith, Diana Gonzalez-Rico, Emily Dinan, Y-Lan Boureau

    Abstract: Open-domain conversation models have become good at generating natural-sounding dialogue, using very large architectures with billions of trainable parameters. The vast training data required to train these architectures aggregates many different styles, tones, and qualities. Using that data to train a single model makes it difficult to use the model as a consistent conversational agent, e.g. with…

    Submitted 22 September, 2020; originally announced September 2020.

  14. arXiv:2006.12442  [pdf, other]

    cs.CL cs.AI

    Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions

    Authors: Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson

    Abstract: We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet. We present a biased view, focusing on work done by our own group, while citing related work in each area. In particular, we discuss in detail the properties of cont…

    Submitted 13 July, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

  15. arXiv:2005.00581  [pdf, other]

    cs.CL cs.LG

    Multi-scale Transformer Language Models

    Authors: Sandeep Subramanian, Ronan Collobert, Marc'Aurelio Ranzato, Y-Lan Boureau

    Abstract: We investigate multi-scale transformer language models that learn representations of text at multiple scales, and present three different architectures that have an inductive bias to handle the hierarchical nature of language. Experiments on large-scale language modeling benchmarks empirically demonstrate favorable likelihood vs memory footprint trade-offs, e.g. we show that it is possible to trai…

    Submitted 1 May, 2020; originally announced May 2020.

  16. arXiv:2004.13637  [pdf, other]

    cs.CL cs.AI

    Recipes for building an open-domain chatbot

    Authors: Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston

    Abstract: Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of skills that an expert conversationalist blends in a…

    Submitted 30 April, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

  17. arXiv:2004.08449  [pdf, other]

    cs.CL

    Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills

    Authors: Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-Lan Boureau

    Abstract: Being engaging, knowledgeable, and empathetic are all desirable general qualities in a conversational agent. Previous work has introduced tasks and datasets that aim to help agents to learn those qualities in isolation and gauge how well they can express them. But rather than being specialized in one single quality, a good open-domain conversational agent should be able to seamlessly blend them al…

    Submitted 17 April, 2020; originally announced April 2020.

    Comments: accepted to ACL 2020 (long paper)

  18. arXiv:1912.12394  [pdf, other]

    cs.CL cs.CV cs.LG

    All-in-One Image-Grounded Conversational Agents

    Authors: Da Ju, Kurt Shuster, Y-Lan Boureau, Jason Weston

    Abstract: As single-task accuracy on individual language and image tasks has improved substantially in the last few years, the long-term goal of a generally skilled agent that can both see and talk becomes more feasible to explore. In this work, we focus on leveraging individual language and image tasks, along with resources that incorporate both vision and language towards that objective. We design an arch…

    Submitted 15 January, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

  19. arXiv:1911.03914  [pdf, ps, other]

    cs.CL

    Zero-Shot Fine-Grained Style Transfer: Leveraging Distributed Continuous Style Representations to Transfer To Unseen Styles

    Authors: Eric Michael Smith, Diana Gonzalez-Rico, Emily Dinan, Y-Lan Boureau

    Abstract: Text style transfer is usually performed using attributes that can take a handful of discrete values (e.g., positive to negative reviews). In this work, we introduce an architecture that can leverage pre-trained consistent continuous distributed style representations and use them to transfer to an attribute unseen during training, without requiring any re-tuning of the style transfer model. We dem…

    Submitted 10 November, 2019; originally announced November 2019.

  20. arXiv:1911.03860  [pdf, other]

    cs.CL

    Don't Say That! Making Inconsistent Dialogue Unlikely with Unlikelihood Training

    Authors: Margaret Li, Stephen Roller, Ilia Kulikov, Sean Welleck, Y-Lan Boureau, Kyunghyun Cho, Jason Weston

    Abstract: Generative dialogue models currently suffer from a number of problems which standard maximum likelihood training does not address. They tend to produce generations that (i) rely too much on copying from the context, (ii) contain repetitions within utterances, (iii) overuse frequent words, and (iv) at a deeper level, contain logical flaws. In this work we show how all of these problems can be addre…

    Submitted 6 May, 2020; v1 submitted 10 November, 2019; originally announced November 2019.

  21. arXiv:1911.03768  [pdf, other]

    cs.CL cs.AI

    The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents

    Authors: Kurt Shuster, Da Ju, Stephen Roller, Emily Dinan, Y-Lan Boureau, Jason Weston

    Abstract: We introduce dodecaDialogue: a set of 12 tasks that measures if a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images. By multi-tasking on such a broad large-scale set of data, we hope to both move towards and measure progress in producin…

    Submitted 28 April, 2020; v1 submitted 9 November, 2019; originally announced November 2019.

    Comments: ACL 2020

  22. arXiv:1909.03922  [pdf, other]

    cs.CL cs.AI

    Recommendation as a Communication Game: Self-Supervised Bot-Play for Goal-oriented Dialogue

    Authors: Dongyeop Kang, Anusha Balakrishnan, Pararth Shah, Paul Crook, Y-Lan Boureau, Jason Weston

    Abstract: Traditional recommendation systems produce static rather than interactive recommendations invariant to a user's specific requests, clarifications, or current mood, and can suffer from the cold-start problem if their tastes are unknown. These issues can be alleviated by treating recommendation as an interactive dialogue task instead, where an expert recommender can sequentially ask about someone's…

    Submitted 9 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  23. arXiv:1903.05168  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    On the Pitfalls of Measuring Emergent Communication

    Authors: Ryan Lowe, Jakob Foerster, Y-Lan Boureau, Joelle Pineau, Yann Dauphin

    Abstract: How do we know if communication is emerging in a multi-agent system? The vast majority of recent papers on emergent communication show that adding a communication channel leads to an increase in reward or task success. This is a useful indicator, but provides only a coarse measure of the agent's learned communication abilities. As we move towards more complex environments, it becomes imperative to…

    Submitted 12 March, 2019; originally announced March 2019.

    Comments: AAMAS 2019. 13 pages

  24. arXiv:1811.00552  [pdf, other]

    cs.CL cs.LG

    Multiple-Attribute Text Style Transfer

    Authors: Sandeep Subramanian, Guillaume Lample, Eric Michael Smith, Ludovic Denoyer, Marc'Aurelio Ranzato, Y-Lan Boureau

    Abstract: The dominant approach to unsupervised "style transfer" in text is based on the idea of learning a latent representation, which is independent of the attributes specifying its "style". In this paper, we show that this condition is not necessary and is not always met in practice, even with domain adversarial training that explicitly aims at learning such disentangled representations. We thus propose…

    Submitted 20 September, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

  25. arXiv:1811.00207  [pdf, other]

    cs.CL

    Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset

    Authors: Hannah Rashkin, Eric Michael Smith, Margaret Li, Y-Lan Boureau

    Abstract: One challenge for dialogue agents is recognizing feelings in the conversation partner and replying accordingly, a key communicative skill. While it is straightforward for humans to recognize and acknowledge others' feelings in a conversation, this is a significant challenge for AI systems due to the paucity of suitable publicly-available datasets for training and evaluation. This work proposes a n…

    Submitted 28 August, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: accepted to ACL 2019 (long paper)

  26. arXiv:1605.07683  [pdf, other]

    cs.CL

    Learning End-to-End Goal-Oriented Dialog

    Authors: Antoine Bordes, Y-Lan Boureau, Jason Weston

    Abstract: Traditional dialog systems used in goal-oriented applications require a lot of domain-specific handcrafting, which hinders scaling up to new domains. End-to-end dialog systems, in which all components are trained from the dialogs themselves, escape this limitation. But the encouraging success recently obtained in chit-chat dialog may not carry over to goal-oriented settings. This paper proposes a…

    Submitted 30 March, 2017; v1 submitted 24 May, 2016; originally announced May 2016.

    Comments: Accepted as a conference paper at ICLR 2017