Showing 1–15 of 15 results for author: Ju, D

Searching in archive cs.
  1. arXiv:2403.06516

    cs.CV

    Advancing Text-Driven Chest X-Ray Generation with Policy-Based Reinforcement Learning

    Authors: Woojung Han, Chanyoung Kim, Dayun Ju, Yumin Shim, Seong Jae Hwang

    Abstract: Recent advances in text-conditioned image generation diffusion models have begun paving the way for new opportunities in modern medical domain, in particular, generating Chest X-rays (CXRs) from diagnostic reports. Nonetheless, to further drive the diffusion models to generate CXRs that faithfully reflect the complexity and diversity of real data, it has become evident that a nontrivial learning a…

    Submitted 11 March, 2024; originally announced March 2024.

  2. arXiv:2403.01482

    cs.CV

    EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation

    Authors: Chanyoung Kim, Woojung Han, Dayun Ju, Seong Jae Hwang

    Abstract: Semantic segmentation has innately relied on extensive pixel-level annotated data, leading to the emergence of unsupervised methodologies. Among them, leveraging self-supervised Vision Transformers for unsupervised semantic segmentation (USS) has been making steady progress with expressive deep features. Yet, for semantically segmenting images with complex objects, a predominant challenge remains:…

    Submitted 5 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  3. arXiv:2306.04707

    cs.CL cs.AI

    Improving Open Language Models by Learning from Organic Interactions

    Authors: Jing Xu, Da Ju, Joshua Lane, Mojtaba Komeili, Eric Michael Smith, Megan Ung, Morteza Behrooz, William Ngan, Rashel Moritz, Sainbayar Sukhbaatar, Y-Lan Boureau, Jason Weston, Kurt Shuster

    Abstract: We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety. We are publicly releasing the participating de-identified interaction data for use by the research community, in order to spur further progress. Training models with org…

    Submitted 7 June, 2023; originally announced June 2023.

  4. arXiv:2208.03295

    cs.CL cs.AI

    Learning from data in the mixed adversarial non-adversarial case: Finding the helpers and ignoring the trolls

    Authors: Da Ju, Jing Xu, Y-Lan Boureau, Jason Weston

    Abstract: The promise of interaction between intelligent conversational agents and humans is that models can learn from such feedback in order to improve. Unfortunately, such exchanges in the wild will not always involve human utterances that are benign or of high quality, and will include a mixture of engaged (helpers) and unengaged or even malicious users (trolls). In this work we study how to perform rob…

    Submitted 5 August, 2022; originally announced August 2022.

  5. arXiv:2208.03188

    cs.CL cs.AI

    BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

    Authors: Kurt Shuster, Jing Xu, Mojtaba Komeili, Da Ju, Eric Michael Smith, Stephen Roller, Megan Ung, Moya Chen, Kushal Arora, Joshua Lane, Morteza Behrooz, William Ngan, Spencer Poff, Naman Goyal, Arthur Szlam, Y-Lan Boureau, Melanie Kambadur, Jason Weston

    Abstract: We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user defined tasks. We release both the model weights and code, and have also deployed the model on a public web page to interact with organic users. This technical report describes how the model was built (arc…

    Submitted 10 August, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

  6. arXiv:2106.04279

    cs.LG cs.CL

    Staircase Attention for Recurrent Processing of Sequences

    Authors: Da Ju, Stephen Roller, Sainbayar Sukhbaatar, Jason Weston

    Abstract: Attention mechanisms have become a standard tool for sequence modeling tasks, in particular by stacking self-attention layers over the entire input sequence as in the Transformer architecture. In this work we introduce a novel attention procedure called staircase attention that, unlike self-attention, operates across the sequence (in time) recurrently processing the input by adding another step of…

    Submitted 8 June, 2021; originally announced June 2021.

  7. arXiv:2106.03193

    cs.CL cs.AI

    The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation

    Authors: Naman Goyal, Cynthia Gao, Vishrav Chaudhary, Peng-Jen Chen, Guillaume Wenzek, Da Ju, Sanjana Krishnan, Marc'Aurelio Ranzato, Francisco Guzman, Angela Fan

    Abstract: One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES-101 evaluation benc…

    Submitted 6 June, 2021; originally announced June 2021.

  8. arXiv:2105.06548

    cs.LG cs.AI

    Not All Memories are Created Equal: Learning to Forget by Expiring

    Authors: Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan

    Abstract: Attention mechanisms have shown promising results in sequence modeling tasks that require long-term memory. Recent work investigated mechanisms to reduce the computational cost of preserving and storing memories. However, not all content in the past is equally important to remember. We propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant info…

    Submitted 13 June, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

  9. arXiv:2010.07079

    cs.CL cs.AI

    Recipes for Safety in Open-domain Chatbots

    Authors: Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, Emily Dinan

    Abstract: Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases. We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. We introduce a new human-and-model-in-the-loop framework for both training safer models and for…

    Submitted 4 August, 2021; v1 submitted 14 October, 2020; originally announced October 2020.

  10. arXiv:2010.01082

    cs.CL cs.AI

    Multi-Modal Open-Domain Dialogue

    Authors: Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston

    Abstract: Recent work in open-domain conversational agents has demonstrated that significant improvements in model engagingness and humanness metrics can be achieved via massive scaling in both pre-training data and model size (Adiwardana et al., 2020; Roller et al., 2020). However, if we want to build agents with human-like abilities, we must expand beyond handling just text. A particularly important topic…

    Submitted 2 October, 2020; originally announced October 2020.

  11. arXiv:2006.12442

    cs.CL cs.AI

    Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions

    Authors: Stephen Roller, Y-Lan Boureau, Jason Weston, Antoine Bordes, Emily Dinan, Angela Fan, David Gunning, Da Ju, Margaret Li, Spencer Poff, Pratik Ringshia, Kurt Shuster, Eric Michael Smith, Arthur Szlam, Jack Urbanek, Mary Williamson

    Abstract: We present our view of what is necessary to build an engaging open-domain conversational agent: covering the qualities of such an agent, the pieces of the puzzle that have been built so far, and the gaping holes we have not filled yet. We present a biased view, focusing on work done by our own group, while citing related work in each area. In particular, we discuss in detail the properties of cont…

    Submitted 13 July, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

  12. arXiv:2004.13637

    cs.CL cs.AI

    Recipes for building an open-domain chatbot

    Authors: Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston

    Abstract: Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results, we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of skills that an expert conversationalist blends in a…

    Submitted 30 April, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

  13. arXiv:1912.12394

    cs.CL cs.CV cs.LG

    All-in-One Image-Grounded Conversational Agents

    Authors: Da Ju, Kurt Shuster, Y-Lan Boureau, Jason Weston

    Abstract: As single-task accuracy on individual language and image tasks has improved substantially in the last few years, the long-term goal of a generally skilled agent that can both see and talk becomes more feasible to explore. In this work, we focus on leveraging individual language and image tasks, along with resources that incorporate both vision and language towards that objective. We design an arch…

    Submitted 15 January, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

  14. arXiv:1911.03768

    cs.CL cs.AI

    The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents

    Authors: Kurt Shuster, Da Ju, Stephen Roller, Emily Dinan, Y-Lan Boureau, Jason Weston

    Abstract: We introduce dodecaDialogue: a set of 12 tasks that measures if a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images. By multi-tasking on such a broad large-scale set of data, we hope to both move towards and measure progress in producin…

    Submitted 28 April, 2020; v1 submitted 9 November, 2019; originally announced November 2019.

    Comments: ACL 2020

  15. arXiv:1811.08568

    cs.LG stat.ML

    High-Level Strategy Selection under Partial Observability in StarCraft: Brood War

    Authors: Jonas Gehring, Da Ju, Vegard Mella, Daniel Gant, Nicolas Usunier, Gabriel Synnaeve

    Abstract: We consider the problem of high-level strategy selection in the adversarial setting of real-time strategy games from a reinforcement learning perspective, where taking an action corresponds to switching to the respective strategy. Here, a good strategy successfully counters the opponent's current and possible future strategies which can only be estimated using partial observations. We investigate…

    Submitted 20 November, 2018; originally announced November 2018.