
Showing 1–50 of 52 results for author: Bras, R L

  1. arXiv:2406.04770  [pdf, other]

    cs.CL cs.AI

    WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

    Authors: Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu, Faeze Brahman, Abhilasha Ravichander, Valentina Pyatkin, Nouha Dziri, Ronan Le Bras, Yejin Choi

    Abstract: We introduce WildBench, an automated evaluation framework designed to benchmark large language models (LLMs) using challenging, real-world user queries. WildBench consists of 1,024 tasks carefully selected from over one million human-chatbot conversation logs. For automated evaluation with WildBench, we have developed two metrics, WB-Reward and WB-Score, which are computable using advanced LLMs su…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Link: https://hf.co/spaces/allenai/WildBench

  2. arXiv:2312.05979  [pdf, other]

    cs.CL

    NovaCOMET: Open Commonsense Foundation Models with Symbolic Knowledge Distillation

    Authors: Peter West, Ronan Le Bras, Taylor Sorensen, Bill Yuchen Lin, Liwei Jiang, Ximing Lu, Khyathi Chandu, Jack Hessel, Ashutosh Baheti, Chandra Bhagavatula, Yejin Choi

    Abstract: We present NovaCOMET, an open commonsense knowledge model, that combines the best aspects of knowledge and general task models. Compared to previous knowledge models, NovaCOMET allows open-format relations enabling direct application to reasoning tasks; compared to general task models like Flan-T5, it explicitly centers knowledge, enabling superior performance for commonsense reasoning. NovaCOME…

    Submitted 10 December, 2023; originally announced December 2023.

  3. arXiv:2311.09682  [pdf, other]

    cs.CL cs.AI

    MacGyver: Are Large Language Models Creative Problem Solvers?

    Authors: Yufei Tian, Abhilasha Ravichander, Lianhui Qin, Ronan Le Bras, Raja Marjieh, Nanyun Peng, Yejin Choi, Thomas L. Griffiths, Faeze Brahman

    Abstract: We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting. To this end, we create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems deliberately designed to trigger innovative usage of objects and necessitate out-of-the-box thinking. We then present our collection to both LLMs and humans to compare and contrast their…

    Submitted 27 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  4. arXiv:2310.15421  [pdf, other]

    cs.CL cs.AI

    FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions

    Authors: Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, Ronan Le Bras, Gunhee Kim, Yejin Choi, Maarten Sap

    Abstract: Theory of mind (ToM) evaluations currently focus on testing models using passive narratives that inherently lack interactivity. We introduce FANToM, a new benchmark designed to stress-test ToM within information-asymmetric conversational contexts via question answering. Our benchmark draws upon important theoretical requisites from psychology and necessary empirical considerations when evaluating…

    Submitted 31 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023. Code and dataset can be found here: https://hyunw.kim/fantom

  5. arXiv:2306.02388  [pdf, other]

    cs.CL cs.AI cs.LG

    Commonsense Knowledge Transfer for Pre-trained Language Models

    Authors: Wangchunshu Zhou, Ronan Le Bras, Yejin Choi

    Abstract: Despite serving as the foundation models for a wide range of NLP benchmarks, pre-trained language models have shown limited capabilities of acquiring implicit commonsense knowledge from self-supervision alone, compared to learning linguistic and factual knowledge that appear more explicitly in the surface patterns in text. In this work, we introduce commonsense knowledge transfer, a framework to t…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Findings

  6. arXiv:2306.02379  [pdf, other]

    cs.CL cs.LG

    Modular Transformers: Compressing Transformers into Modularized Layers for Flexible Efficient Inference

    Authors: Wangchunshu Zhou, Ronan Le Bras, Yejin Choi

    Abstract: Pre-trained Transformer models like T5 and BART have advanced the state of the art on a wide range of text generation tasks. Compressing these models into smaller ones has become critically important for practical use. Common neural network compression techniques such as knowledge distillation or quantization are limited to static compression where the compression ratio is fixed. In this paper, we…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Findings

  7. arXiv:2306.01943  [pdf, other]

    cs.CL cs.CY cs.HC

    NLPositionality: Characterizing Design Biases of Datasets and Models

    Authors: Sebastin Santy, Jenny T. Liang, Ronan Le Bras, Katharina Reinecke, Maarten Sap

    Abstract: Design biases in NLP systems, such as performance differences for different populations, often stem from their creator's positionality, i.e., views and lived experiences shaped by identity and background. Despite the prevalence and risks of design biases, they are hard to quantify because researcher, system, and dataset positionality is often unobserved. We introduce NLPositionality, a framework f…

    Submitted 2 June, 2023; originally announced June 2023.

    Comments: ACL 2023

  8. arXiv:2305.18654  [pdf, other]

    cs.CL cs.AI cs.LG

    Faith and Fate: Limits of Transformers on Compositionality

    Authors: Nouha Dziri, Ximing Lu, Melanie Sclar, Xiang Lorraine Li, Liwei Jiang, Bill Yuchen Lin, Peter West, Chandra Bhagavatula, Ronan Le Bras, Jena D. Hwang, Soumya Sanyal, Sean Welleck, Xiang Ren, Allyson Ettinger, Zaid Harchaoui, Yejin Choi

    Abstract: Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures on surprisingly trivial problems. This begs the question: Are these errors incidental, or do they signal more substantial limitations? In an attempt to demystify transformer LLMs, we investigate the li…

    Submitted 31 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: 10 pages + appendix (40 pages)

  9. arXiv:2305.17174  [pdf, other]

    cs.CL cs.CY

    From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models

    Authors: Julia Mendelsohn, Ronan Le Bras, Yejin Choi, Maarten Sap

    Abstract: Dogwhistles are coded expressions that simultaneously convey one meaning to a broad audience and a second one, often hateful or provocative, to a narrow in-group; they are deployed to evade both political repercussions and algorithmic content moderation. For example, in the sentence 'we need to end the cosmopolitan experiment,' the word 'cosmopolitan' likely means 'worldly' to many, but secretly m…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: ACL 2023, see https://dogwhistles.allen.ai/ for the glossary and other materials

  10. arXiv:2305.14718  [pdf, other]

    cs.CL

    Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models

    Authors: Ashutosh Baheti, Ximing Lu, Faeze Brahman, Ronan Le Bras, Maarten Sap, Mark Riedl

    Abstract: Reinforcement Learning with Human Feedback (RLHF) is the most prominent method for Language Model (LM) alignment. However, RLHF is an unstable and data-hungry process that continually requires new high-quality LM-generated data for finetuning. We introduce Advantage-Leftover Lunch RL (A-LoL), a new class of offline policy gradient algorithms that enable RL training on any pre-existing data. By ass…

    Submitted 19 April, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: published at ICLR 2024

  11. arXiv:2212.10465  [pdf, other]

    cs.CL

    SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

    Authors: Hyunwoo Kim, Jack Hessel, Liwei Jiang, Peter West, Ximing Lu, Youngjae Yu, Pei Zhou, Ronan Le Bras, Malihe Alikhani, Gunhee Kim, Maarten Sap, Yejin Choi

    Abstract: Data scarcity has been a long standing issue in the field of open-domain social dialogue. To quench this thirst, we present SODA: the first publicly available, million-scale high-quality social dialogue dataset. By contextualizing social commonsense knowledge from a knowledge graph, we are able to distill an exceptionally broad spectrum of social interactions from a large language model. Human eva…

    Submitted 23 October, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: EMNLP 2023. Dataset, model, and code can be found at https://hyunw.kim/sodaverse

  12. arXiv:2212.09246  [pdf, other]

    cs.CL

    I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation

    Authors: Chandra Bhagavatula, Jena D. Hwang, Doug Downey, Ronan Le Bras, Ximing Lu, Lianhui Qin, Keisuke Sakaguchi, Swabha Swayamdipta, Peter West, Yejin Choi

    Abstract: Commonsense capabilities of pre-trained language models dramatically improve with scale, leading many to believe that scale is the only winning recipe. But is it? Here, we investigate an alternative that a priori seems impossible: can smaller language models (e.g., GPT-2) win over models that are orders of magnitude larger and better (e.g., GPT-3), if powered with novel commonsense distillation al…

    Submitted 26 May, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  13. arXiv:2207.13332  [pdf, other]

    cs.CL

    RealTime QA: What's the Answer Right Now?

    Authors: Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari Asai, Xinyan Yu, Dragomir Radev, Noah A. Smith, Yejin Choi, Kentaro Inui

    Abstract: We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). REALTIME QA inquires about the current world, and QA systems need to answer questions about novel events or information. It therefore challenges static, conventional assumptions in open-domain QA datasets and pursues instantaneous applicat…

    Submitted 28 February, 2024; v1 submitted 27 July, 2022; originally announced July 2022.

    Comments: RealTime QA Website: https://realtimeqa.github.io/

  14. arXiv:2205.12630  [pdf, other]

    cs.CL cs.CV

    Multimodal Knowledge Alignment with Reinforcement Learning

    Authors: Youngjae Yu, Jiwan Chung, Heeseung Yun, Jack Hessel, JaeSung Park, Ximing Lu, Prithviraj Ammanabrolu, Rowan Zellers, Ronan Le Bras, Gunhee Kim, Yejin Choi

    Abstract: Large language models readily adapt to novel settings, even without task-specific training data. Can their zero-shot capacity be extended to multimodal inputs? In this work, we propose ESPER which extends language-only zero-shot models to unseen multimodal tasks, like image and audio captioning. Our key novelty is to use reinforcement learning to align multimodal inputs to language model generatio…

    Submitted 25 May, 2022; originally announced May 2022.

    ACM Class: I.2.7; I.4.9

  15. arXiv:2205.11822  [pdf, other]

    cs.CL

    Maieutic Prompting: Logically Consistent Reasoning with Recursive Explanations

    Authors: Jaehun Jung, Lianhui Qin, Sean Welleck, Faeze Brahman, Chandra Bhagavatula, Ronan Le Bras, Yejin Choi

    Abstract: Despite their impressive capabilities, large pre-trained language models (LMs) struggle with consistent reasoning; recently, prompting LMs to generate explanations that self-guide the inference has emerged as a promising direction to amend this. However, these approaches are fundamentally bounded by the correctness of explanations, which themselves are often noisy and inconsistent. In this work, w…

    Submitted 24 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022

  16. arXiv:2205.09273  [pdf, other]

    cs.CL

    Twist Decoding: Diverse Generators Guide Each Other

    Authors: Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Hao Peng, Ximing Lu, Dragomir Radev, Yejin Choi, Noah A. Smith

    Abstract: Many language generation models are now available for a wide range of generation tasks, including machine translation and summarization. Combining such diverse models may lead to further progress, but ensembling generation models is challenging during inference: conventional ensembling methods (e.g., shallow fusion) require that the models share vocabulary/tokenization schemes. We introduce Twist…

    Submitted 28 October, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

    Comments: Proc. of EMNLP 2022

  17. arXiv:2204.05424  [pdf, other]

    cs.CL

    A Call for Clarity in Beam Search: How It Works and When It Stops

    Authors: Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir Radev, Yejin Choi, Noah A. Smith

    Abstract: Text generation with beam search has proven successful in a wide range of applications. We point out that, though largely overlooked in the literature, the commonly-used implementation of beam decoding (e.g., Hugging Face Transformers and fairseq) uses a first come, first served heuristic: it keeps a set of already completed sequences over time steps and stops when the size of this set reaches the…

    Submitted 28 February, 2024; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: LREC-COLING 2024

  18. arXiv:2201.05320  [pdf, other]

    cs.CL cs.AI cs.LG

    CommonsenseQA 2.0: Exposing the Limits of AI through Gamification

    Authors: Alon Talmor, Ori Yoran, Ronan Le Bras, Chandra Bhagavatula, Yoav Goldberg, Yejin Choi, Jonathan Berant

    Abstract: Constructing benchmarks that test the abilities of modern natural language understanding models is difficult - pre-trained language models exploit artifacts in benchmarks to achieve human parity, but still fail on adversarial examples and make errors that demonstrate a lack of common sense. In this work, we propose gamification as a framework for data construction. The goal of players in the game…

    Submitted 14 January, 2022; originally announced January 2022.

    Comments: Presented as Oral at NeurIPS 2021

  19. arXiv:2112.08726  [pdf, other]

    cs.CL

    NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

    Authors: Ximing Lu, Sean Welleck, Peter West, Liwei Jiang, Jungo Kasai, Daniel Khashabi, Ronan Le Bras, Lianhui Qin, Youngjae Yu, Rowan Zellers, Noah A. Smith, Yejin Choi

    Abstract: The dominant paradigm for neural text generation is left-to-right decoding from autoregressive language models. Constrained or controllable generation under complex lexical constraints, however, requires foresight to plan ahead feasible future paths. Drawing inspiration from the A* search algorithm, we propose NeuroLogic A*esque, a decoding algorithm that incorporates heuristic estimates of futu…

    Submitted 16 December, 2021; originally announced December 2021.

  20. arXiv:2112.04139  [pdf, other]

    cs.CL

    Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand

    Authors: Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Lavinia Dunagan, Jacob Morrison, Alexander R. Fabbri, Yejin Choi, Noah A. Smith

    Abstract: Natural language processing researchers have identified limitations of evaluation methodology for generation tasks, with new questions raised about the validity of automatic metrics and of crowdworker judgments. Meanwhile, efforts to improve generation models tend to depend on simple n-gram overlap metrics (e.g., BLEU, ROUGE). We argue that new advances on models and metrics should each more direc…

    Submitted 18 May, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Proc. of NAACL 2022

  21. arXiv:2111.08940  [pdf, other]

    cs.CL cs.CV

    Transparent Human Evaluation for Image Captioning

    Authors: Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Morrison, Ronan Le Bras, Yejin Choi, Noah A. Smith

    Abstract: We establish THumB, a rubric-based human evaluation protocol for image captioning models. Our scoring rubrics and their definitions are carefully developed based on machine- and human-generated captions on the MSCOCO dataset. Each caption is evaluated along two main dimensions in a tradeoff (precision and recall) as well as other aspects that measure the text quality (fluency, conciseness, and inc…

    Submitted 18 May, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: Proc. of NAACL 2022

  22. arXiv:2110.08387  [pdf, other]

    cs.CL

    Generated Knowledge Prompting for Commonsense Reasoning

    Authors: Jiacheng Liu, Alisa Liu, Ximing Lu, Sean Welleck, Peter West, Ronan Le Bras, Yejin Choi, Hannaneh Hajishirzi

    Abstract: It remains an open question whether incorporating external knowledge benefits commonsense reasoning while maintaining the flexibility of pretrained sequence models. To investigate this question, we develop generated knowledge prompting, which consists of generating knowledge from a language model, then providing the knowledge as additional input when answering a question. Our method does not requi…

    Submitted 28 September, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ACL 2022 main conference

  23. arXiv:2110.07574  [pdf, other]

    cs.CL

    Can Machines Learn Morality? The Delphi Experiment

    Authors: Liwei Jiang, Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jenny Liang, Jesse Dodge, Keisuke Sakaguchi, Maxwell Forbes, Jon Borchardt, Saadia Gabriel, Yulia Tsvetkov, Oren Etzioni, Maarten Sap, Regina Rini, Yejin Choi

    Abstract: As AI systems become increasingly powerful and pervasive, there are growing concerns about machines' morality or a lack thereof. Yet, teaching morality to machines is a formidable task, as morality remains among the most intensely debated questions in humanity, let alone for AI. Existing AI systems deployed to millions of users, however, are already making decisions loaded with moral implications,…

    Submitted 12 July, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

  24. arXiv:2110.07178  [pdf, other]

    cs.CL

    Symbolic Knowledge Distillation: from General Language Models to Commonsense Models

    Authors: Peter West, Chandra Bhagavatula, Jack Hessel, Jena D. Hwang, Liwei Jiang, Ronan Le Bras, Ximing Lu, Sean Welleck, Yejin Choi

    Abstract: The common practice for training commonsense models has gone from-human-to-corpus-to-machine: humans author commonsense knowledge graphs in order to train commonsense models. In this work, we investigate an alternative, from-machine-to-corpus-to-machine: general language models author these commonsense knowledge graphs to train commonsense models. Our study leads to a new framework, Symbolic Knowl…

    Submitted 28 November, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

  25. arXiv:2104.08718  [pdf, other]

    cs.CV cs.CL

    CLIPScore: A Reference-free Evaluation Metric for Image Captioning

    Authors: Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, Yejin Choi

    Abstract: Image captioning has conventionally relied on reference-based automatic evaluations, where machine captions are compared against captions written by humans. This is in contrast to the reference-free manner in which humans assess caption quality. In this paper, we report the surprising empirical finding that CLIP (Radford et al., 2021), a cross-modal model pretrained on 400M image+caption pairs f…

    Submitted 23 March, 2022; v1 submitted 18 April, 2021; originally announced April 2021.

    Journal ref: EMNLP 2021

  26. arXiv:2104.08251  [pdf, other]

    cs.CL

    proScript: Partially Ordered Scripts Generation via Pre-trained Language Models

    Authors: Keisuke Sakaguchi, Chandra Bhagavatula, Ronan Le Bras, Niket Tandon, Peter Clark, Yejin Choi

    Abstract: Scripts - standardized event sequences describing typical everyday activities - have been shown to help understand narratives by providing expectations, resolving ambiguity, and filling in unstated information. However, to date they have proved hard to author or extract from text. In this work, we demonstrate for the first time that pre-trained neural language models (LMs) can be finetuned to g…

    Submitted 16 April, 2021; originally announced April 2021.

  27. arXiv:2104.01112  [pdf, other]

    cs.IR cs.LG

    NaturalProofs: Mathematical Theorem Proving in Natural Language

    Authors: Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho

    Abstract: Understanding and creating mathematics using natural mathematical language - the mixture of symbolic and natural language used by humans - is a challenging and important problem for driving progress in machine learning. As a step in this direction, we develop NaturalProofs, a multi-domain corpus of mathematical statements and their proofs, written in natural mathematical language. NaturalProofs un…

    Submitted 7 June, 2021; v1 submitted 23 March, 2021; originally announced April 2021.

  28. arXiv:2103.13009  [pdf, other]

    cs.CL

    UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark

    Authors: Nicholas Lourie, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi

    Abstract: Commonsense AI has long been seen as a near impossible goal -- until recently. Now, research interest has sharply increased with an influx of new benchmarks and models. We propose two new ways to evaluate commonsense models, emphasizing their generality on new tasks and building on diverse, recently introduced benchmarks. First, we propose a new multitask benchmark, RAINBOW, to promote research…

    Submitted 24 March, 2021; originally announced March 2021.

    Comments: 27 pages, 19 figures, 34 tables. Accepted to AAAI 2021. For associated code and data see https://github.com/allenai/rainbow

  29. arXiv:2101.00297  [pdf, other]

    cs.CL

    Analyzing Commonsense Emergence in Few-shot Knowledge Models

    Authors: Jeff Da, Ronan Le Bras, Ximing Lu, Yejin Choi, Antoine Bosselut

    Abstract: Recently, commonsense knowledge models - pretrained language models (LM) fine-tuned on knowledge graph (KG) tuples - showed that considerable amounts of commonsense knowledge can be encoded in the parameters of large language models. However, as parallel studies show that LMs are poor hypothesizers of declarative commonsense relationships on their own, it remains unclear whether this knowledge is…

    Submitted 9 September, 2021; v1 submitted 1 January, 2021; originally announced January 2021.

    Comments: AKBC 2021

  30. arXiv:2012.15738  [pdf, other]

    cs.CL cs.AI

    Moral Stories: Situated Reasoning about Norms, Intents, Actions, and their Consequences

    Authors: Denis Emelin, Ronan Le Bras, Jena D. Hwang, Maxwell Forbes, Yejin Choi

    Abstract: In social settings, much of human behavior is governed by unspoken rules of conduct. For artificial systems to be fully integrated into social environments, adherence to such norms is a central prerequisite. We investigate whether contemporary NLG models can function as behavioral priors for systems deployed in social settings by generating action hypotheses that achieve predefined goals under mor…

    Submitted 31 December, 2020; originally announced December 2020.

    Comments: For the 'Moral Stories' dataset, see https://github.com/demelin/moral_stories

  31. arXiv:2010.12884  [pdf, other]

    cs.CL

    NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints

    Authors: Ximing Lu, Peter West, Rowan Zellers, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi

    Abstract: Conditional text generation often requires lexical constraints, i.e., which words should or shouldn't be included in the output text. While the dominant recipe for conditional text generation has been large-scale pretrained language models that are finetuned on the task-specific training data, such models do not learn to follow the underlying constraints reliably, even when supervised with large a…

    Submitted 20 April, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: NAACL 2021

  32. arXiv:2010.07526  [pdf, other]

    cs.CL cs.CV

    Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs

    Authors: Ana Marasović, Chandra Bhagavatula, Jae Sung Park, Ronan Le Bras, Noah A. Smith, Yejin Choi

    Abstract: Natural language rationales could provide intuitive, higher-level explanations that are easily understandable by humans, complementing the more broadly studied lower-level explanations based on gradients or attention weights. We present the first study focused on generating natural language rationales across several complex visual reasoning tasks: visual commonsense reasoning, visual-textual entai…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: Accepted to Findings of EMNLP

  33. arXiv:2010.05953  [pdf, other]

    cs.CL

    COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs

    Authors: Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sakaguchi, Antoine Bosselut, Yejin Choi

    Abstract: Recent years have brought about a renewed interest in commonsense representation and reasoning in the field of natural language understanding. The development of new commonsense knowledge graphs (CSKG) has been central to these advances as their diverse facts can be used and referenced by machine learning models for tackling new and challenging tasks. At the same time, there remain questions about…

    Submitted 16 December, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence (2021), 35(7), 6384-6392

  34. arXiv:2010.05906  [pdf, other]

    cs.CL cs.AI cs.LG

    Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning

    Authors: Lianhui Qin, Vered Shwartz, Peter West, Chandra Bhagavatula, Jena Hwang, Ronan Le Bras, Antoine Bosselut, Yejin Choi

    Abstract: Abductive and counterfactual reasoning, core abilities of everyday human cognition, require reasoning about what might have happened at time t, while conditioning on multiple contexts from the relative past and future. However, simultaneous incorporation of past and future contexts using generative language models (LMs) can be challenging, as they are trained either to condition only on the past c…

    Submitted 2 August, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  35. arXiv:2010.01486  [pdf, other]

    cs.CL cs.LG

    Paragraph-level Commonsense Transformers with Recurrent Memory

    Authors: Saadia Gabriel, Chandra Bhagavatula, Vered Shwartz, Ronan Le Bras, Maxwell Forbes, Yejin Choi

    Abstract: Human understanding of narrative texts requires making commonsense inferences beyond what is stated explicitly in the text. A recent model, COMET, can generate such implicit commonsense inferences along several dimensions such as pre- and post-conditions, motivations, and mental states of the participants. However, COMET was trained on commonsense inferences of short phrases, and is therefore disc…

    Submitted 2 February, 2021; v1 submitted 4 October, 2020; originally announced October 2020.

    Comments: AAAI 2021

  36. arXiv:2008.09094  [pdf, other]

    cs.CL

    Scruples: A Corpus of Community Ethical Judgments on 32,000 Real-Life Anecdotes

    Authors: Nicholas Lourie, Ronan Le Bras, Yejin Choi

    Abstract: As AI systems become an increasing part of people's everyday lives, it becomes ever more important that they understand people's ethical norms. Motivated by descriptive ethics, a field of study that focuses on people's descriptive judgments rather than theoretical prescriptions on morality, we investigate a novel, data-driven approach to machine ethics. We introduce Scruples, the first large-sca…

    Submitted 24 March, 2021; v1 submitted 20 August, 2020; originally announced August 2020.

    Comments: 18 pages, 14 tables, 18 figures. Accepted to AAAI 2021. For associated code and data, see https://github.com/allenai/scruples

  37. Generative Data Augmentation for Commonsense Reasoning

    Authors: Yiben Yang, Chaitanya Malaviya, Jared Fernandez, Swabha Swayamdipta, Ronan Le Bras, Ji-Ping Wang, Chandra Bhagavatula, Yejin Choi, Doug Downey

    Abstract: Recent advances in commonsense reasoning depend on large-scale human-annotated training data to achieve peak performance. However, manual curation of training examples is expensive and has been shown to introduce annotation artifacts that neural models can readily exploit and overfit on. We investigate G-DAUG^C, a novel generative data augmentation method that aims to achieve more accurate and rob…

    Submitted 16 November, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Findings of the Association for Computational Linguistics: EMNLP 2020

  38. arXiv:2004.05483  [pdf, other]

    cs.CL

    Unsupervised Commonsense Question Answering with Self-Talk

    Authors: Vered Shwartz, Peter West, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi

    Abstract: Natural language understanding involves reading between the lines with implicit background knowledge. Current systems either rely on pre-trained language models as the sole implicit source of world knowledge, or resort to external knowledge bases (KBs) to incorporate additional relevant knowledge. We propose an unsupervised framework based on self-talk as a novel alternative to multiple-choice com…

    Submitted 15 September, 2020; v1 submitted 11 April, 2020; originally announced April 2020.

    Comments: EMNLP 2020

  39. arXiv:2002.04108  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Adversarial Filters of Dataset Biases

    Authors: Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew E. Peters, Ashish Sabharwal, Yejin Choi

    Abstract: Large neural models have demonstrated human-level performance on language and vision benchmarks, while their performance degrades considerably on adversarial or out-of-distribution samples. This raises the question of whether these models have learned to solve a dataset rather than the underlying task by overfitting to spurious dataset biases. We investigate one recently proposed approach, AFLite,…

    Submitted 10 July, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: Accepted to ICML 2020

  40. arXiv:1911.11641  [pdf, other]

    cs.CL cs.AI cs.LG

    PIQA: Reasoning about Physical Commonsense in Natural Language

    Authors: Yonatan Bisk, Rowan Zellers, Ronan Le Bras, Jianfeng Gao, Yejin Choi

    Abstract: To apply eyeshadow without a brush, should I use a cotton swab or a toothpick? Questions requiring this kind of physical commonsense pose a challenge to today's natural language understanding systems. While recent pretrained models (such as BERT) have made progress on question answering over more abstract domains - such as news articles and encyclopedia entries, where text is plentiful - in more p…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: AAAI 2020

  41. arXiv:1911.03876  [pdf, other]

    cs.CL

    Dynamic Neuro-Symbolic Knowledge Graph Construction for Zero-shot Commonsense Question Answering

    Authors: Antoine Bosselut, Ronan Le Bras, Yejin Choi

    Abstract: Understanding narratives requires reasoning about implicit world knowledge related to the causes, effects, and states of situations described in text. At the core of this challenge is how to access contextually relevant knowledge on demand and reason over it. In this paper, we present initial studies toward zero-shot commonsense question answering by formulating the task as inference over dynami…

    Submitted 30 October, 2020; v1 submitted 10 November, 2019; originally announced November 2019.

  42. arXiv:1909.00277  [pdf, other]

    cs.CL cs.AI

    Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning

    Authors: Lifu Huang, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi

    Abstract: Understanding narratives requires reading between the lines, which in turn, requires interpreting the likely causes and effects of events, even when they are not mentioned explicitly. In this paper, we introduce Cosmos QA, a large-scale dataset of 35,600 problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. In stark contrast to most existing readin…

    Submitted 6 September, 2019; v1 submitted 31 August, 2019; originally announced September 2019.

    Comments: EMNLP'2019

  43. arXiv:1908.05739  [pdf, other]

    cs.CL

    Abductive Commonsense Reasoning

    Authors: Chandra Bhagavatula, Ronan Le Bras, Chaitanya Malaviya, Keisuke Sakaguchi, Ari Holtzman, Hannah Rashkin, Doug Downey, Scott Wen-tau Yih, Yejin Choi

    Abstract: Abductive reasoning is inference to the most plausible explanation. For example, if Jenny finds her house in a mess when she returns from work, and remembers that she left a window open, she can hypothesize that a thief broke into her house and caused the mess, as the most plausible explanation. While abduction has long been considered to be at the core of how people interpret and read between the…

    Submitted 13 February, 2020; v1 submitted 15 August, 2019; originally announced August 2019.

    Comments: ICLR 2020 Camera Ready

  44. arXiv:1907.10641  [pdf, other]

    cs.CL

    WinoGrande: An Adversarial Winograd Schema Challenge at Scale

    Authors: Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi

    Abstract: The Winograd Schema Challenge (WSC) (Levesque, Davis, and Morgenstern 2011), a benchmark for commonsense reasoning, is a set of 273 expert-crafted pronoun resolution problems originally designed to be unsolvable for statistical models that rely on selectional preferences or word associations. However, recent advances in neural language models have already reached around 90% accuracy on variants of…

    Submitted 21 November, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

  45. arXiv:1610.02005  [pdf]

    cond-mat.mtrl-sci

    Automated Phase Mapping with AgileFD and its Application to Light Absorber Discovery in the V-Mn-Nb Oxide System

    Authors: Santosh K. Suram, Yexiang Xue, Junwen Bai, Ronan Le Bras, Brendan Rappazzo, Richard Bernstein, Johan Bjorck, Lan Zhou, Robert B. van Dover, Carla P. Gomes, John M. Gregoire

    Abstract: Rapid construction of phase diagrams is a central tenet of combinatorial materials science with accelerated materials discovery efforts often hampered by challenges in interpreting combinatorial x-ray diffraction datasets, which we address by developing AgileFD, an artificial intelligence algorithm that enables rapid phase mapping from a combinatorial library of x-ray diffraction patterns. AgileFD…

    Submitted 6 October, 2016; originally announced October 2016.

  46. arXiv:1610.00689  [pdf, other]

    cs.AI

    Phase-Mapper: An AI Platform to Accelerate High Throughput Materials Discovery

    Authors: Yexiang Xue, Junwen Bai, Ronan Le Bras, Brendan Rappazzo, Richard Bernstein, Johan Bjorck, Liane Longpre, Santosh K. Suram, Robert B. van Dover, John Gregoire, Carla P. Gomes

    Abstract: High-Throughput materials discovery involves the rapid synthesis, measurement, and characterization of many different but structurally-related materials. A key problem in materials discovery, the phase map identification problem, involves the determination of the crystal phase diagram from the materials' composition and structural characterization data. We present Phase-Mapper, a novel AI platform…

    Submitted 7 October, 2016; v1 submitted 3 October, 2016; originally announced October 2016.

  47. arXiv:1508.04032  [pdf, other]

    cs.AI

    Variable Elimination in the Fourier Domain

    Authors: Yexiang Xue, Stefano Ermon, Ronan Le Bras, Carla P. Gomes, Bart Selman

    Abstract: The ability to represent complex high dimensional probability distributions in a compact form is one of the key insights in the field of graphical models. Factored representations are ubiquitous in machine learning and lead to major computational advantages. We explore a different type of compact representation based on discrete Fourier representations, complementing the classical approach based o…

    Submitted 21 June, 2016; v1 submitted 17 August, 2015; originally announced August 2015.

    Comments: Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016

  48. arXiv:1503.05495  [pdf, other]

    physics.data-an physics.ao-ph

    On evaluation of ShARP passive rainfall retrievals over snow-covered land surfaces and coastal zones

    Authors: Ardeshir M. Ebtehaj, Rafael L. Bras, Efi Foufoula-Georgiou

    Abstract: For precipitation retrievals over land, using satellite measurements in microwave bands, it is important to properly discriminate the weak rainfall signals from strong and highly variable background surface emission. Traditionally, land rainfall retrieval methods often rely on a weak signal of rainfall scattering on high-frequency channels (85 GHz) and make use of empirical thresholding and regres…

    Submitted 19 March, 2015; v1 submitted 18 March, 2015; originally announced March 2015.

    Comments: 18 pages, 11 figures, Figure 1 has been corrected in rev01

  49. arXiv:1411.7441  [pdf, other]

    cs.AI cs.LG stat.ML

    Pattern Decomposition with Complex Combinatorial Constraints: Application to Materials Discovery

    Authors: Stefano Ermon, Ronan Le Bras, Santosh K. Suram, John M. Gregoire, Carla Gomes, Bart Selman, Robert B. van Dover

    Abstract: Identifying important components or factors in large amounts of noisy data is a key problem in machine learning and data mining. Motivated by a pattern decomposition problem in materials discovery, aimed at discovering new materials for renewable energy, e.g. for fuel and solar cells, we introduce CombiFD, a framework for factor based pattern decomposition that allows the incorporation of a-priori…

    Submitted 26 November, 2014; originally announced November 2014.

  50. arXiv:1409.5068  [pdf, other]

    physics.data-an

    Compressive Earth Observatory: An Insight from AIRS/AMSU Retrievals

    Authors: Ardeshir Mohammad Ebtehaj, Efi Foufoula-Georgiou, Gilad Lerman, Rafael Luis Bras

    Abstract: We demonstrate that the global fields of temperature, humidity and geopotential heights admit a nearly sparse representation in the wavelet domain, offering a viable path forward to explore new paradigms of sparsity-promoting data assimilation and compressive recovery of land surface-atmospheric states from space. We illustrate this idea using retrieval products of the Atmospheric Infrared Sounder…

    Submitted 30 December, 2014; v1 submitted 17 September, 2014; originally announced September 2014.

    Comments: 12 pages, 8 figures, 1 table

    Journal ref: Geophys. Res. Lett. (2015), 42, 362--369