Showing 1–50 of 91 results for author: Cohen, W

  1. arXiv:2406.14596  [pdf, other]

    cs.CV cs.AI cs.LG

    ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights

    Authors: Gabriel Sarch, Lawrence Jang, Michael J. Tarr, William W. Cohen, Kenneth Marino, Katerina Fragkiadaki

    Abstract: Large-scale generative language and vision-language models (LLMs and VLMs) excel in few-shot in-context learning for decision making and instruction following. However, they require high-quality exemplar demonstrations to be included in their context window. In this work, we ask: Can LLMs and VLMs generate their own prompt examples from generic, sub-optimal demonstrations? We propose In-Context Ab…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Project website: http://ical-learning.github.io/

  2. arXiv:2406.04291  [pdf, other]

    cs.LG stat.ML

    Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation

    Authors: Adam Fisch, Joshua Maynez, R. Alex Hofer, Bhuwan Dhingra, Amir Globerson, William W. Cohen

    Abstract: Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. PPI achieves this by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate -- but potentially biased -- automatic system, in a way that results in tighter confidence intervals for certain parameters of interest (e.g., the mean…

    Submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2405.06034  [pdf, other]

    cs.LG

    Bayesian Prediction-Powered Inference

    Authors: R. Alex Hofer, Joshua Maynez, Bhuwan Dhingra, Adam Fisch, Amir Globerson, William W. Cohen

    Abstract: Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data. Specifically, PPI methods provide tighter confidence intervals by combining small amounts of human-labeled data with larger amounts of data labeled by a reasonably accurate, but potentially biased, automatic system. We propose a framework for PPI based on Bayesian inference that…

    Submitted 9 May, 2024; originally announced May 2024.

  4. arXiv:2401.01952  [pdf, other]

    cs.CV cs.AI cs.CL

    Instruct-Imagen: Image Generation with Multi-modal Instruction

    Authors: Hexiang Hu, Kelvin C. K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, Kihyuk Sohn, Yang Zhao, Xue Ben, Boqing Gong, William Cohen, Ming-Wei Chang, Xuhui Jia

    Abstract: This paper presents instruct-imagen, a model that tackles heterogeneous image generation tasks and generalizes across unseen tasks. We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision. It uses natural language to amalgamate disparate modalities (e.g., text, edge, style, subject, etc.), such that abundant gener…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: 20 pages, 18 figures

  5. arXiv:2311.10083  [pdf, ps, other]

    cs.CL

    Characterizing Tradeoffs in Language Model Decoding with Informational Interpretations

    Authors: Chung-Ching Chang, William W. Cohen, Yun-Hsuan Sung

    Abstract: We propose a theoretical framework for formulating language model decoder algorithms with dynamic programming and information theory. With dynamic programming, we lift the design of decoder algorithms from the logit space to the action-state value function space, and show that the decoding algorithms are consequences of optimizing the action-state value functions. Each component in the action-stat…

    Submitted 16 November, 2023; originally announced November 2023.

  6. arXiv:2311.06697  [pdf, other]

    cs.CL

    Trusted Source Alignment in Large Language Models

    Authors: Vasilisa Bashlovkina, Zhaobin Kuang, Riley Matthews, Edward Clifford, Yennie Jun, William W. Cohen, Simon Baumgartner

    Abstract: Large language models (LLMs) are trained on web-scale corpora that inevitably include contradictory factual information from sources of varying reliability. In this paper, we propose measuring an LLM property called trusted source alignment (TSA): the model's propensity to align with content produced by trusted publishers in the face of uncertainty or controversy. We present FactCheckQA, a TSA eva…

    Submitted 11 November, 2023; originally announced November 2023.

  7. arXiv:2311.04886  [pdf, other]

    cs.CL cs.AI cs.LG

    SEMQA: Semi-Extractive Multi-Source Question Answering

    Authors: Tal Schuster, Adam D. Lelkes, Haitian Sun, Jai Gupta, Jonathan Berant, William W. Cohen, Donald Metzler

    Abstract: Recently proposed long-form question answering (QA) systems, supported by large language models (LLMs), have shown promising capabilities. Yet, attributing and verifying their generated abstractive answers can be difficult, and automatically evaluating their accuracy remains an ongoing challenge. In this work, we introduce a new QA task for answering multi-answer questions by summarizing multipl…

    Submitted 30 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: NAACL 2024

  8. arXiv:2308.14903  [pdf, other]

    cs.CL

    MEMORY-VQ: Compression for Tractable Internet-Scale Memory

    Authors: Yury Zemlyanskiy, Michiel de Jong, Luke Vilnis, Santiago Ontañón, William W. Cohen, Sumit Sanghai, Joshua Ainslie

    Abstract: Retrieval augmentation is a powerful but expensive method to make language models more knowledgeable about the world. Memory-based methods like LUMEN pre-compute token representations for retrieved passages to drastically speed up inference. However, memory also leads to much greater storage requirements from storing pre-computed representations. We propose MEMORY-VQ, a new method to reduce stor…

    Submitted 28 August, 2023; originally announced August 2023.

  9. arXiv:2308.08661  [pdf, other]

    cs.CL cs.AI

    Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions

    Authors: Haitian Sun, William W. Cohen, Ruslan Salakhutdinov

    Abstract: Many open-domain questions are under-specified and thus have multiple possible answers, each of which is correct under a different interpretation of the question. Answering such ambiguous questions is challenging, as it requires retrieving and then reasoning about diverse information from multiple passages. We present a new state-of-the-art for answering ambiguous questions that exploits a databas…

    Submitted 16 August, 2023; originally announced August 2023.

  10. arXiv:2306.10231  [pdf, other]

    cs.CL cs.AI cs.LG

    GLIMMER: generalized late-interaction memory reranker

    Authors: Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Sumit Sanghai, William W. Cohen, Joshua Ainslie

    Abstract: Memory-augmentation is a powerful approach for efficiently incorporating external information into language models, but leads to reduced performance relative to retrieving text. Recent work introduced LUMEN, a memory-retrieval hybrid that partially pre-computes memory and updates memory representations on the fly with a smaller live encoder. We propose GLIMMER, which improves on this approach th…

    Submitted 16 June, 2023; originally announced June 2023.

  11. arXiv:2304.00186  [pdf, other]

    cs.CV cs.AI

    Subject-driven Text-to-Image Generation via Apprenticeship Learning

    Authors: Wenhu Chen, Hexiang Hu, Yandong Li, Nataniel Ruiz, Xuhui Jia, Ming-Wei Chang, William W. Cohen

    Abstract: Recent text-to-image generation models like DreamBooth have made remarkable progress in generating highly customized images of a target subject, by fine-tuning an ``expert model'' for a given subject from a few examples. However, this process is expensive, since a new expert model must be learned for each subject. In this paper, we present SuTI, a Subject-driven Text-to-Image generator that replac…

    Submitted 2 October, 2023; v1 submitted 31 March, 2023; originally announced April 2023.

    Comments: Accepted at NeurIPS 2023. Model service to appear as Google Vertex AI - Instant Tuning (https://cloud.google.com/vertex-ai/docs/generative-ai/image/fine-tune-model). Link to the demo video: https://www.youtube.com/watch?v=Q2xQ91D_dhM&t=2071s&ab_channel=GoogleCloud

  12. arXiv:2301.10448  [pdf, other]

    cs.CL cs.AI cs.LG

    Pre-computed memory or on-the-fly encoding? A hybrid approach to retrieval augmentation makes the most of your compute

    Authors: Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Joshua Ainslie, Sumit Sanghai, Fei Sha, William Cohen

    Abstract: Retrieval-augmented language models such as Fusion-in-Decoder are powerful, setting the state of the art on a variety of knowledge-intensive tasks. However, they are also expensive, due to the need to encode a large number of retrieved passages. Some work avoids this cost by pre-encoding a text corpus into a memory and retrieving dense representations directly. However, pre-encoding memory incurs…

    Submitted 2 June, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: ICML 2023

  13. arXiv:2212.10726  [pdf, other]

    cs.CL cs.LG

    Beyond Contrastive Learning: A Variational Generative Model for Multilingual Retrieval

    Authors: John Wieting, Jonathan H. Clark, William W. Cohen, Graham Neubig, Taylor Berg-Kirkpatrick

    Abstract: Contrastive learning has been successfully used for retrieval of semantically aligned sentences, but it often requires large batch sizes or careful engineering to work well. In this paper, we instead propose a generative model for learning multilingual text embeddings which can be used to retrieve or score sentence pairs. Our model operates on parallel data in $N$ languages and, through an approxi…

    Submitted 4 June, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Published as a long paper at ACL 2023

  14. arXiv:2212.08153  [pdf, other]

    cs.CL cs.AI cs.LG

    FiDO: Fusion-in-Decoder optimized for stronger performance and faster inference

    Authors: Michiel de Jong, Yury Zemlyanskiy, Joshua Ainslie, Nicholas FitzGerald, Sumit Sanghai, Fei Sha, William Cohen

    Abstract: Fusion-in-Decoder (FiD) is a powerful retrieval-augmented language model that sets the state-of-the-art on many knowledge-intensive NLP tasks. However, the architecture used for FiD was chosen by making minimal modifications to a standard T5 model, which our analysis shows to be highly suboptimal for a retrieval-augmented model. In particular, FiD allocates the bulk of FLOPs to the encoder, while…

    Submitted 2 June, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: ACL Findings 2023

  15. arXiv:2212.08037  [pdf, other]

    cs.CL

    Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models

    Authors: Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster

    Abstract: Large language models (LLMs) have shown impressive results while requiring little or no direct supervision. Further, there is mounting evidence that LLMs may have potential in information-seeking scenarios. We believe the ability of an LLM to attribute the text that it generates is likely to be crucial in this setting. We formulate and study Attributed QA as a key first step in the development of…

    Submitted 10 February, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

  16. arXiv:2211.12588  [pdf, other]

    cs.CL cs.AI

    Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks

    Authors: Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen

    Abstract: Recently, there has been significant progress in teaching language models to perform step-by-step reasoning to solve complex numerical reasoning tasks. Chain-of-thoughts prompting (CoT) is by far the state-of-the-art method for these tasks. CoT uses language models to perform both reasoning and computation in the multi-step `thought' process. To disentangle computation from reasoning, we propose `Prog…

    Submitted 22 October, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: Published at TMLR 2023

  17. arXiv:2210.12378  [pdf, other]

    cs.CL

    Correcting Diverse Factual Errors in Abstractive Summarization via Post-Editing and Language Model Infilling

    Authors: Vidhisha Balachandran, Hannaneh Hajishirzi, William W. Cohen, Yulia Tsvetkov

    Abstract: Abstractive summarization models often generate inconsistent summaries containing factual errors or hallucinated content. Recent works focus on correcting factual errors in generated summaries via post-editing. Such correction models are trained using adversarial non-factual summaries constructed using heuristic rules for injecting errors. However, generating non-factual summaries using heuristics…

    Submitted 31 October, 2022; v1 submitted 22 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  18. arXiv:2210.02928  [pdf, other]

    cs.CL cs.AI cs.CV

    MuRAG: Multimodal Retrieval-Augmented Generator for Open Question Answering over Images and Text

    Authors: Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen

    Abstract: While language models store a massive amount of world knowledge implicitly in their parameters, even very large models often fail to encode information about rare entities and events, while incurring huge computational costs. Recently, retrieval-augmented models, such as REALM, RAG, and RETRO, have incorporated world knowledge into language generation by leveraging an external non-parametric index…

    Submitted 20 October, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022 main conference

  19. arXiv:2209.14491  [pdf, other]

    cs.CV cs.AI cs.LG

    Re-Imagen: Retrieval-Augmented Text-to-Image Generator

    Authors: Wenhu Chen, Hexiang Hu, Chitwan Saharia, William W. Cohen

    Abstract: Research on text-to-image generation has witnessed significant progress in generating diverse and photo-realistic images, driven by diffusion and auto-regressive models trained on large-scale image-text data. Though state-of-the-art models can generate high-quality images of common entities, they often have difficulty generating images of uncommon entities, such as `Chortai (dog)' or `Picarones (f…

    Submitted 21 November, 2022; v1 submitted 28 September, 2022; originally announced September 2022.

    Comments: 9 pages

  20. arXiv:2209.12153  [pdf, other]

    cs.CL cs.AI

    WinoDict: Probing language models for in-context word acquisition

    Authors: Julian Martin Eisenschlos, Jeremy R. Cole, Fangyu Liu, William W. Cohen

    Abstract: We introduce a new in-context learning paradigm to measure Large Language Models' (LLMs) ability to learn novel words during inference. In particular, we rewrite Winograd-style co-reference resolution problems by replacing the key concept word with a synthetic but plausible word that the model must understand to complete the task. Solving this task requires the model to make use of the dictionary…

    Submitted 25 September, 2022; originally announced September 2022.

  21. arXiv:2207.00630  [pdf, other]

    cs.AI

    QA Is the New KR: Question-Answer Pairs as Knowledge Bases

    Authors: Wenhu Chen, William W. Cohen, Michiel De Jong, Nitish Gupta, Alessandro Presta, Pat Verga, John Wieting

    Abstract: In this position paper, we propose a new approach to generating a type of knowledge base (KB) from text, based on question generation and entity linking. We argue that the proposed type of KB has many of the key advantages of a traditional symbolic KB: in particular, it consists of small modular components, which can be combined compositionally to answer complex queries, including relational queri…

    Submitted 1 July, 2022; originally announced July 2022.

  22. arXiv:2205.12898  [pdf, other]

    cs.CL cs.AI

    Reasoning over Logically Interacted Conditions for Question Answering

    Authors: Haitian Sun, William W. Cohen, Ruslan Salakhutdinov

    Abstract: Some questions have multiple answers that are not equally correct, i.e. answers are different under different conditions. Conditions are used to distinguish answers as well as to provide additional information to support them. In this paper, we study a more challenging task where answers are constrained by a list of conditions that logically interact, which requires performing logical reasoning ov…

    Submitted 25 May, 2022; originally announced May 2022.

  23. arXiv:2204.04581  [pdf, other]

    cs.CL cs.AI cs.LG

    Augmenting Pre-trained Language Models with QA-Memory for Open-Domain Question Answering

    Authors: Wenhu Chen, Pat Verga, Michiel de Jong, John Wieting, William Cohen

    Abstract: Retrieval augmented language models have recently become the standard for knowledge intensive tasks. Rather than relying purely on latent semantics within the parameters of large neural models, these methods enlist a semi-parametric memory to encode an index of knowledge for the model to retrieve over. Most prior work has employed text passages as the unit of knowledge, which has high coverage at…

    Submitted 23 January, 2023; v1 submitted 9 April, 2022; originally announced April 2022.

    Comments: Accepted by EACL 2023

  24. arXiv:2202.06991  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Transformer Memory as a Differentiable Search Index

    Authors: Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Gupta, Tal Schuster, William W. Cohen, Donald Metzler

    Abstract: In this paper, we demonstrate that information retrieval can be accomplished with a single Transformer, in which all information about the corpus is encoded in the parameters of the model. To this end, we introduce the Differentiable Search Index (DSI), a new paradigm that learns a text-to-text model that maps string queries directly to relevant docids; in other words, a DSI model answers queries…

    Submitted 21 October, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: NeurIPS 2022

  25. arXiv:2112.09669  [pdf, other]

    cs.CL

    Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations

    Authors: Siddhant Arora, Danish Pruthi, Norman Sadeh, William W. Cohen, Zachary C. Lipton, Graham Neubig

    Abstract: In attempts to "explain" predictions of machine learning models, researchers have proposed hundreds of techniques for attributing predictions to features that are deemed important. While these attributions are often claimed to hold the potential to improve human "understanding" of the models, surprisingly little work explicitly evaluates progress towards this aspiration. In this paper, we conduct…

    Submitted 21 August, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

    Comments: AAAI 2022

  26. arXiv:2110.06884  [pdf, other]

    cs.CL cs.AI

    ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers

    Authors: Haitian Sun, William W. Cohen, Ruslan Salakhutdinov

    Abstract: We describe a Question Answering (QA) dataset that contains complex questions with conditional answers, i.e. the answers are only applicable when certain conditions apply. We call this dataset ConditionalQA. In addition to conditional answers, the dataset also features: (1) long context documents with information that is related in logically complex ways; (2) multi-hop questions that require compo…

    Submitted 13 October, 2021; originally announced October 2021.

  27. arXiv:2110.06176  [pdf, other]

    cs.CL cs.AI cs.LG

    Mention Memory: incorporating textual knowledge into Transformers through entity mention attention

    Authors: Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Fei Sha, William Cohen

    Abstract: Natural language understanding tasks such as open-domain question answering often require retrieving and assimilating factual information from multiple sources. We propose to address this problem by integrating a semi-parametric representation of a large text corpus into a Transformer model as a source of factual knowledge. Specifically, our method represents knowledge with `mention memory', a tab…

    Submitted 19 April, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

  28. arXiv:2109.14364  [pdf, other]

    cs.CL

    Multilingual Fact Linking

    Authors: Keshav Kolluru, Martin Rezk, Pat Verga, William W. Cohen, Partha Talukdar

    Abstract: Knowledge-intensive NLP tasks can benefit from linking natural language text with facts from a Knowledge Graph (KG). Although facts themselves are language-agnostic, the fact labels (i.e., language-specific representation of the fact) in the KG are often present only in a few languages. This makes it challenging to link KG facts to sentences in languages other than the limited set of languages. To…

    Submitted 30 September, 2021; v1 submitted 29 September, 2021; originally announced September 2021.

    Comments: AKBC 2021

  29. arXiv:2109.04312  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    MATE: Multi-view Attention for Table Transformer Efficiency

    Authors: Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen

    Abstract: This work presents a sparse-attention Transformer architecture for modeling documents that contain large tables. Tables are ubiquitous on the web, and are rich in information. However, more than 20% of relational tables on the web have 20 or more rows (Cafarella et al., 2008), and these large tables present a challenge for current Transformer models, which are typically limited to 512 tokens. Here…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP 2021

  30. Time-Aware Language Models as Temporal Knowledge Bases

    Authors: Bhuwan Dhingra, Jeremy R. Cole, Julian Martin Eisenschlos, Daniel Gillick, Jacob Eisenstein, William W. Cohen

    Abstract: Many facts come with an expiration date, from the name of the President to the basketball team Lebron James plays for. But language models (LMs) are trained on snapshots of data collected at a specific moment in time, and this can limit their utility, especially in the closed-book setting where the pretraining corpus must contain the facts the model should memorize. We introduce a diagnostic datas…

    Submitted 23 April, 2022; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: Version accepted to TACL

    Journal ref: Transactions of the Association for Computational Linguistics 2022; 10 257-273

  31. arXiv:2106.00200  [pdf, other]

    cs.CL cs.AI

    Iterative Hierarchical Attention for Answering Complex Questions over Long Documents

    Authors: Haitian Sun, William W. Cohen, Ruslan Salakhutdinov

    Abstract: We propose a new model, DocHopper, that iteratively attends to different parts of long, hierarchically structured documents to answer complex questions. Similar to multi-hop question-answering (QA) systems, at each step, DocHopper uses a query $q$ to attend to information from a document, combines this ``retrieved'' information with $q$ to produce the next query. However, in contrast to most previ…

    Submitted 21 October, 2021; v1 submitted 31 May, 2021; originally announced June 2021.

  32. arXiv:2104.01940  [pdf, ps, other]

    cs.CL

    What's the best place for an AI conference, Vancouver or ______: Why completing comparative questions is difficult

    Authors: Avishai Zagoury, Einat Minkov, Idan Szpektor, William W. Cohen

    Abstract: Although large neural language models (LMs) like BERT can be finetuned to yield state-of-the-art results on many NLP tasks, it is often unclear what these models actually learn. Here we study using such LMs to fill in entities in human-authored comparative questions, like ``Which country is older, India or ______?'' -- i.e., we study the ability of neural LMs to ask (not answer) reasonable questio…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: AAAI 2021; preprint

  33. arXiv:2102.07043  [pdf, other]

    cs.AI cs.CL cs.LG

    Reasoning Over Virtual Knowledge Bases With Open Predicate Relations

    Authors: Haitian Sun, Pat Verga, Bhuwan Dhingra, Ruslan Salakhutdinov, William W. Cohen

    Abstract: We present the Open Predicate Query Language (OPQL), a method for constructing a virtual KB (VKB) trained entirely from text. Large Knowledge Bases (KBs) are indispensable for a wide range of industry applications such as question answering and recommendation. Typically, KBs encode world knowledge in a structured, readily accessible form derived from laborious human annotation efforts. Unfortunate…

    Submitted 14 June, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

    Comments: Accepted at the 38th International Conference on Machine Learning, PMLR 139, 2021

  34. arXiv:2012.00893  [pdf, other]

    cs.CL cs.LG

    Evaluating Explanations: How much do explanations from the teacher aid students?

    Authors: Danish Pruthi, Rachit Bansal, Bhuwan Dhingra, Livio Baldini Soares, Michael Collins, Zachary C. Lipton, Graham Neubig, William W. Cohen

    Abstract: While many methods purport to explain predictions by highlighting salient features, what aims these explanations serve and how they ought to be evaluated often go unstated. In this work, we introduce a framework to quantify the value of explanations via the accuracy gains that they confer on a student model trained to simulate a teacher model. Crucially, the explanations are available to the stude…

    Submitted 16 December, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: TACL 2021 (pre-MIT Press publication version)

  35. arXiv:2010.14439  [pdf, other]

    cs.CL cs.AI cs.LG

    Differentiable Open-Ended Commonsense Reasoning

    Authors: Bill Yuchen Lin, Haitian Sun, Bhuwan Dhingra, Manzil Zaheer, Xiang Ren, William W. Cohen

    Abstract: Current commonsense reasoning research focuses on developing models that use commonsense knowledge to answer multiple-choice questions. However, systems designed to answer multiple-choice questions may not be useful in applications that do not provide a small list of candidate answers to choose from. As a step towards making commonsense reasoning research more realistic, we propose to study open-e…

    Submitted 6 June, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: Accepted to NAACL 2021. Project website: https://open-csr.github.io

  36. arXiv:2010.10439  [pdf, other]

    cs.CL cs.AI

    Open Question Answering over Tables and Text

    Authors: Wenhu Chen, Ming-Wei Chang, Eva Schlinger, William Wang, William W. Cohen

    Abstract: In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question. Most open QA systems have considered only retrieving information from unstructured text. Here we consider for the first time open QA over both tabular and textual data and present a new large-scale dataset Open Table-and-Text Question Answerin…

    Submitted 10 February, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted to ICLR 2021. Main paper has 9 pages

  37. arXiv:2007.00849  [pdf, other]

    cs.CL cs.AI cs.LG

    Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge

    Authors: Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen

    Abstract: Massive language models are the core of modern NLP modeling and have been shown to encode impressive amounts of commonsense and factual information. However, that knowledge exists only within the latent parameters of the model, inaccessible to inspection and interpretation, and even worse, factual information memorized from the training corpora is likely to become stale as the world changes. Knowl…

    Submitted 1 July, 2020; originally announced July 2020.

  38. arXiv:2004.12554  [pdf, other]

    cs.LG cs.AI cs.CE stat.ML

    Forecasting in Non-stationary Environments with Fuzzy Time Series

    Authors: Petrônio Cândido de Lima e Silva, Carlos Alberto Severiano Junior, Marcos Antonio Alves, Rodrigo Silva, Miri Weiss Cohen, Frederico Gadelha Guimarães

    Abstract: In this paper we introduce a Non-Stationary Fuzzy Time Series (NSFTS) method with time varying parameters adapted from the distribution of the data. In this approach, we employ Non-Stationary Fuzzy Sets, in which perturbation functions are used to adapt the membership function parameters in the knowledge base in response to statistical changes in the time series. The proposed method is capable of…

    Submitted 26 April, 2020; originally announced April 2020.

    Comments: 21 pages, 7 figures, submitted to Applied Soft Computing

  39. arXiv:2004.03658  [pdf, other]

    cs.LG cs.CL stat.ML

    Faithful Embeddings for Knowledge Base Queries

    Authors: Haitian Sun, Andrew O. Arnold, Tania Bedrax-Weiss, Fernando Pereira, William W. Cohen

    Abstract: The deductive closure of an ideal knowledge base (KB) contains exactly the logical queries that the KB can answer. However, in practice KBs are both incomplete and over-specified, failing to answer some queries that have real-world answers. \emph{Query embedding} (QE) techniques have been recently proposed where KB entities and KB queries are represented jointly in an embedding space, supporting r…

    Submitted 28 January, 2021; v1 submitted 7 April, 2020; originally announced April 2020.

    Comments: Published at 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada

  40. arXiv:2002.10640  [pdf, other]

    cs.CL cs.LG

    Differentiable Reasoning over a Virtual Knowledge Base

    Authors: Bhuwan Dhingra, Manzil Zaheer, Vidhisha Balachandran, Graham Neubig, Ruslan Salakhutdinov, William W. Cohen

    Abstract: We consider the task of answering complex multi-hop questions using a corpus as a virtual knowledge base (KB). In particular, we describe a neural module, DrKIT, that traverses textual data like a KB, softly following paths of relations between mentions of entities in the corpus. At each step the module uses a combination of sparse-matrix TFIDF indices and a maximum inner product search (MIPS) on…

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: ICLR 2020

  41. arXiv:2002.06115  [pdf, other]

    cs.CL cs.LG stat.ML

    Scalable Neural Methods for Reasoning With a Symbolic Knowledge Base

    Authors: William W. Cohen, Haitian Sun, R. Alex Hofer, Matthew Siegler

    Abstract: We describe a novel way of representing a symbolic knowledge base (KB) called a sparse-matrix reified KB. This representation enables neural modules that are fully differentiable, faithful to the original semantics of the KB, expressive enough to model multi-hop inferences, and scalable enough to use with realistically large KBs. The sparse-matrix reified KB can be distributed across multiple GPUs…

    Submitted 14 February, 2020; originally announced February 2020.

    Comments: Also published in ICLR 2020 https://openreview.net/forum?id=BJlguT4YPr&noteId=BJlguT4YPr

  42. arXiv:1912.06074  [pdf, other

    cs.LG cs.AI stat.ML

    Game Design for Eliciting Distinguishable Behavior

    Authors: Fan Yang, Liu Leqi, Yifan Wu, Zachary C. Lipton, Pradeep Ravikumar, William W. Cohen, Tom Mitchell

    Abstract: The ability to infer latent psychological traits from human behavior is key to developing personalized human-interacting machine learning systems. Approaches to infer such traits range from surveys to manually-constructed experiments and games. However, these traditional games are limited because they are typically designed based on heuristics. In this paper, we formulate the task of designing…

    Submitted 12 December, 2019; originally announced December 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

  43. arXiv:1911.06111  [pdf, other

    cs.CL cs.IR cs.LG stat.ML

    Instance-based Transfer Learning for Multilingual Deep Retrieval

    Authors: Andrew O. Arnold, William W. Cohen

    Abstract: We focus on the problem of search in the multilingual setting. Examining the problems of next-sentence prediction and inverse cloze, we show that at large scale, instance-based transfer learning is surprisingly effective in the multilingual setting, leading to positive transfer on all of the 35 target languages and two tasks tested. We analyze this improvement and argue that the most natural expla…

    Submitted 15 April, 2021; v1 submitted 8 November, 2019; originally announced November 2019.

    Journal ref: The Web Conference Workshop on Multilingual Search, 2021

  44. arXiv:1909.06146  [pdf, other

    cs.CL cs.LG q-bio.QM

    PubMedQA: A Dataset for Biomedical Research Question Answering

    Authors: Qiao Jin, Bhuwan Dhingra, Zhengping Liu, William W. Cohen, Xinghua Lu

    Abstract: We introduce PubMedQA, a novel biomedical question answering (QA) dataset collected from PubMed abstracts. The task of PubMedQA is to answer research questions with yes/no/maybe (e.g.: Do preoperative statins reduce atrial fibrillation after coronary artery bypass grafting?) using the corresponding abstracts. PubMedQA has 1k expert-annotated, 61.2k unlabeled and 211.3k artificially generated QA in…

    Submitted 13 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  45. arXiv:1906.01081  [pdf, other

    cs.CL

    Handling Divergent Reference Texts when Evaluating Table-to-Text Generation

    Authors: Bhuwan Dhingra, Manaal Faruqui, Ankur Parikh, Ming-Wei Chang, Dipanjan Das, William W. Cohen

    Abstract: Automatically constructed datasets for generating text from semi-structured data (tables), such as WikiBio, often contain reference texts that diverge from the information in the corresponding semi-structured data. We show that metrics which rely solely on the reference texts, such as BLEU and ROUGE, show poor correlation with human judgments when those references diverge. We propose a new metric,…

    Submitted 3 June, 2019; originally announced June 2019.

    Comments: To appear at ACL 2019

  46. arXiv:1905.10417  [pdf, other

    cs.LG cs.AI cs.CL stat.ML

    Differentiable Representations For Multihop Inference Rules

    Authors: William W. Cohen, Haitian Sun, R. Alex Hofer, Matthew Siegler

    Abstract: We present efficient differentiable implementations of second-order multi-hop reasoning using a large symbolic knowledge base (KB). We introduce a new operation which can be used to compositionally construct second-order multi-hop templates in a neural model, and evaluate a number of alternative implementations, with different time and memory trade-offs. These techniques scale to KBs with millions…

    Submitted 24 May, 2019; originally announced May 2019.

  47. arXiv:1905.06209  [pdf, other

    cs.LG cs.AI cs.DB

    Neural Query Language: A Knowledge Base Query Language for Tensorflow

    Authors: William W. Cohen, Matthew Siegler, Alex Hofer

    Abstract: Large knowledge bases (KBs) are useful for many AI tasks, but are difficult to integrate into modern gradient-based learning systems. Here we describe a framework for accessing a soft symbolic database using only differentiable operators. For example, this framework makes it easy to write neural models that adjust confidences associated with facts in a soft KB; incorporate prior knowled…

    Submitted 15 May, 2019; originally announced May 2019.

  48. arXiv:1904.09537  [pdf, other

    cs.CL cs.LG

    PullNet: Open Domain Question Answering with Iterative Retrieval on Knowledge Bases and Text

    Authors: Haitian Sun, Tania Bedrax-Weiss, William W. Cohen

    Abstract: We consider open-domain question answering (QA) where answers are drawn from either a corpus, a knowledge base (KB), or a combination of both of these. We focus on a setting in which a corpus is supplemented with a large but incomplete KB, and on questions that require non-trivial (e.g., "multi-hop") reasoning. We describe PullNet, an integrated framework for (1) learning what to retrieve (from t…

    Submitted 20 April, 2019; originally announced April 2019.

  49. arXiv:1904.02181  [pdf, other

    cs.CL

    Probing Biomedical Embeddings from Language Models

    Authors: Qiao Jin, Bhuwan Dhingra, William W. Cohen, Xinghua Lu

    Abstract: Contextualized word embeddings derived from pre-trained language models (LMs) show significant improvements on downstream NLP tasks. Pre-training on domain-specific corpora, such as biomedical articles, further improves their performance. In this paper, we conduct probing experiments to determine what additional information is carried intrinsically by the in-domain trained contextualized embedding…

    Submitted 3 April, 2019; originally announced April 2019.

    Comments: NAACL-HLT 2019 Workshop on Evaluating Vector Space Representations for NLP (RepEval)

  50. arXiv:1901.04936  [pdf, other

    cs.CL cs.AI

    Incremental Reading for Question Answering

    Authors: Samira Abnar, Tania Bedrax-Weiss, Tom Kwiatkowski, William W. Cohen

    Abstract: Any system which performs goal-directed continual learning must not only learn incrementally but process and absorb information incrementally. Such a system also has to understand when its goals have been achieved. In this paper, we consider these issues in the context of question answering. Current state-of-the-art question answering models reason over an entire passage, not incrementally. As we…

    Submitted 15 January, 2019; originally announced January 2019.