Showing 1–50 of 95 results for author: Daumé, H

  1. arXiv:2406.12232  [pdf, other]

    cs.AI cs.CL

    "You Gotta be a Doctor, Lin": An Investigation of Name-Based Bias of Large Language Models in Employment Recommendations

    Authors: Huy Nghiem, John Prindle, Jieyu Zhao, Hal Daumé III

    Abstract: Social science research has shown that candidates with names indicative of certain races or genders often face discrimination in employment practices. Similarly, Large Language Models (LLMs) have demonstrated racial and gender biases in various applications. In this study, we utilize GPT-3.5-Turbo and Llama 3-70B-Instruct to simulate hiring decisions and salary recommendations for candidates with… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: preprint, 18 pages

  2. arXiv:2403.11456  [pdf, other]

    cs.CL cs.AI cs.SI

    HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models

    Authors: Huy Nghiem, Hal Daumé III

    Abstract: The widespread use of social media necessitates reliable and efficient detection of offensive content to mitigate harmful effects. Although sophisticated models perform well on individual datasets, they often fail to generalize due to varying definitions and labeling of "offensive content." In this paper, we introduce HateCOT, an English dataset with over 52,000 samples from diverse sources, featu… ▽ More

    Submitted 16 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Preprint Version 3

  3. arXiv:2403.01015  [pdf, other]

    cs.CY cs.DL

    A Randomized Controlled Trial on Anonymizing Reviewers to Each Other in Peer Review Discussions

    Authors: Charvi Rastogi, Xiangchen Song, Zhijing Jin, Ivan Stelmakh, Hal Daumé III, Kun Zhang, Nihar B. Shah

    Abstract: Peer review often involves reviewers submitting their independent reviews, followed by a discussion among reviewers of each paper. A question among policymakers is whether the reviewers of a paper should be anonymous to each other during the discussion. We shed light on this by conducting a randomized controlled trial at the UAI 2022 conference. We randomly split the reviewers and papers into two… ▽ More

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 18 pages, 4 figures, 3 tables

  4. arXiv:2402.16973  [pdf, other]

    cs.AI cs.CL cs.HC

    Successfully Guiding Humans with Imperfect Instructions by Highlighting Potential Errors and Suggesting Corrections

    Authors: Lingjun Zhao, Khanh Nguyen, Hal Daumé III

    Abstract: This paper addresses the challenge of leveraging imperfect language models to guide human decision-making in the context of a grounded navigation task. We show that an imperfect instruction generation model can be complemented with an effective communication mechanism to become more successful at guiding humans. The communication mechanism we build comprises models that can detect potential halluc… ▽ More

    Submitted 26 February, 2024; originally announced February 2024.

  5. arXiv:2402.10450  [pdf, other]

    cs.LG

    PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control

    Authors: Ruijie Zheng, Ching-An Cheng, Hal Daumé III, Furong Huang, Andrey Kolobov

    Abstract: Temporal action abstractions, along with belief state representations, are a powerful knowledge sharing mechanism for sequential decision making. In this work, we propose a novel view that treats inducing temporal action abstractions as a sequence compression problem. To do so, we bring a subtle but critical component of LLM training pipelines -- input tokenization via byte pair encoding (BPE) --… ▽ More

    Submitted 6 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: Accepted at the Forty-first International Conference on Machine Learning (ICML 2024)

  6. arXiv:2402.06187  [pdf, other]

    cs.LG cs.AI cs.RO

    Premier-TACO is a Few-Shot Policy Learner: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss

    Authors: Ruijie Zheng, Yongyuan Liang, Xiyao Wang, Shuang Ma, Hal Daumé III, Huazhe Xu, John Langford, Praveen Palanisamy, Kalyan Shankar Basu, Furong Huang

    Abstract: We present Premier-TACO, a multitask feature representation learning approach designed to improve few-shot policy learning efficiency in sequential decision-making tasks. Premier-TACO leverages a subset of multitask offline datasets for pretraining a general feature representation, which captures critical environmental dynamics and is fine-tuned using minimal expert demonstrations. It advances the… ▽ More

    Submitted 23 May, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: Accepted at Forty-first International Conference on Machine Learning (ICML 2024)

  7. arXiv:2312.07141  [pdf, other]

    cs.CL

    Multilingual large language models leak human stereotypes across language boundaries

    Authors: Yang Trista Cao, Anna Sotnikova, Jieyu Zhao, Linda X. Zou, Rachel Rudinger, Hal Daume III

    Abstract: Multilingual large language models have been increasingly popular for their proficiency in processing and generating text across various languages. Previous research has shown that the presence of stereotypes and biases in monolingual large language models can be attributed to the nature of their training data, which is collected from humans and reflects societal biases. Multilingual language mode… ▽ More

    Submitted 8 May, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  8. arXiv:2311.07879  [pdf, other]

    cs.CL cs.AI

    Toxicity Detection is NOT all you Need: Measuring the Gaps to Supporting Volunteer Content Moderators

    Authors: Yang Trista Cao, Lovely-Frances Domingo, Sarah Ann Gilbert, Michelle Mazurek, Katie Shilton, Hal Daumé III

    Abstract: Extensive efforts in automated approaches for content moderation have been focused on developing models to identify toxic, offensive, and hateful content with the aim of lightening the load for moderators. Yet, it remains uncertain whether improvements on those tasks have truly addressed moderators' needs in accomplishing their work. In this paper, we surface gaps between past research efforts tha… ▽ More

    Submitted 16 February, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  9. arXiv:2310.19668  [pdf, other]

    cs.LG cs.CV

    DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization

    Authors: Guowei Xu, Ruijie Zheng, Yongyuan Liang, Xiyao Wang, Zhecheng Yuan, Tianying Ji, Yu Luo, Xiaoyu Liu, Jiaxin Yuan, Pu Hua, Shuzhen Li, Yanjie Ze, Hal Daumé III, Furong Huang, Huazhe Xu

    Abstract: Visual reinforcement learning (RL) has shown promise in continuous control tasks. Despite its progress, current algorithms are still unsatisfactory in virtually every aspect of the performance such as sample efficiency, asymptotic performance, and their robustness to the choice of random seeds. In this paper, we identify a major shortcoming in existing visual RL methods that is the agents often ex… ▽ More

    Submitted 13 February, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted at The Twelfth International Conference on Learning Representations (ICLR 2024)

  10. arXiv:2310.15319  [pdf, other]

    cs.CL cs.AI cs.LG

    Hallucination Detection for Grounded Instruction Generation

    Authors: Lingjun Zhao, Khanh Nguyen, Hal Daumé III

    Abstract: We investigate the problem of generating instructions to guide humans to navigate in simulated residential environments. A major issue with current models is hallucination: they generate references to actions or objects that are inconsistent with what a human follower would perform or encounter along the described path. We develop a model that detects these hallucinated references by adopting a mo… ▽ More

    Submitted 23 October, 2023; originally announced October 2023.

  11. arXiv:2310.15055  [pdf, other]

    cs.CL cs.AI cs.HC

    Towards Conceptualization of "Fair Explanation": Disparate Impacts of anti-Asian Hate Speech Explanations on Content Moderators

    Authors: Tin Nguyen, Jiannan Xu, Aayushi Roy, Hal Daumé III, Marine Carpuat

    Abstract: Recent research at the intersection of AI explainability and fairness has focused on how explanations can improve human-plus-AI task performance as assessed by fairness measures. We propose to characterize what constitutes an explanation that is itself "fair" -- an explanation that does not adversely impact specific populations. We formulate a novel evaluation method of "fair explanations" using n… ▽ More

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Main Conference (Long Paper)

  12. arXiv:2310.13004  [pdf, other]

    cs.LG cs.AI cs.HC

    Progressively Efficient Learning

    Authors: Ruijie Zheng, Khanh Nguyen, Hal Daumé III, Furong Huang, Karthik Narasimhan

    Abstract: Assistant AI agents should be capable of rapidly acquiring novel skills and adapting to new user preferences. Traditional frameworks like imitation learning and reinforcement learning do not facilitate this capability because they support only low-level, inefficient forms of communication. In contrast, humans communicate with progressive efficiency by defining and sharing abstract intentions. Repr… ▽ More

    Submitted 13 October, 2023; originally announced October 2023.

  13. arXiv:2310.12558  [pdf, other]

    cs.CL cs.HC

    Large Language Models Help Humans Verify Truthfulness -- Except When They Are Convincingly Wrong

    Authors: Chenglei Si, Navita Goyal, Sherry Tongshuang Wu, Chen Zhao, Shi Feng, Hal Daumé III, Jordan Boyd-Graber

    Abstract: Large Language Models (LLMs) are increasingly used for accessing information on the web. Their truthfulness and factuality are thus of great interest. To help users make the right decisions about the information they get, LLMs should not only provide information but also help users fact-check it. Our experiments with 80 crowdworkers compare language models with search engines (information retrieva… ▽ More

    Submitted 1 April, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: NAACL 2024

  14. The Impact of Explanations on Fairness in Human-AI Decision-Making: Protected vs Proxy Features

    Authors: Navita Goyal, Connor Baumler, Tin Nguyen, Hal Daumé III

    Abstract: AI systems have been known to amplify biases in real-world data. Explanations may help human-AI teams address these biases for fairer decision-making. Typically, explanations focus on salient input features. If a model is biased against some protected group, explanations may include features that demonstrate this bias, but when biases are realized through proxy features, the relationship between t… ▽ More

    Submitted 14 March, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: IUI 2024

  15. arXiv:2306.13229  [pdf, other]

    cs.LG cs.AI

    TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning

    Authors: Ruijie Zheng, Xiyao Wang, Yanchao Sun, Shuang Ma, Jieyu Zhao, Huazhe Xu, Hal Daumé III, Furong Huang

    Abstract: Despite recent progress in reinforcement learning (RL) from raw pixel data, sample inefficiency continues to present a substantial obstacle. Prior works have attempted to address this challenge by creating self-supervised auxiliary tasks, aiming to enrich the agent's learned representations with control-relevant information for future state prediction. However, these objectives are often insuffici… ▽ More

    Submitted 23 May, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

    Comments: Accepted at 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  16. arXiv:2306.05949  [pdf, other]

    cs.CY cs.AI

    Evaluating the Social Impact of Generative AI Systems in Systems and Society

    Authors: Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan Baker, Su Lin Blodgett, Canyu Chen, Hal Daumé III, Jesse Dodge, Isabella Duan, Ellie Evans, Felix Friedrich, Avijit Ghosh, Usman Gohar, Sara Hooker, Yacine Jernite, Ria Kalluri, Alberto Lusoli, Alina Leidinger, Michelle Lin, Xiuzhu Lin, Sasha Luccioni, Jennifer Mickel, Margaret Mitchell, Jessica Newman, et al. (6 additional authors not shown)

    Abstract: Generative AI systems across modalities, ranging from text (including code), image, audio, and video, have broad social impacts, but there is no official standard for means of evaluating those impacts or for which impacts should be evaluated. In this paper, we present a guide that moves toward a standard approach in evaluating a base generative AI system for any modality in two overarching categor… ▽ More

    Submitted 28 June, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

    Comments: Forthcoming in Hacker, Engel, Hammer, Mittelstadt (eds), Oxford Handbook on the Foundations and Regulation of Generative AI. Oxford University Press

  17. arXiv:2305.14331  [pdf, other]

    cs.CL cs.AI

    What Else Do I Need to Know? The Effect of Background Information on Users' Reliance on QA Systems

    Authors: Navita Goyal, Eleftheria Briakou, Amanda Liu, Connor Baumler, Claire Bonial, Jeffrey Micher, Clare R. Voss, Marine Carpuat, Hal Daumé III

    Abstract: NLP systems have shown impressive performance at answering questions by retrieving relevant context. However, with the increasingly large models, it is impossible and often undesirable to constrain models' knowledge or reasoning to only the retrieved context. This leads to a mismatch between the information that the models access to derive the answer and the information that is available to the us… ▽ More

    Submitted 25 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

  18. arXiv:2305.09022  [pdf, other]

    cs.CL

    It Takes Two to Tango: Navigating Conceptualizations of NLP Tasks and Measurements of Performance

    Authors: Arjun Subramonian, Xingdi Yuan, Hal Daumé III, Su Lin Blodgett

    Abstract: Progress in NLP is increasingly measured through benchmarks; hence, contextualizing progress requires understanding when and why practitioners may disagree about the validity of benchmarks. We develop a taxonomy of disagreement, drawing on tools from measurement modeling, and distinguish between two types of disagreement: 1) how tasks are conceptualized and 2) how measurements of model performance… ▽ More

    Submitted 15 May, 2023; originally announced May 2023.

    Journal ref: Findings of the Association for Computational Linguistics: ACL 2023

  19. arXiv:2304.05934  [pdf, other]

    cs.CV cs.CL

    ASL Citizen: A Community-Sourced Dataset for Advancing Isolated Sign Language Recognition

    Authors: Aashaka Desai, Lauren Berger, Fyodor O. Minakov, Vanessa Milan, Chinmay Singh, Kriston Pumphrey, Richard E. Ladner, Hal Daumé III, Alex X. Lu, Naomi Caselli, Danielle Bragg

    Abstract: Sign languages are used as a primary language by approximately 70 million D/deaf people world-wide. However, most communication technologies operate in spoken and written languages, creating inequities in access. To help tackle this problem, we release ASL Citizen, the first crowdsourced Isolated Sign Language Recognition (ISLR) dataset, collected with consent and containing 83,399 videos for 2,73… ▽ More

    Submitted 19 June, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

  20. arXiv:2301.05149  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG cs.RO

    Define, Evaluate, and Improve Task-Oriented Cognitive Capabilities for Instruction Generation Models

    Authors: Lingjun Zhao, Khanh Nguyen, Hal Daumé III

    Abstract: Recent work studies the cognitive capabilities of language models through psychological tests designed for humans. While these studies are helpful for understanding the general capabilities of these models, there is no guarantee that a model possessing sufficient capabilities to pass those tests would actually use those capabilities in performing real-life tasks. In this work, we formulate task-or… ▽ More

    Submitted 28 May, 2023; v1 submitted 20 December, 2022; originally announced January 2023.

    Comments: Findings of ACL 2023

  21. arXiv:2211.12966  [pdf, other]

    cs.LG cs.DB cs.DL

    How do Authors' Perceptions of their Papers Compare with Co-authors' Perceptions and Peer-review Decisions?

    Authors: Charvi Rastogi, Ivan Stelmakh, Alina Beygelzimer, Yann N. Dauphin, Percy Liang, Jennifer Wortman Vaughan, Zhenyu Xue, Hal Daumé III, Emma Pierson, Nihar B. Shah

    Abstract: How do author perceptions match up to the outcomes of the peer-review process and perceptions of others? In a top-tier computer science conference (NeurIPS 2021) with more than 23,000 submitting authors and 9,000 submitted papers, we survey the authors on three questions: (i) their predicted probability of acceptance for each of their papers, (ii) their perceived ranking of their own papers based… ▽ More

    Submitted 22 November, 2022; originally announced November 2022.

  22. arXiv:2211.06753  [pdf, other]

    cs.HC cs.AI

    Seamful XAI: Operationalizing Seamful Design in Explainable AI

    Authors: Upol Ehsan, Q. Vera Liao, Samir Passi, Mark O. Riedl, Hal Daume III

    Abstract: Mistakes in AI systems are inevitable, arising from both technical limitations and sociotechnical gaps. While black-boxing AI systems can make the user experience seamless, hiding the seams risks disempowering users to mitigate fallouts from AI mistakes. Instead of hiding these AI imperfections, can we leverage them to help the user? While Explainable AI (XAI) has predominantly tackled algorithmic… ▽ More

    Submitted 5 March, 2024; v1 submitted 12 November, 2022; originally announced November 2022.

    Journal ref: ACM CSCW 2024

  23. arXiv:2210.14966  [pdf, other]

    cs.CL cs.AI cs.CV

    What's Different between Visual Question Answering for Machine "Understanding" Versus for Accessibility?

    Authors: Yang Trista Cao, Kyle Seelman, Kyungjun Lee, Hal Daumé III

    Abstract: In visual question answering (VQA), a machine must answer a question given an associated image. Recently, accessibility researchers have explored whether VQA can be deployed in a real-world setting where users with visual impairments learn about their environment by capturing their visual surroundings and asking questions. However, most of the existing benchmarking datasets for VQA focus on machin… ▽ More

    Submitted 26 October, 2022; originally announced October 2022.

    Journal ref: AACL-IJCNLP 2022 The 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing

  24. arXiv:2206.11684  [pdf, other]

    cs.CL

    Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models

    Authors: Yang Trista Cao, Anna Sotnikova, Hal Daumé III, Rachel Rudinger, Linda Zou

    Abstract: NLP models trained on text have been shown to reproduce human stereotypes, which can magnify harms to marginalized groups when systems are deployed at scale. We adapt the Agency-Belief-Communion (ABC) stereotype model of Koch et al. (2016) from social psychology as a framework for the systematic study and discovery of stereotypic group-trait associations in language models (LMs). We introduce the… ▽ More

    Submitted 23 June, 2022; originally announced June 2022.

  25. arXiv:2205.06828  [pdf, other]

    cs.CL cs.AI

    Deconstructing NLG Evaluation: Evaluation Practices, Assumptions, and Their Implications

    Authors: Kaitlyn Zhou, Su Lin Blodgett, Adam Trischler, Hal Daumé III, Kaheer Suleman, Alexandra Olteanu

    Abstract: There are many ways to express similar things in text, which makes evaluating natural language generation (NLG) systems difficult. Compounding this difficulty is the need to assess varying quality criteria depending on the deployment setting. While the landscape of NLG evaluation has been well-mapped, practitioners' goals, assumptions, and constraints -- which inform decisions about what, when, an… ▽ More

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: Camera Ready for NAACL 2022 (Main Conference)

  26. arXiv:2110.08258  [pdf, other]

    cs.LG cs.AI cs.HC cs.RO

    A Framework for Learning to Request Rich and Contextually Useful Information from Humans

    Authors: Khanh Nguyen, Yonatan Bisk, Hal Daumé III

    Abstract: When deployed, AI agents will encounter problems that are beyond their autonomous problem-solving capabilities. Leveraging human assistance can help agents overcome their inherent limitations and robustly cope with unfamiliar situations. We present a general interactive framework that enables an agent to request and interpret rich, contextually useful information from an assistant that has knowled… ▽ More

    Submitted 22 June, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted to ICML 2022

  27. arXiv:2110.04889  [pdf, other]

    cs.CL

    Distantly-Supervised Evidence Retrieval Enables Question Answering without Evidence Annotation

    Authors: Chen Zhao, Chenyan Xiong, Jordan Boyd-Graber, Hal Daumé III

    Abstract: Open-domain question answering answers a question based on evidence retrieved from a large corpus. State-of-the-art neural approaches require intermediate evidence annotations for training. However, such intermediate annotations are expensive, and methods that rely on them cannot transfer to the more common setting, where only question-answer pairs are available. This paper investigates whether mo… ▽ More

    Submitted 10 October, 2021; originally announced October 2021.

    Comments: EMNLP 2021

  28. arXiv:2104.13299  [pdf, other]

    cs.AI cs.LG

    From Human Explanation to Model Interpretability: A Framework Based on Weight of Evidence

    Authors: David Alvarez-Melis, Harmanpreet Kaur, Hal Daumé III, Hanna Wallach, Jennifer Wortman Vaughan

    Abstract: We take inspiration from the study of human explanation to inform the design and evaluation of interpretability methods in machine learning. First, we survey the literature on human explanation in philosophy, cognitive science, and the social sciences, and propose a list of design principles for machine-generated explanations that are meaningful to humans. Using the concept of weight of evidence f… ▽ More

    Submitted 20 September, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: HCOMP 2021

  29. arXiv:2104.05883  [pdf, other]

    cs.CL

    Multi-Step Reasoning Over Unstructured Text with Beam Dense Retrieval

    Authors: Chen Zhao, Chenyan Xiong, Jordan Boyd-Graber, Hal Daumé III

    Abstract: Complex question answering often requires finding a reasoning chain that consists of multiple evidence pieces. Current approaches incorporate the strengths of structured knowledge and unstructured text, assuming text corpora is semi-structured. Building on dense retrieval methods, we propose a new multi-step retrieval approach (BeamDR) that iteratively forms an evidence chain through beam search i… ▽ More

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: NAACL 2021

  30. arXiv:2011.15083  [pdf, other]

    cs.HC cs.LG stat.AP

    A Large Scale Randomized Controlled Trial on Herding in Peer-Review Discussions

    Authors: Ivan Stelmakh, Charvi Rastogi, Nihar B. Shah, Aarti Singh, Hal Daumé III

    Abstract: Peer review is the backbone of academia and humans constitute a cornerstone of this process, being responsible for reviewing papers and making the final acceptance/rejection decisions. Given that human decision making is known to be susceptible to various cognitive biases, it is important to understand which (if any) biases are present in the peer-review process and design the pipeline such that t… ▽ More

    Submitted 30 November, 2020; originally announced November 2020.

  31. arXiv:2011.15050  [pdf, other]

    cs.HC cs.LG

    A Novice-Reviewer Experiment to Address Scarcity of Qualified Reviewers in Large Conferences

    Authors: Ivan Stelmakh, Nihar B. Shah, Aarti Singh, Hal Daumé III

    Abstract: Conference peer review constitutes a human-computation process whose importance cannot be overstated: not only it identifies the best submissions for acceptance, but, ultimately, it impacts the future of the whole research area by promoting some ideas and restraining others. A surge in the number of submissions received by leading AI conferences has challenged the sustainability of the review proc… ▽ More

    Submitted 30 November, 2020; originally announced November 2020.

  32. arXiv:2011.14646  [pdf, other]

    cs.DL cs.LG stat.AP

    Prior and Prejudice: The Novice Reviewers' Bias against Resubmissions in Conference Peer Review

    Authors: Ivan Stelmakh, Nihar B. Shah, Aarti Singh, Hal Daumé III

    Abstract: Modern machine learning and computer science conferences are experiencing a surge in the number of submissions that challenges the quality of peer review as the number of competent reviewers is growing at a much slower rate. To curb this trend and reduce the burden on reviewers, several conferences have started encouraging or even requiring authors to declare the previous submission history of the… ▽ More

    Submitted 30 November, 2020; originally announced November 2020.

  33. arXiv:2010.11246  [pdf, other]

    cs.CL cs.AI

    On the Potential of Lexico-logical Alignments for Semantic Parsing to SQL Queries

    Authors: Tianze Shi, Chen Zhao, Jordan Boyd-Graber, Hal Daumé III, Lillian Lee

    Abstract: Large-scale semantic parsing datasets annotated with logical forms have enabled major advances in supervised approaches. But can richer supervision help even more? To explore the utility of fine-grained, lexical-level supervision, we introduce Squall, a dataset that enriches 11,276 WikiTableQuestions English-language questions with manually created SQL equivalents plus alignments between SQL and q… ▽ More

    Submitted 21 October, 2020; originally announced October 2020.

    Comments: Findings of ACL: EMNLP 2020

    ACM Class: I.2.7

    Journal ref: Findings of ACL: EMNLP 2020

  34. arXiv:2006.07777  [pdf, other]

    cs.LG cs.HC stat.ML

    Active Imitation Learning from Multiple Non-Deterministic Teachers: Formulation, Challenges, and Algorithms

    Authors: Khanh Nguyen, Hal Daumé III

    Abstract: We formulate the problem of learning to imitate multiple, non-deterministic teachers with minimal interaction cost. Rather than learning a specific policy as in standard imitation learning, the goal in this problem is to learn a distribution over a policy space. We first present a general framework that efficiently models and estimates such a distribution by learning continuous representations of… ▽ More

    Submitted 13 June, 2020; originally announced June 2020.

  35. arXiv:2005.14050  [pdf, other]

    cs.CL cs.CY

    Language (Technology) is Power: A Critical Survey of "Bias" in NLP

    Authors: Su Lin Blodgett, Solon Barocas, Hal Daumé III, Hanna Wallach

    Abstract: We survey 146 papers analyzing "bias" in NLP systems, finding that their motivations are often vague, inconsistent, and lacking in normative reasoning, despite the fact that analyzing "bias" is an inherently normative process. We further find that these papers' proposed quantitative techniques for measuring or mitigating "bias" are poorly matched to their motivations and do not engage with the rel… ▽ More

    Submitted 29 May, 2020; v1 submitted 28 May, 2020; originally announced May 2020.

  36. arXiv:2005.13718  [pdf, other]

    cs.CY cs.IR cs.LG

    Operationalizing the Legal Principle of Data Minimization for Personalization

    Authors: Asia J. Biega, Peter Potash, Hal Daumé III, Fernando Diaz, Michèle Finck

    Abstract: Article 5(1)(c) of the European Union's General Data Protection Regulation (GDPR) requires that "personal data shall be [...] adequate, relevant, and limited to what is necessary in relation to the purposes for which they are processed (`data minimisation')". To date, the legal and computational definitions of `purpose limitation' and `data minimization' remain largely unclear. In particular, the… ▽ More

    Submitted 27 May, 2020; originally announced May 2020.

    Comments: SIGIR 2020 paper: In Proc. of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval

  37. arXiv:2005.12801  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Active Imitation Learning with Noisy Guidance

    Authors: Kianté Brantley, Amr Sharaf, Hal Daumé III

    Abstract: Imitation learning algorithms provide state-of-the-art results on many structured prediction tasks by learning near-optimal search policies. Such algorithms assume training-time access to an expert that can provide the optimal action at any queried state; unfortunately, the number of such queries is often prohibitive, frequently rendering these approaches impractical. To combat this query complexi… ▽ More

    Submitted 26 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  38. arXiv:2004.05109  [pdf, ps, other]

    cs.CL

    Towards Automatic Generation of Questions from Long Answers

    Authors: Shlok Kumar Mishra, Pranav Goel, Abhishek Sharma, Abhyuday Jagannatha, David Jacobs, Hal Daumé III

    Abstract: Automatic question generation (AQG) has broad applicability in domains such as tutoring systems, conversational agents, healthcare literacy, and information retrieval. Existing efforts at AQG have been limited to short answer lengths of up to two or three sentences. However, several real-world applications require question generation from answers that span several sentences. Therefore, we propose… ▽ More

    Submitted 15 April, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

  39. arXiv:2004.02745  [pdf, other]

    cs.CL

    Meta-Learning for Few-Shot NMT Adaptation

    Authors: Amr Sharaf, Hany Hassan, Hal Daumé III

    Abstract: We present META-MT, a meta-learning approach to adapt Neural Machine Translation (NMT) systems in a few-shot setting. META-MT provides a new approach to make NMT models easily adaptable to many target domains with the minimal amount of in-domain data. We frame the adaptation of NMT systems as a meta-learning problem, where we learn to adapt to new unseen domains based on simulated offline meta-tra… ▽ More

    Submitted 6 April, 2020; originally announced April 2020.

  40. Toward Gender-Inclusive Coreference Resolution

    Authors: Yang Trista Cao, Hal Daumé III

    Abstract: Correctly resolving textual mentions of people fundamentally entails making inferences about those people. Such inferences raise the risk of systemic biases in coreference resolution systems, including biases that can harm binary and non-binary trans and cis stakeholders. To better understand such biases, we foreground nuanced conceptualizations of gender from sociology and sociolinguistics, and d… ▽ More

    Submitted 2 December, 2020; v1 submitted 30 October, 2019; originally announced October 2019.

    Comments: 28 pages; ACL version

    Journal ref: Association for Computational Linguistics. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020) 4568-4595

  41. arXiv:1910.13503  [pdf, other]

    cs.LG cs.AI stat.ML

    Weight of Evidence as a Basis for Human-Oriented Explanations

    Authors: David Alvarez-Melis, Hal Daumé III, Jennifer Wortman Vaughan, Hanna Wallach

    Abstract: Interpretability is an elusive but highly sought-after characteristic of modern machine learning methods. Recent work has focused on interpretability via $\textit{explanations}$, which justify individual model predictions. In this work, we take a step towards reconciling machine explanations with those that humans produce and prefer by taking inspiration from the study of explanation in philosophy…

    Submitted 29 October, 2019; originally announced October 2019.

    Comments: Human-Centric Machine Learning (HCML) Workshop @ NeurIPS 2019

  42. arXiv:1910.00421  [pdf, other]

    cs.CL cs.IR cs.LG

    Global Voices: Crossing Borders in Automatic News Summarization

    Authors: Khanh Nguyen, Hal Daumé III

    Abstract: We construct Global Voices, a multilingual dataset for evaluating cross-lingual summarization methods. We extract social-network descriptions of Global Voices news articles to cheaply collect evaluation data for into-English and from-English summarization in 15 languages. Especially, for the into-English summarization task, we crowd-source a high-quality evaluation dataset based on guidelines that…

    Submitted 15 June, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

    Comments: NewSum workshop at EMNLP 2019, 8 pages

  43. arXiv:1909.01871  [pdf, other]

    cs.HC cs.AI cs.CL cs.CV cs.LG

    Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning

    Authors: Khanh Nguyen, Hal Daumé III

    Abstract: Mobile agents that can leverage help from humans can potentially accomplish more complex tasks than they could entirely on their own. We develop "Help, Anna!" (HANNA), an interactive photo-realistic simulator in which an agent fulfills object-finding tasks by requesting and interpreting natural language-and-vision assistance. An agent solving tasks in a HANNA environment can leverage simulated hum…

    Submitted 22 November, 2019; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: In EMNLP 2019

  44. arXiv:1906.09323  [pdf, other]

    cs.LG cs.AI cs.GT stat.ML

    Reinforcement Learning with Convex Constraints

    Authors: Sobhan Miryoosefi, Kianté Brantley, Hal Daumé III, Miroslav Dudik, Robert Schapire

    Abstract: In standard reinforcement learning (RL), a learning agent seeks to optimize the overall reward. However, many key aspects of a desired behavior are more naturally expressed as constraints. For instance, the designer may want to limit the use of unsafe actions, increase the diversity of trajectories to enable exploration, or approximate expert trajectories when rewards are sparse. In this paper, we…

    Submitted 11 November, 2019; v1 submitted 21 June, 2019; originally announced June 2019.

    Journal ref: Advances in Neural Information Processing Systems 32 (2019), 14093-14102

  45. arXiv:1904.02281  [pdf, other]

    cs.CL

    Answer-based Adversarial Training for Generating Clarification Questions

    Authors: Sudha Rao, Hal Daumé III

    Abstract: We present an approach for generating clarification questions with the goal of eliciting new information that would make the given textual context more complete. We propose that modeling hypothetical answers (to clarification questions) as latent variables can guide our approach into generating more useful clarification questions. We develop a Generative Adversarial Network (GAN) where the generat…

    Submitted 3 April, 2019; originally announced April 2019.

    Comments: Accepted at NAACL 2019

  46. arXiv:1902.02192  [pdf, other]

    cs.CL cs.LG stat.ML

    Non-Monotonic Sequential Text Generation

    Authors: Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho

    Abstract: Standard sequential generation methods assume a pre-specified generation order, such as text generation methods which generate words from left to right. In this work, we propose a framework for training models of text generation that operate in non-monotonic orders; the model directly learns good orders, without any additional annotation. Our framework operates by generating a word at an arbitrary…

    Submitted 23 October, 2019; v1 submitted 5 February, 2019; originally announced February 2019.

    Comments: ICML 2019

  47. arXiv:1901.08159  [pdf, other]

    cs.LG stat.ML

    Meta-Learning for Contextual Bandit Exploration

    Authors: Amr Sharaf, Hal Daumé III

    Abstract: We describe MELEE, a meta-learning algorithm for learning a good exploration policy in the interactive contextual bandit setting. Here, an algorithm must take actions based on contexts, and learn based only on a reward signal from the action taken, thereby generating an exploration/exploitation trade-off. MELEE addresses this trade-off by learning a good exploration strategy for offline tasks base…

    Submitted 23 January, 2019; originally announced January 2019.

  48. arXiv:1901.00301  [pdf, other]

    cs.LG stat.ML

    Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback

    Authors: Chicheng Zhang, Alekh Agarwal, Hal Daumé III, John Langford, Sahand N Negahban

    Abstract: We investigate the feasibility of learning from a mix of both fully-labeled supervised data and contextual bandit data. We specifically consider settings in which the underlying learning signal may be different between these two data sources. Theoretically, we state and prove no-regret algorithms for learning that is robust to misaligned cost distributions between the two sources. Empirically, we…

    Submitted 21 June, 2019; v1 submitted 2 January, 2019; originally announced January 2019.

    Comments: 42 pages, 21 figures, ICML 2019

  49. arXiv:1812.05239  [pdf, other]

    cs.HC cs.CY cs.LG cs.SE

    Improving fairness in machine learning systems: What do industry practitioners need?

    Authors: Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé III, Miro Dudík, Hanna Wallach

    Abstract: The potential for machine learning (ML) systems to amplify social inequities and unfairness is receiving increasing popular and academic attention. A surge of recent work has focused on the development of algorithmic tools to assess and mitigate such unfairness. If these tools are to have a positive impact on industry practice, however, it is crucial that their design be informed by an understandi…

    Submitted 7 January, 2019; v1 submitted 12 December, 2018; originally announced December 2018.

    Comments: To appear in the 2019 ACM CHI Conference on Human Factors in Computing Systems (CHI 2019)

  50. arXiv:1810.12343  [pdf, other]

    cs.CL

    Content Selection in Deep Learning Models of Summarization

    Authors: Chris Kedzie, Kathleen McKeown, Hal Daume III

    Abstract: We carry out experiments with deep learning models of summarization across the domains of news, personal stories, meetings, and medical articles in order to understand how content selection is performed. We find that many sophisticated features of state of the art extractive summarizers do not improve performance over simpler models. These results suggest that it is easier to create a summarizer f…

    Submitted 18 February, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Revised to correct for error in AMI oracle results. Originally published at EMNLP 2018