Showing 1–6 of 6 results for author: Sun, D Q

Searching in archive cs.
  1. arXiv:2403.14727  [pdf, other]

    cs.CY cs.CL cs.LG

    Protected group bias and stereotypes in Large Language Models

    Authors: Hadas Kotek, David Q. Sun, Zidi Xiu, Margit Bowler, Christopher Klein

    Abstract: As modern Large Language Models (LLMs) shatter many state-of-the-art benchmarks in a variety of domains, this paper investigates their behavior in the domains of ethics and fairness, focusing on protected group bias. We conduct a two-part study: first, we solicit sentence continuations describing the occupations of individuals from different protected groups, including gender, sexuality, religion,…

    Submitted 20 March, 2024; originally announced March 2024.

  2. arXiv:2310.18130  [pdf, other]

    cs.CL cs.HC

    DELPHI: Data for Evaluating LLMs' Performance in Handling Controversial Issues

    Authors: David Q. Sun, Artem Abzaliev, Hadas Kotek, Zidi Xiu, Christopher Klein, Jason D. Williams

    Abstract: Controversy is a reflection of our zeitgeist, and an important aspect of any discourse. The rise of large language models (LLMs) as conversational systems has increased public reliance on these systems for answers to their various questions. Consequently, it is crucial to systematically examine how these models respond to questions pertaining to ongoing debates. However, few such datasets exi…

    Submitted 7 November, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP Industry Track 2023

  3. arXiv:2308.14921  [pdf, other]

    cs.CL cs.CY cs.LG

    Gender bias and stereotypes in Large Language Models

    Authors: Hadas Kotek, Rikker Dockum, David Q. Sun

    Abstract: Large Language Models (LLMs) have made substantial progress in the past several months, shattering state-of-the-art benchmarks in many domains. This paper investigates LLMs' behavior with respect to gender stereotypes, a known issue for prior models. We use a simple paradigm to test the presence of gender bias, building on but differing from WinoBias, a commonly used gender bias dataset, which is…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: ACM Collective Intelligence

    Journal ref: In Collective Intelligence Conference (CI '23), November 06-09, 2023, Delft, Netherlands. ACM, New York, NY, USA (2023)

  4. arXiv:2308.03905  [pdf, other]

    cs.CL cs.AI cs.LG

    Intelligent Assistant Language Understanding On Device

    Authors: Cecilia Aas, Hisham Abdelsalam, Irina Belousova, Shruti Bhargava, Jianpeng Cheng, Robert Daland, Joris Driesen, Federico Flego, Tristan Guigue, Anders Johannsen, Partha Lal, Jiarui Lu, Joel Ruben Antony Moniz, Nathan Perkins, Dhivya Piraviperumal, Stephen Pulman, Diarmuid Ó Séaghdha, David Q. Sun, John Torr, Marco Del Vecchio, Jay Wacker, Jason D. Williams, Hong Yu

    Abstract: It has recently become feasible to run personal digital assistants on phones and other personal devices. In this paper we describe a design for a natural language understanding system that runs on device. In comparison to a server-based assistant, this system is more private, more reliable, faster, more expressive, and more accurate. We describe what led to key choices about architecture and techn…

    Submitted 7 August, 2023; originally announced August 2023.

  5. arXiv:2303.10255  [pdf, other]

    cs.HC cs.CL

    Feedback Effect in User Interaction with Intelligent Assistants: Delayed Engagement, Adaption and Drop-out

    Authors: Zidi Xiu, Kai-Chen Cheng, David Q. Sun, Jiannan Lu, Hadas Kotek, Yuhan Zhang, Paul McCarthy, Christopher Klein, Stephen Pulman, Jason D. Williams

    Abstract: With the growing popularity of intelligent assistants (IAs), evaluating IA quality becomes an increasingly active field of research. This paper identifies and quantifies the feedback effect, a novel component in IA-user interactions: how the capabilities and limitations of the IA influence user behavior over time. First, we demonstrate that unhelpful responses from the IA cause users to delay or r…

    Submitted 18 April, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: PAKDD 2023

  6. arXiv:2012.04169  [pdf, other]

    cs.CL

    Improving Human-Labeled Data through Dynamic Automatic Conflict Resolution

    Authors: David Q. Sun, Hadas Kotek, Christopher Klein, Mayank Gupta, William Li, Jason D. Williams

    Abstract: This paper develops and implements a scalable methodology for (a) estimating the noisiness of labels produced by a typical crowdsourcing semantic annotation task, and (b) reducing the resulting error of the labeling process by as much as 20-30% in comparison to other common labeling strategies. Importantly, this new approach to the labeling process, which we name Dynamic Automatic Conflict Resolut…

    Submitted 7 December, 2020; originally announced December 2020.

    Comments: Conference Paper at COLING 2020: https://www.aclweb.org/anthology/2020.coling-main.316/