Showing 1–27 of 27 results for author: Ettinger, A

  1. arXiv:2406.18510  [pdf, other]

    cs.CL

    WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

    Authors: Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Yejin Choi, Nouha Dziri

    Abstract: We introduce WildTeaming, an automatic LLM safety red-teaming framework that mines in-the-wild user-chatbot interactions to discover 5.7K unique clusters of novel jailbreak tactics, and then composes multiple tactics for systematic exploration of novel jailbreaks. Compared to prior work that performed red-teaming via recruited human workers, gradient-based optimization, or iterative revision with…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.18495  [pdf, other]

    cs.CL

    WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

    Authors: Seungju Han, Kavel Rao, Allyson Ettinger, Liwei Jiang, Bill Yuchen Lin, Nathan Lambert, Yejin Choi, Nouha Dziri

    Abstract: We introduce WildGuard -- an open, light-weight moderation tool for LLM safety that achieves three goals: (1) identifying malicious intent in user prompts, (2) detecting safety risks of model responses, and (3) determining model refusal rate. Together, WildGuard serves the increasing needs for automatic safety moderation and evaluation of LLM interactions, providing a one-stop tool with enhanced a…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: First two authors contributed equally. Third and fourth authors contributed equally

  3. arXiv:2404.09129  [pdf, other]

    cs.CL

    When Hindsight is Not 20/20: Testing Limits on Reflective Thinking in Large Language Models

    Authors: Yanhong Li, Chenghao Yang, Allyson Ettinger

    Abstract: Recent studies suggest that self-reflective prompting can significantly enhance the reasoning capabilities of Large Language Models (LLMs). However, the use of external feedback as a stop criterion raises doubts about the true extent of LLMs' ability to emulate human-like self-reflection. In this paper, we set out to clarify these capabilities under a more stringent evaluation setting in which we…

    Submitted 13 April, 2024; originally announced April 2024.

    Comments: NAACL 2024 Findings paper (Camera-Ready Version)

  4. arXiv:2401.06640  [pdf, other]

    cs.CL cs.AI

    Experimental Contexts Can Facilitate Robust Semantic Property Inference in Language Models, but Inconsistently

    Authors: Kanishka Misra, Allyson Ettinger, Kyle Mahowald

    Abstract: Recent zero-shot evaluations have highlighted important limitations in the abilities of language models (LMs) to perform meaning extraction. However, it is now well known that LMs can demonstrate radical improvements in the presence of experimental contexts such as in-context examples and instructions. How well does this translate to previously studied meaning-sensitive tasks? We present a case-st…

    Submitted 12 January, 2024; originally announced January 2024.

  5. arXiv:2311.00059  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    The Generative AI Paradox: "What It Can Create, It May Not Understand"

    Authors: Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu, Benjamin Newman, Pang Wei Koh, Allyson Ettinger, Yejin Choi

    Abstract: The recent wave of generative AI has sparked unprecedented global attention, with both excitement and concern over potentially superhuman levels of artificial intelligence: models now take only seconds to produce outputs that would challenge or exceed the capabilities even of expert humans. At the same time, models still show basic errors in understanding that would not be expected even in non-exp…

    Submitted 31 October, 2023; originally announced November 2023.

  6. arXiv:2310.17793  [pdf, other]

    cs.CL cs.AI

    "You Are An Expert Linguistic Annotator": Limits of LLMs as Analyzers of Abstract Meaning Representation

    Authors: Allyson Ettinger, Jena D. Hwang, Valentina Pyatkin, Chandra Bhagavatula, Yejin Choi

    Abstract: Large language models (LLMs) show amazing proficiency and fluency in the use of language. Does this mean that they have also acquired insightful linguistic knowledge about the language, to an extent that they can serve as an "expert linguistic annotator"? In this paper, we examine the successes and limitations of the GPT-3, ChatGPT, and GPT-4 models in analysis of sentence meaning structure, focus…

    Submitted 11 December, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Findings (short)

  7. arXiv:2310.16135  [pdf, other]

    cs.CL

    Can You Follow Me? Testing Situational Understanding in ChatGPT

    Authors: Chenghao Yang, Allyson Ettinger

    Abstract: Understanding sentence meanings and updating information states appropriately across time -- what we call "situational understanding" (SU) -- is a critical ability for human-like AI agents. SU is essential in particular for chat models, such as ChatGPT, to enable consistent, coherent, and effective dialogue between humans and AI. Previous works have identified certain SU limitations in non-chatbot…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Main Paper (Camera Ready)

  8. arXiv:2305.18654  [pdf, other]

    cs.CL cs.AI cs.LG

    Faith and Fate: Limits of Transformers on Compositionality

    Authors: Nouha Dziri, Ximing Lu, Melanie Sclar, Xiang Lorraine Li, Liwei Jiang, Bill Yuchen Lin, Peter West, Chandra Bhagavatula, Ronan Le Bras, Jena D. Hwang, Soumya Sanyal, Sean Welleck, Xiang Ren, Allyson Ettinger, Zaid Harchaoui, Yejin Choi

    Abstract: Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures on surprisingly trivial problems. This begs the question: Are these errors incidental, or do they signal more substantial limitations? In an attempt to demystify transformer LLMs, we investigate the li…

    Submitted 31 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: 10 pages + appendix (40 pages)

  9. arXiv:2305.16572  [pdf, other]

    cs.CL

    Counterfactual reasoning: Testing language models' understanding of hypothetical scenarios

    Authors: Jiaxuan Li, Lang Yu, Allyson Ettinger

    Abstract: Current pre-trained language models have enabled remarkable improvements in downstream tasks, but it remains difficult to distinguish effects of statistical correlation from more systematic logical reasoning grounded on the understanding of real world. We tease these factors apart by leveraging counterfactual conditionals, which force language models to predict unusual consequences based on hypoth…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023. arXiv admin note: substantial text overlap with arXiv:2212.03278

  10. arXiv:2212.03278  [pdf, ps, other]

    cs.CL

    Counterfactual reasoning: Do language models need world knowledge for causal understanding?

    Authors: Jiaxuan Li, Lang Yu, Allyson Ettinger

    Abstract: Current pre-trained language models have enabled remarkable improvements in downstream tasks, but it remains difficult to distinguish effects of statistical correlation from more systematic logical reasoning grounded on understanding of the real world. In this paper we tease these factors apart by leveraging counterfactual conditionals, which force language models to predict unusual consequences b…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: 5 pages, accepted to nCSI workshop at NeurIPS 2022

  11. arXiv:2210.02526  [pdf, other]

    cs.CL

    "No, they did not": Dialogue response dynamics in pre-trained language models

    Authors: Sanghee J. Kim, Lang Yu, Allyson Ettinger

    Abstract: A critical component of competence in language is being able to identify relevant components of an utterance and reply appropriately. In this paper we examine the extent of such dialogue response sensitivity in pre-trained language models, conducting a series of experiments with a particular focus on sensitivity to dynamics involving phenomena of at-issueness and ellipsis. We find that models show…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: 12 pages, 8 figures, COLING 2022, see https://github.com/sangheek16/dialogue-response-dynamics for codes and material

  12. arXiv:2210.01963  [pdf, other]

    cs.CL cs.AI

    COMPS: Conceptual Minimal Pair Sentences for testing Robust Property Knowledge and its Inheritance in Pre-trained Language Models

    Authors: Kanishka Misra, Julia Taylor Rayz, Allyson Ettinger

    Abstract: A characteristic feature of human semantic cognition is its ability to not only store and retrieve the properties of concepts observed through experience, but to also facilitate the inheritance of properties (can breathe) from superordinate concepts (animal) to their subordinates (dog) -- i.e. demonstrate property inheritance. In this paper, we present COMPS, a collection of minimal pair sentences…

    Submitted 8 February, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: EACL 2023 Camera Ready version. Code can be found at https://github.com/kanishkamisra/comps

  13. arXiv:2205.06910  [pdf, other]

    cs.CL

    A Property Induction Framework for Neural Language Models

    Authors: Kanishka Misra, Julia Taylor Rayz, Allyson Ettinger

    Abstract: To what extent can experience from language contribute to our conceptual knowledge? Computational explorations of this question have shed light on the ability of powerful neural language models (LMs) -- informed solely through text input -- to encode and elicit information about concepts and properties. To extend this line of research, we present a framework that uses neural-network language model…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: CogSci 2022 camera ready version, with hyperref-compatible citations. Code and Supplemental Material can be found in https://github.com/kanishkamisra/lm-induction

  14. arXiv:2111.06644  [pdf, other]

    cs.CL

    Variation and generality in encoding of syntactic anomaly information in sentence embeddings

    Authors: Qinxuan Wu, Allyson Ettinger

    Abstract: While sentence anomalies have been applied periodically for testing in NLP, we have yet to establish a picture of the precise status of anomaly information in representations from NLP models. In this paper we aim to fill two primary gaps, focusing on the domain of syntactic anomalies. First, we explore fine-grained differences in anomaly encoding by designing probing tasks that vary the hierarchic…

    Submitted 12 November, 2021; originally announced November 2021.

    Comments: BlackBoxNLP, EMNLP

  15. arXiv:2109.12951  [pdf, other]

    cs.CL

    Pragmatic competence of pre-trained language models through the lens of discourse connectives

    Authors: Lalchand Pandia, Yan Cong, Allyson Ettinger

    Abstract: As pre-trained language models (LMs) continue to dominate NLP, it is increasingly important that we understand the depth of language capabilities in these models. In this paper, we target pre-trained LMs' competence in pragmatics, with a focus on pragmatics relating to discourse connectives. We formulate cloze-style tests using a combination of naturally-occurring data and controlled inputs drawn…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: Accepted at CoNLL 2021

    MSC Class: 68T50 ACM Class: I.2.7

  16. arXiv:2109.12393  [pdf, other]

    cs.CL

    Sorting through the noise: Testing robustness of information processing in pre-trained language models

    Authors: Lalchand Pandia, Allyson Ettinger

    Abstract: Pre-trained LMs have shown impressive performance on downstream NLP tasks, but we have yet to establish a clear understanding of their sophistication when it comes to processing, retaining, and applying information presented in their input. In this paper we tackle a component of this question by examining robustness of models' ability to deploy relevant context information in the face of distracti…

    Submitted 25 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021

    MSC Class: 68T50 ACM Class: I.2.7

  17. arXiv:2105.14668  [pdf, ps, other]

    cs.CL

    On the Interplay Between Fine-tuning and Composition in Transformers

    Authors: Lang Yu, Allyson Ettinger

    Abstract: Pre-trained transformer language models have shown remarkable performance on a variety of NLP tasks. However, recent research has suggested that phrase-level representations in these models reflect heavy influences of lexical content, but lack evidence of sophisticated, compositional phrase information. Here we investigate the impact of fine-tuning on the capacity of contextualized embeddings to c…

    Submitted 31 May, 2021; v1 submitted 30 May, 2021; originally announced May 2021.

    Comments: To appear in Findings of ACL 2021

  18. arXiv:2105.02987  [pdf, other]

    cs.CL

    Do language models learn typicality judgments from text?

    Authors: Kanishka Misra, Allyson Ettinger, Julia Taylor Rayz

    Abstract: Building on research arguing for the possibility of conceptual and categorical knowledge acquisition through statistics contained in language, we evaluate predictive language models (LMs) -- informed solely by textual input -- on a prevalent phenomenon in cognitive science: typicality. Inspired by experiments that involve language processing and show robust typicality effects in humans, we propose…

    Submitted 6 May, 2021; originally announced May 2021.

    Comments: Accepted as a talk to CogSci 2021

  19. arXiv:2010.03763  [pdf, ps, other]

    cs.CL

    Assessing Phrasal Representation and Composition in Transformers

    Authors: Lang Yu, Allyson Ettinger

    Abstract: Deep transformer models have pushed performance on NLP tasks to new limits, suggesting sophisticated treatment of complex linguistic inputs, such as phrases. However, we have limited understanding of how these models handle representation of phrases, and whether this reflects sophisticated composition of phrase meaning like that done by humans. In this paper, we present systematic analysis of phra…

    Submitted 13 October, 2020; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: Accepted at EMNLP 2020

  20. Exploring BERT's Sensitivity to Lexical Cues using Tests from Semantic Priming

    Authors: Kanishka Misra, Allyson Ettinger, Julia Taylor Rayz

    Abstract: Models trained to estimate word probabilities in context have become ubiquitous in natural language processing. How do these models use lexical cues in context to inform their word probabilities? To answer this question, we present a case study analyzing the pre-trained BERT model with tests informed by semantic priming. Using English lexical stimuli that show priming in humans, we find that BERT…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: Accepted for publication in Findings of ACL: EMNLP 2020

  21. arXiv:2010.02807  [pdf, other]

    cs.CL cs.LG

    Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks

    Authors: Shubham Toshniwal, Sam Wiseman, Allyson Ettinger, Karen Livescu, Kevin Gimpel

    Abstract: Long document coreference resolution remains a challenging task due to the large memory and runtime requirements of current models. Recent work doing incremental coreference resolution using just the global representation of entities shows practical benefits but requires keeping all entities in memory, which can be impractical for long documents. We argue that keeping all entities in memory is unn…

    Submitted 16 November, 2020; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: Post EMNLP 2020 camera ready updates

  22. arXiv:2008.07027  [pdf, other]

    cs.CL

    Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size

    Authors: Davis Yoshida, Allyson Ettinger, Kevin Gimpel

    Abstract: Fine-tuning a pretrained transformer for a downstream task has become a standard method in NLP in the last few years. While the results from these models are impressive, applying them can be extremely computationally expensive, as is pretraining new models with the latest architectures. We present a novel method for applying pretrained transformer language models which lowers their memory requirem…

    Submitted 16 August, 2020; originally announced August 2020.

    Comments: 12 pages, 5 figures

  23. arXiv:2005.02990  [pdf, other]

    cs.CL cs.LG

    PeTra: A Sparsely Supervised Memory Model for People Tracking

    Authors: Shubham Toshniwal, Allyson Ettinger, Kevin Gimpel, Karen Livescu

    Abstract: We propose PeTra, a memory-augmented neural network designed to track entities in its memory slots. PeTra is trained using sparse annotation from the GAP pronoun resolution dataset and outperforms a prior memory model on the task while using a simpler architecture. We empirically compare key modeling choices, finding that we can simplify several aspects of the design of the memory module while ret…

    Submitted 6 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  24. arXiv:2005.01810  [pdf, other]

    cs.CL cs.AI

    Spying on your neighbors: Fine-grained probing of contextual embeddings for information about surrounding words

    Authors: Josef Klafka, Allyson Ettinger

    Abstract: Although models using contextual word embeddings have achieved state-of-the-art results on a host of NLP tasks, little is known about exactly what information these embeddings encode about the context words that they are understood to reflect. To address this question, we introduce a suite of probing tasks that enable fine-grained testing of contextual embeddings for encoding of information about…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  25. arXiv:1907.13528  [pdf, ps, other]

    cs.CL cs.AI

    What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models

    Authors: Allyson Ettinger

    Abstract: Pre-training by language modeling has become a popular and successful approach to NLP tasks, but we have yet to understand exactly what linguistic capacities these pre-training processes confer upon models. In this paper we introduce a suite of diagnostics drawn from human language experiments, which allow us to ask targeted questions about the information used by language models for generating pr…

    Submitted 13 July, 2020; v1 submitted 31 July, 2019; originally announced July 2019.

  26. arXiv:1809.03992  [pdf, other]

    cs.CL

    Assessing Composition in Sentence Vector Representations

    Authors: Allyson Ettinger, Ahmed Elgohary, Colin Phillips, Philip Resnik

    Abstract: An important component of achieving language understanding is mastering the composition of sentence meaning, but an immediate challenge to solving this problem is the opacity of sentence vector representations produced by current neural sentence composition models. We present a method to address this challenge, developing tasks that directly target compositional meaning information in sentence vec…

    Submitted 11 September, 2018; originally announced September 2018.

    Comments: COLING 2018

    Journal ref: In Proceedings of the 27th International Conference on Computational Linguistics (pp. 1790-1801)

  27. arXiv:1711.01505  [pdf, other]

    cs.CL

    Towards Linguistically Generalizable NLP Systems: A Workshop and Shared Task

    Authors: Allyson Ettinger, Sudha Rao, Hal Daumé III, Emily M. Bender

    Abstract: This paper presents a summary of the first Workshop on Building Linguistically Generalizable Natural Language Processing Systems, and the associated Build It Break It, The Language Edition shared task. The goal of this workshop was to bring together researchers in NLP and linguistics with a shared task aimed at testing the generalizability of NLP systems beyond the distributions of their training…

    Submitted 4 November, 2017; originally announced November 2017.

    Comments: Updated version of the EMNLP Workshop and Shared Task description paper, Proceedings of the First Workshop on Building Linguistically Generalizable NLP Systems. 2017