Showing 1–13 of 13 results for author: Kreiss, E

  1. arXiv:2406.09458

    cs.CV cs.AI cs.CL

    Updating CLIP to Prefer Descriptions Over Captions

    Authors: Amir Zur, Elisa Kreiss, Karel D'Oosterlinck, Christopher Potts, Atticus Geiger

    Abstract: Although CLIPScore is a powerful generic metric that captures the similarity between a text and an image, it fails to distinguish between a caption that is meant to complement the information in an image and a description that is meant to replace an image entirely, e.g., for accessibility. We address this shortcoming by updating the CLIP model with the Concadia dataset to assign higher scores to d…

    Submitted 12 June, 2024; originally announced June 2024.

  2. arXiv:2402.15002

    cs.CL cs.CV

    CommVQA: Situating Visual Question Answering in Communicative Contexts

    Authors: Nandita Shankar Naik, Christopher Potts, Elisa Kreiss

    Abstract: Current visual question answering (VQA) models tend to be trained and evaluated on image-question pairs in isolation. However, the questions people ask are dependent on their informational needs and prior knowledge about the image content. To evaluate how situating images within naturalistic contexts shapes visual questions, we introduce CommVQA, a VQA dataset consisting of images, image descripti…

    Submitted 22 February, 2024; originally announced February 2024.

  3. arXiv:2309.11710

    cs.CL cs.CV

    ContextRef: Evaluating Referenceless Metrics For Image Description Generation

    Authors: Elisa Kreiss, Eric Zelikman, Christopher Potts, Nick Haber

    Abstract: Referenceless metrics (e.g., CLIPScore) use pretrained vision-language models to assess image descriptions directly without costly ground-truth reference texts. Such methods can facilitate rapid progress, but only if they truly align with human preference judgments. In this paper, we introduce ContextRef, a benchmark for assessing referenceless metrics for such alignment. ContextRef has two compo…

    Submitted 20 September, 2023; originally announced September 2023.

  4. arXiv:2307.15745

    cs.CL cs.CV

    Context-VQA: Towards Context-Aware and Purposeful Visual Question Answering

    Authors: Nandita Naik, Christopher Potts, Elisa Kreiss

    Abstract: Visual question answering (VQA) has the potential to make the Internet more accessible in an interactive way, allowing people who cannot see images to ask questions about them. However, multiple studies have shown that people who are blind or have low vision prefer image explanations that incorporate the context in which an image appears, yet current VQA datasets focus on images in isolation. We a…

    Submitted 30 August, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Proceedings of ICCV 2023 Workshop on Closing the Loop Between Vision and Language

  5. arXiv:2305.09038

    cs.HC

    Characterizing Image Accessibility on Wikipedia across Languages

    Authors: Elisa Kreiss, Krishna Srinivasan, Tiziano Piccardi, Jesus Adolfo Hermosillo, Cynthia Bennett, Michael S. Bernstein, Meredith Ringel Morris, Christopher Potts

    Abstract: We make a first attempt to characterize image accessibility on Wikipedia across languages, present new experimental results that can inform efforts to assess description quality, and offer some strategies to improve Wikipedia's image accessibility.

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: Presented at Wiki Workshop 2023

  6. arXiv:2205.10646

    cs.CL

    Context Matters for Image Descriptions for Accessibility: Challenges for Referenceless Evaluation Metrics

    Authors: Elisa Kreiss, Cynthia Bennett, Shayan Hooshmand, Eric Zelikman, Meredith Ringel Morris, Christopher Potts

    Abstract: Few images on the Web receive alt-text descriptions that would make them accessible to blind and low vision (BLV) users. Image-based NLG systems have progressed to the point where they can begin to address this persistent societal problem, but these systems will not be fully successful unless we evaluate them on metrics that guide their development correctly. Here, we argue against current referen…

    Submitted 27 October, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

    Comments: Proceedings of EMNLP 2022

  7. arXiv:2205.09172

    cs.CL

    Color Overmodification Emerges from Data-Driven Learning and Pragmatic Reasoning

    Authors: Fei Fang, Kunal Sinha, Noah D. Goodman, Christopher Potts, Elisa Kreiss

    Abstract: Speakers' referential expressions often depart from communicative ideals in ways that help illuminate the nature of pragmatic language use. Patterns of overmodification, in which a speaker uses a modifier that is redundant given their communicative goal, have proven especially informative in this regard. It seems likely that these patterns are shaped by the environment a speaker is exposed to in c…

    Submitted 18 May, 2022; originally announced May 2022.

    Comments: Proceedings of the Annual Meeting of the Cognitive Science Society (2022)

  8. arXiv:2112.02505

    cs.CL cs.LG

    Causal Distillation for Language Models

    Authors: Zhengxuan Wu, Atticus Geiger, Josh Rozner, Elisa Kreiss, Hanson Lu, Thomas Icard, Christopher Potts, Noah D. Goodman

    Abstract: Distillation efforts have led to language models that are more compact and efficient without serious drops in performance. The standard approach to distillation trains a student model against two objectives: a task-specific objective (e.g., language modeling) and an imitation objective that encourages the hidden states of the student model to be similar to those of the larger teacher model. In thi…

    Submitted 3 June, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: 7 pages, 2 figures

    Journal ref: NAACL 2022

  9. arXiv:2112.00826

    cs.LG

    Inducing Causal Structure for Interpretable Neural Networks

    Authors: Atticus Geiger, Zhengxuan Wu, Hanson Lu, Josh Rozner, Elisa Kreiss, Thomas Icard, Noah D. Goodman, Christopher Potts

    Abstract: In many areas, we have well-founded insights about causal structure that would be useful to bring into our trained models while still allowing them to learn in a data-driven fashion. To achieve this, we present the new method of interchange intervention training (IIT). In IIT, we (1) align variables in a causal model (e.g., a deterministic program or Bayesian network) with representations in a neu…

    Submitted 20 July, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

  10. arXiv:2109.08994

    cs.CL

    ReaSCAN: Compositional Reasoning in Language Grounding

    Authors: Zhengxuan Wu, Elisa Kreiss, Desmond C. Ong, Christopher Potts

    Abstract: The ability to compositionally map language to referents, relations, and actions is an essential component of language understanding. The recent gSCAN dataset (Ruis et al. 2020, NeurIPS) is an inspiring attempt to assess the capacity of models to learn this kind of grounding in scenarios involving navigational instructions. However, we show that gSCAN's highly constrained design means that it does…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: 26 pages, 8 figures

    Journal ref: NeurIPS 2021 (Dataset and Benchmarks Track)

  11. arXiv:2104.08376

    cs.CL

    Concadia: Towards Image-Based Text Generation with a Purpose

    Authors: Elisa Kreiss, Fei Fang, Noah D. Goodman, Christopher Potts

    Abstract: Current deep learning models often achieve excellent results on benchmark image-to-text datasets but fail to generate texts that are useful in practice. We argue that to close this gap, it is vital to distinguish descriptions from captions based on their distinct communicative roles. Descriptions focus on visual features and are meant to replace an image (often to increase accessibility), whereas…

    Submitted 27 October, 2022; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: Proceedings of EMNLP 2022

  12. arXiv:2006.09589

    cs.CL

    Modeling Subjective Assessments of Guilt in Newspaper Crime Narratives

    Authors: Elisa Kreiss, Zijian Wang, Christopher Potts

    Abstract: Crime reporting is a prevalent form of journalism with the power to shape public perceptions and social policies. How does the language of these reports act on readers? We seek to address this question with the SuspectGuilt Corpus of annotated crime stories from English-language newspapers in the U.S. For SuspectGuilt, annotators read short crime articles and provided text-level ratings concerning…

    Submitted 14 October, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

    Comments: CoNLL 2020

  13. arXiv:1903.08237

    cs.CL

    When redundancy is useful: A Bayesian approach to 'overinformative' referring expressions

    Authors: Judith Degen, Robert D. Hawkins, Caroline Graf, Elisa Kreiss, Noah D. Goodman

    Abstract: Referring is one of the most basic and prevalent uses of language. How do speakers choose from the wealth of referring expressions at their disposal? Rational theories of language use have come under attack for decades for not being able to account for the seemingly irrational overinformativeness ubiquitous in referring expressions. Here we present a novel production model of referring expressions…

    Submitted 10 December, 2019; v1 submitted 19 March, 2019; originally announced March 2019.