Showing 1–5 of 5 results for author: Holtermann, C

  1. arXiv:2407.02333  [pdf, other]

    cs.CL cs.CV

    Why do LLaVA Vision-Language Models Reply to Images in English?

    Authors: Musashi Hinck, Carolin Holtermann, Matthew Lyle Olson, Florian Schneider, Sungduk Yu, Anahita Bhiwandiwalla, Anne Lauscher, Shaoyen Tseng, Vasudev Lal

    Abstract: We uncover a surprising multilingual bias occurring in a popular class of multimodal vision-language models (VLMs). Including an image in the query to a LLaVA-style VLM significantly increases the likelihood of the model returning an English response, regardless of the language of the query. This paper investigates the causes of this loss with a two-pronged approach that combines extensive ablatio…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Pre-print

  2. arXiv:2403.03814  [pdf, other]

    cs.CL cs.AI

    Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ

    Authors: Carolin Holtermann, Paul Röttger, Timm Dill, Anne Lauscher

    Abstract: Large language models (LLMs) need to serve everyone, including a global majority of non-English speakers. However, most LLMs today, and open LLMs in particular, are often intended for use in just English (e.g. Llama2, Mistral) or a small handful of high-resource languages (e.g. Mixtral, Qwen). Recent research shows that, despite limits in their intended use, people prompt LLMs in many different la…

    Submitted 6 March, 2024; originally announced March 2024.

  3. arXiv:2401.12756  [pdf, other]

    cs.CL cs.AI

    What the Weight?! A Unified Framework for Zero-Shot Knowledge Composition

    Authors: Carolin Holtermann, Markus Frohmann, Navid Rekabsaz, Anne Lauscher

    Abstract: The knowledge encapsulated in a model is the core factor determining its final performance on downstream tasks. Much research in NLP has focused on efficient methods for storing and adapting different types of knowledge, e.g., in dedicated modularized structures, and on how to effectively combine these, e.g., by learning additional parameters. However, given the many possible options, a thorough u…

    Submitted 25 January, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted to Findings of the ACL: EACL 2024

  4. arXiv:2310.01217  [pdf, other]

    cs.LG cs.AI cs.CL

    ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale

    Authors: Markus Frohmann, Carolin Holtermann, Shahed Masoudian, Anne Lauscher, Navid Rekabsaz

    Abstract: Multi-task learning (MTL) has shown considerable practical benefits, particularly when using language models (LMs). While this is commonly achieved by learning $n$ tasks under a joint optimization procedure, some methods, such as AdapterFusion, divide the problem into two stages: (i) task learning, where knowledge specific to a task is encapsulated within sets of parameters (e.g., adapters), and (…

    Submitted 17 May, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of the ACL: ACL 2024

  5. arXiv:2204.04026  [pdf, other]

    cs.CL

    Fair and Argumentative Language Modeling for Computational Argumentation

    Authors: Carolin Holtermann, Anne Lauscher, Simone Paolo Ponzetto

    Abstract: Although much work in NLP has focused on measuring and mitigating stereotypical bias in semantic spaces, research addressing bias in computational argumentation is still in its infancy. In this paper, we address this research gap and conduct a thorough investigation of bias in argumentative language models. To this end, we introduce ABBA, a novel resource for bias measurement specifically tailored…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: ACL 2022