Skip to main content

Showing 1–4 of 4 results for author: Sterz, H

.
  1. arXiv:2407.01091  [pdf, other

    cs.CL

    M2QA: Multi-domain Multilingual Question Answering

    Authors: Leon Engländer, Hannah Sterz, Clifton Poth, Jonas Pfeiffer, Ilia Kuznetsov, Iryna Gurevych

    Abstract: Generalization and robustness to input variation are core desiderata of machine learning research. Language varies along several axes, most importantly, language instance (e.g. French) and domain (e.g. news). While adapting NLP models to new languages within a single domain, or to new domains within a single language, is widely studied, research in joint adaptation is hampered by the lack of evalu… ▽ More

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2401.16405  [pdf, other

    cs.CL cs.AI cs.LG

    Scaling Sparse Fine-Tuning to Large Language Models

    Authors: Alan Ansell, Ivan Vulić, Hannah Sterz, Anna Korhonen, Edoardo M. Ponti

    Abstract: Large Language Models (LLMs) are difficult to fully fine-tune (e.g., with instructions or human feedback) due to their sheer number of parameters. A family of parameter-efficient sparse fine-tuning methods have proven promising in terms of performance but their memory requirements increase proportionally to the size of the LLMs. In this work, we scale sparse fine-tuning to state-of-the-art LLMs li… ▽ More

    Submitted 2 February, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  3. arXiv:2311.11077  [pdf, other

    cs.CL cs.AI cs.LG

    Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning

    Authors: Clifton Poth, Hannah Sterz, Indraneil Paul, Sukannya Purkayastha, Leon Engländer, Timo Imhof, Ivan Vulić, Sebastian Ruder, Iryna Gurevych, Jonas Pfeiffer

    Abstract: We introduce Adapters, an open-source library that unifies parameter-efficient and modular transfer learning in large language models. By integrating 10 diverse adapter methods into a unified interface, Adapters offers ease of use and flexible configuration. Our library allows researchers and practitioners to leverage adapter modularity through composition blocks, enabling the design of complex ad… ▽ More

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023: Systems Demonstrations

  4. arXiv:2203.13693  [pdf, other

    cs.CL cs.IR

    UKP-SQUARE: An Online Platform for Question Answering Research

    Authors: Tim Baumgärtner, Kexin Wang, Rachneet Sachdeva, Max Eichler, Gregor Geigle, Clifton Poth, Hannah Sterz, Haritz Puerto, Leonardo F. R. Ribeiro, Jonas Pfeiffer, Nils Reimers, Gözde Gül Şahin, Iryna Gurevych

    Abstract: Recent advances in NLP and information retrieval have given rise to a diverse set of question answering tasks that are of different formats (e.g., extractive, abstractive), require different model architectures (e.g., generative, discriminative), and setups (e.g., with or without retrieval). Despite having a large number of powerful, specialized QA pipelines (which we refer to as Skills) that cons… ▽ More

    Submitted 28 March, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: Accepted at ACL 2022 Demo Track