Showing 1–10 of 10 results for author: Schuff, H

  1. arXiv:2407.03974  [pdf, other]

    cs.CL

    LLM Roleplay: Simulating Human-Chatbot Interaction

    Authors: Hovhannes Tamoyan, Hendrik Schuff, Iryna Gurevych

    Abstract: The development of chatbots requires collecting a large number of human-chatbot dialogues to reflect the breadth of users' sociodemographic backgrounds and conversational goals. However, the resource requirements to conduct the respective user studies can be prohibitively high and often only allow for a narrow analysis of specific dialogue goals and participant demographics. In this paper, we prop…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2403.05338  [pdf, other]

    cs.CL

    Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings

    Authors: Wei Zhou, Heike Adel, Hendrik Schuff, Ngoc Thang Vu

    Abstract: Attribution scores indicate the importance of different input parts and can, thus, explain model behaviour. Currently, prompt-based models are gaining popularity, i.a., due to their easier adaptability in low-resource settings. However, the quality of attribution scores extracted from prompt-based models has not been investigated yet. In this work, we address this topic by analyzing attribution sc…

    Submitted 8 March, 2024; originally announced March 2024.

  3. arXiv:2311.07230  [pdf, other]

    cs.CL

    How are Prompts Different in Terms of Sensitivity?

    Authors: Sheng Lu, Hendrik Schuff, Iryna Gurevych

    Abstract: In-context learning (ICL) has become one of the most popular learning paradigms. While there is a growing body of literature focusing on prompt engineering, there is a lack of systematic analysis comparing the effects of prompts across different models and tasks. To address this gap, we present a comprehensive prompt analysis based on the sensitivity of a function. Our analysis reveals that sensit…

    Submitted 20 June, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: NAACL 2024 Main

  4. arXiv:2309.07034  [pdf, other]

    cs.CL cs.AI

    Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting

    Authors: Tilman Beck, Hendrik Schuff, Anne Lauscher, Iryna Gurevych

    Abstract: Annotators' sociodemographic backgrounds (i.e., the individual compositions of their gender, age, educational background, etc.) have a strong impact on their decisions when working on subjective NLP tasks, such as toxic language detection. Often, heterogeneous backgrounds result in high disagreements. To model this variation, recent work has explored sociodemographic prompting, a technique, which…

    Submitted 8 February, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: EACL 2024 camera-ready

  5. arXiv:2305.02679  [pdf, other]

    cs.CL cs.HC

    Neighboring Words Affect Human Interpretation of Saliency Explanations

    Authors: Alon Jacovi, Hendrik Schuff, Heike Adel, Ngoc Thang Vu, Yoav Goldberg

    Abstract: Word-level saliency explanations ("heat maps over words") are often used to communicate feature-attribution in text-based models. Recent studies found that superficial factors such as word length can distort human interpretation of the communicated saliency scores. We conduct a user study to investigate how the marking of a word's neighboring words affect the explainee's perception of the word's i…

    Submitted 6 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of ACL 2023

  6. arXiv:2210.07126  [pdf, other]

    cs.CL cs.AI cs.HC

    Challenges in Explanation Quality Evaluation

    Authors: Hendrik Schuff, Heike Adel, Peng Qi, Ngoc Thang Vu

    Abstract: While much research focused on producing explanations, it is still unclear how the produced explanations' quality can be evaluated in a meaningful way. Today's predominant approach is to quantify explanations using proxy scores which compare explanations to (human-annotated) gold explanations. This approach assumes that explanations which reach higher proxy scores will also provide a greater benef…

    Submitted 9 March, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: 41 pages, 11 figures

  7. arXiv:2201.11569  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Human Interpretation of Saliency-based Explanation Over Text

    Authors: Hendrik Schuff, Alon Jacovi, Heike Adel, Yoav Goldberg, Ngoc Thang Vu

    Abstract: While a lot of research in explainable AI focuses on producing effective explanations, less work is devoted to the question of how people understand and interpret the explanation. In this work, we focus on this question through a study of saliency-based explanations over textual data. Feature-attribution explanations of text models aim to communicate which parts of the input text were more influen…

    Submitted 17 June, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: FAccT 2022

  8. arXiv:2109.07833  [pdf, other]

    cs.CL cs.HC

    Does External Knowledge Help Explainable Natural Language Inference? Automatic Evaluation vs. Human Ratings

    Authors: Hendrik Schuff, Hsiu-Yu Yang, Heike Adel, Ngoc Thang Vu

    Abstract: Natural language inference (NLI) requires models to learn and apply commonsense knowledge. These reasoning abilities are particularly important for explainable NLI systems that generate a natural language explanation in addition to their label prediction. The integration of external knowledge has been shown to improve NLI systems, here we investigate whether it can also improve their explanation c…

    Submitted 13 October, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

    Comments: BlackboxNLP @ EMNLP 2021

  9. arXiv:2107.12220  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Thought Flow Nets: From Single Predictions to Trains of Model Thought

    Authors: Hendrik Schuff, Heike Adel, Ngoc Thang Vu

    Abstract: When humans solve complex problems, they typically create a sequence of ideas (involving an intuitive decision, reflection, error correction, etc.) in order to reach a conclusive decision. Contrary to this, today's models are mostly trained to map an input to one single and fixed output. In this paper, we investigate how we can give models the opportunity of a second, third and $k$-th thought. Tak…

    Submitted 14 March, 2023; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: 15 pages, 7 figures

  10. arXiv:2010.06283  [pdf, other]

    cs.CL

    F1 is Not Enough! Models and Evaluation Towards User-Centered Explainable Question Answering

    Authors: Hendrik Schuff, Heike Adel, Ngoc Thang Vu

    Abstract: Explainable question answering systems predict an answer together with an explanation showing why the answer has been selected. The goal is to enable users to assess the correctness of the system and understand its reasoning process. However, we show that current models and evaluation settings have shortcomings regarding the coupling of answer and explanation which might cause serious issues in us…

    Submitted 13 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020