Showing 1–23 of 23 results for author: Dreyer, M

Searching in archive cs.

  1. arXiv:2404.10433

    cs.CV cs.AI cs.LG

    Explainable concept mappings of MRI: Revealing the mechanisms underlying deep learning-based brain disease classification

    Authors: Christian Tinauer, Anna Damulina, Maximilian Sackl, Martin Soellradl, Reduan Achtibat, Maximilian Dreyer, Frederik Pahde, Sebastian Lapuschkin, Reinhold Schmidt, Stefan Ropele, Wojciech Samek, Christian Langkammer

    Abstract: Motivation. While recent studies show high accuracy in the classification of Alzheimer's disease using deep neural networks, the underlying learned concepts have not been investigated. Goals. To systematically identify changes in brain regions through concepts learned by the deep neural network for model validation. Approach. Using quantitative R2* maps we separated Alzheimer's patients (n=117…

    Submitted 16 April, 2024; originally announced April 2024.

  2. arXiv:2404.09601

    cs.LG cs.AI cs.CV

    Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression

    Authors: Dilyara Bareeva, Maximilian Dreyer, Frederik Pahde, Wojciech Samek, Sebastian Lapuschkin

    Abstract: Deep Neural Networks are prone to learning and relying on spurious correlations in the training data, which, for high-risk applications, can have fatal consequences. Various approaches to suppress model reliance on harmful features have been proposed that can be applied post-hoc without additional training. Whereas those methods can be applied with efficiency, they also tend to harm model performa…

    Submitted 15 April, 2024; originally announced April 2024.

  3. arXiv:2404.06453

    cs.CV cs.AI cs.LG

    PURE: Turning Polysemantic Neurons Into Pure Features by Identifying Relevant Circuits

    Authors: Maximilian Dreyer, Erblina Purelku, Johanna Vielhaben, Wojciech Samek, Sebastian Lapuschkin

    Abstract: The field of mechanistic interpretability aims to study the role of individual neurons in Deep Neural Networks. Single neurons, however, have the capability to act polysemantically and encode for multiple (unrelated) features, which renders their interpretation difficult. We present a method for disentangling polysemanticity of any Deep Neural Network by decomposing a polysemantic neuron into mult…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 14 pages (4 pages manuscript, 2 pages references, 8 pages appendix)

  4. arXiv:2402.18479

    cs.CL

    NewsQs: Multi-Source Question Generation for the Inquiring Mind

    Authors: Alyssa Hwang, Kalpit Dixit, Miguel Ballesteros, Yassine Benajiba, Vittorio Castelli, Markus Dreyer, Mohit Bansal, Kathleen McKeown

    Abstract: We present NewsQs (news-cues), a dataset that provides question-answer pairs for multiple news documents. To create NewsQs, we augment a traditional multi-document summarization dataset with questions automatically generated by a T5-Large model fine-tuned on FAQ-style news articles from the News On the Web corpus. We show that fine-tuning a model with control codes produces questions that are judg…

    Submitted 15 June, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: minor wording change

  5. arXiv:2402.05602

    cs.CL cs.AI cs.CV cs.LG

    AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers

    Authors: Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer, Aakriti Jain, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek

    Abstract: Large Language Models are prone to biased predictions and hallucinations, underlining the paramount importance of understanding their model-internal reasoning process. However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to ha…

    Submitted 10 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  6. arXiv:2311.16681

    cs.CV cs.AI

    Understanding the (Extra-)Ordinary: Validating Deep Model Decisions with Prototypical Concept-based Explanations

    Authors: Maximilian Dreyer, Reduan Achtibat, Wojciech Samek, Sebastian Lapuschkin

    Abstract: Ensuring both transparency and safety is critical when deploying Deep Neural Networks (DNNs) in high-risk applications, such as medicine. The field of explainable AI (XAI) has proposed various methods to comprehend the decision-making processes of opaque DNNs. However, only few XAI methods are suitable of ensuring safety in practice as they heavily rely on repeated labor-intensive and possibly bia…

    Submitted 29 April, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: 39 pages (8 pages manuscript, 3 pages references, 28 pages appendix)

  7. arXiv:2310.16197

    cs.CL

    Background Summarization of Event Timelines

    Authors: Adithya Pratapa, Kevin Small, Markus Dreyer

    Abstract: Generating concise summaries of news events is a challenging natural language processing task. While journalists often curate timelines to highlight key sub-events, newcomers to a news event face challenges in catching up on its historical context. In this paper, we address this need by introducing the task of background news summarization, which complements each timeline update with a background…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 camera-ready

  8. arXiv:2310.10623

    cs.CL cs.AI cs.LG

    Generating Summaries with Controllable Readability Levels

    Authors: Leonardo F. R. Ribeiro, Mohit Bansal, Markus Dreyer

    Abstract: Readability refers to how easily a reader can understand a written text. Several factors affect the readability level, such as the complexity of the text, its subject matter, and the reader's background knowledge. Generating summaries based on different readability levels is critical for enabling knowledge consumption by diverse audiences. However, current text generation approaches lack refined c…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted as an EMNLP 2023 main paper

  9. arXiv:2308.09437

    cs.LG cs.AI cs.CV cs.CY

    From Hope to Safety: Unlearning Biases of Deep Models via Gradient Penalization in Latent Space

    Authors: Maximilian Dreyer, Frederik Pahde, Christopher J. Anders, Wojciech Samek, Sebastian Lapuschkin

    Abstract: Deep Neural Networks are prone to learning spurious correlations embedded in the training data, leading to potentially biased predictions. This poses risks when deploying these models for high-stake decision-making, such as in medical applications. Current methods for post-hoc model correction either require input-level annotations which are only possible for spatially localized biases, or augment…

    Submitted 18 December, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: 35 pages (9 pages manuscript, 2 pages references, 24 pages appendix)

  10. arXiv:2307.01446

    cs.CL cs.LG

    On Conditional and Compositional Language Model Differentiable Prompting

    Authors: Jonathan Pilault, Can Liu, Mohit Bansal, Markus Dreyer

    Abstract: Prompts have been shown to be an effective method to adapt a frozen Pretrained Language Model (PLM) to perform well on downstream tasks. Prompts can be represented by a human-engineered word sequence or by a learned continuous embedding. In this work, we investigate conditional and compositional differentiable prompting. We propose a new model, Prompt Production System (PRopS), which learns to tra…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: Accepted at International Joint Conference on Artificial Intelligence (IJCAI) 2023

  11. arXiv:2303.12641

    cs.CV cs.AI

    Reveal to Revise: An Explainable AI Life Cycle for Iterative Bias Correction of Deep Models

    Authors: Frederik Pahde, Maximilian Dreyer, Wojciech Samek, Sebastian Lapuschkin

    Abstract: State-of-the-art machine learning models often learn spurious correlations embedded in the training data. This poses risks when deploying these models for high-stake decision-making, such as in medical applications like skin cancer detection. To tackle this problem, we propose Reveal to Revise (R2R), a framework entailing the entire eXplainable Artificial Intelligence (XAI) life cycle, enabling pr…

    Submitted 27 March, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

  12. arXiv:2303.03278

    cs.CL cs.AI cs.LG

    Faithfulness-Aware Decoding Strategies for Abstractive Summarization

    Authors: David Wan, Mengwen Liu, Kathleen McKeown, Markus Dreyer, Mohit Bansal

    Abstract: Despite significant progress in understanding and improving faithfulness in abstractive summarization, the question of how decoding strategies affect faithfulness is less studied. We present a systematic study of the effect of generation techniques such as beam search and nucleus sampling on faithfulness in abstractive summarization. We find a consistent trend where beam search with large beam siz…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: EACL 2023 (17 pages)

  13. arXiv:2211.11426

    cs.CV cs.AI cs.LG

    Revealing Hidden Context Bias in Segmentation and Object Detection through Concept-specific Explanations

    Authors: Maximilian Dreyer, Reduan Achtibat, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin

    Abstract: Applying traditional post-hoc attribution methods to segmentation or object detection predictors offers only limited insights, as the obtained feature attribution maps at input level typically resemble the models' predicted segmentation mask or bounding box. In this work, we address the need for more informative explanations for these predictors by proposing the post-hoc eXplainable Artificial Int…

    Submitted 21 November, 2022; originally announced November 2022.

  14. From Attribution Maps to Human-Understandable Explanations through Concept Relevance Propagation

    Authors: Reduan Achtibat, Maximilian Dreyer, Ilona Eisenbraun, Sebastian Bosse, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin

    Abstract: The field of eXplainable Artificial Intelligence (XAI) aims to bring transparency to today's powerful but opaque deep learning models. While local XAI methods explain individual predictions in form of attribution maps, thereby identifying where important features occur (but not providing information about what they represent), global explanation techniques visualize what concepts a model has gener…

    Submitted 6 January, 2024; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: 87 pages (13 pages manuscript, 8 pages references, 66 pages appendix) 63 figures (6 in manuscript, 57 in appendix) 3 tables (in appendix)

    Journal ref: Nature Machine Intelligence (year 2023, volume 5, pages 1006-1019)

  15. arXiv:2205.02170

    cs.CL cs.AI cs.LG

    Efficient Few-Shot Fine-Tuning for Opinion Summarization

    Authors: Arthur Bražinskas, Ramesh Nallapati, Mohit Bansal, Markus Dreyer

    Abstract: Abstractive summarization models are typically pre-trained on large amounts of generic texts, then fine-tuned on tens or hundreds of thousands of annotated samples. However, in opinion summarization, large annotated datasets of reviews paired with reference summaries are not available and would be expensive to create. This calls for fine-tuning methods robust to overfitting on small datasets. In a…

    Submitted 8 May, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: NAACL Findings 2022

  16. arXiv:2204.06508

    cs.CL cs.AI cs.LG

    FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations

    Authors: Leonardo F. R. Ribeiro, Mengwen Liu, Iryna Gurevych, Markus Dreyer, Mohit Bansal

    Abstract: Despite recent improvements in abstractive summarization, most current approaches generate summaries that are not factually consistent with the source document, severely restricting their trust and usage in real-world applications. Recent works have shown promising improvements in factuality error identification using text or dependency arc entailments; however, they do not consider the entire sem…

    Submitted 19 July, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

    Comments: NAACL 2022 (15 pages)

  17. arXiv:2202.03482

    cs.CV cs.AI cs.LG

    Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence

    Authors: Frederik Pahde, Maximilian Dreyer, Leander Weber, Moritz Weckbecker, Christopher J. Anders, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin

    Abstract: With a growing interest in understanding neural network prediction strategies, Concept Activation Vectors (CAVs) have emerged as a popular tool for modeling human-understandable concepts in the latent space. Commonly, CAVs are computed by leveraging linear classifiers optimizing the separability of latent representations of samples with and without a given concept. However, in this paper we show t…

    Submitted 5 February, 2024; v1 submitted 7 February, 2022; originally announced February 2022.

  18. ECQ$^{\text{x}}$: Explainability-Driven Quantization for Low-Bit and Sparse DNNs

    Authors: Daniel Becking, Maximilian Dreyer, Wojciech Samek, Karsten Müller, Sebastian Lapuschkin

    Abstract: The remarkable success of deep neural networks (DNNs) in various applications is accompanied by a significant increase in network parameters and arithmetic operations. Such increases in memory and computational demands make deep learning prohibitive for resource-constrained hardware platforms such as mobile devices. Recent efforts aim to reduce these overheads, while preserving model performance a…

    Submitted 16 February, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

    Comments: 22 pages, 10 figures, 1 table

    Journal ref: xxAI - Beyond Explainable AI, Lecture Notes in Computer Science (LNAI Vol. 13200), Springer International Publishing, 2022

  19. arXiv:2108.02859

    cs.CL

    Evaluating the Tradeoff Between Abstractiveness and Factuality in Abstractive Summarization

    Authors: Markus Dreyer, Mengwen Liu, Feng Nan, Sandeep Atluri, Sujith Ravi

    Abstract: Neural models for abstractive summarization tend to generate output that is fluent and well-formed but lacks semantic faithfulness, or factuality, with respect to the input documents. In this paper, we analyze the tradeoff between abstractiveness and factuality of generated summaries across multiple datasets and models, using extensive human evaluations of factuality. In our analysis, we visualize…

    Submitted 24 April, 2023; v1 submitted 5 August, 2021; originally announced August 2021.

    Comments: Accepted at EACL 2023 (Findings)

  20. arXiv:2104.09500

    cs.CL cs.AI cs.LG

    Transductive Learning for Abstractive News Summarization

    Authors: Arthur Bražinskas, Mengwen Liu, Ramesh Nallapati, Sujith Ravi, Markus Dreyer

    Abstract: Pre-trained and fine-tuned news summarizers are expected to generalize to news articles unseen in the fine-tuning (training) phase. However, these articles often contain specifics, such as new events and people, a summarizer could not learn about in training. This applies to scenarios such as a news publisher training a summarizer on dated news and summarizing incoming recent news. In this work, w…

    Submitted 16 April, 2022; v1 submitted 17 April, 2021; originally announced April 2021.

  21. arXiv:1907.01791

    cs.CL cs.AI cs.LG

    Multi-Task Networks With Universe, Group, and Task Feature Learning

    Authors: Shiva Pentyala, Mengwen Liu, Markus Dreyer

    Abstract: We present methods for multi-task learning that take advantage of natural groupings of related tasks. Task groups may be defined along known properties of the tasks, such as task domain or language. Such task groups represent supervised information at the inter-task level and can be encoded into the model. We investigate two variants of neural network architectures that accomplish this, learning d…

    Submitted 3 July, 2019; originally announced July 2019.

  22. arXiv:1711.00549

    cs.CL cs.AI cs.NE cs.SE

    Just ASK: Building an Architecture for Extensible Self-Service Spoken Language Understanding

    Authors: Anjishnu Kumar, Arpit Gupta, Julian Chan, Sam Tucker, Bjorn Hoffmeister, Markus Dreyer, Stanislav Peshterliev, Ankur Gandhe, Denis Filiminov, Ariya Rastrow, Christian Monson, Agnika Kumar

    Abstract: This paper presents the design of the machine learning architecture that underlies the Alexa Skills Kit (ASK) a large scale Spoken Language Understanding (SLU) Software Development Kit (SDK) that enables developers to extend the capabilities of Amazon's virtual assistant, Alexa. At Amazon, the infrastructure powers over 25,000 skills deployed through the ASK, as well as AWS's Amazon Lex SLU Servic…

    Submitted 2 March, 2018; v1 submitted 1 November, 2017; originally announced November 2017.

    Comments: Published at the 1st Workshop on Conversational AI at NIPS 2017 (NIPS-WCAI)

    MSC Class: 68T50

  23. arXiv:1706.04326

    cs.CL cs.LG

    Transfer Learning for Neural Semantic Parsing

    Authors: Xing Fan, Emilio Monti, Lambert Mathias, Markus Dreyer

    Abstract: The goal of semantic parsing is to map natural language to a machine interpretable meaning representation language (MRL). One of the constraints that limits full exploration of deep learning technologies for semantic parsing is the lack of sufficient annotation training data. In this paper, we propose using sequence-to-sequence in a multi-task setup for semantic parsing with a focus on transfer le…

    Submitted 14 June, 2017; originally announced June 2017.

    Comments: Accepted for ACL Repl4NLP 2017