-
Explainable concept mappings of MRI: Revealing the mechanisms underlying deep learning-based brain disease classification
Authors:
Christian Tinauer,
Anna Damulina,
Maximilian Sackl,
Martin Soellradl,
Reduan Achtibat,
Maximilian Dreyer,
Frederik Pahde,
Sebastian Lapuschkin,
Reinhold Schmidt,
Stefan Ropele,
Wojciech Samek,
Christian Langkammer
Abstract:
Motivation. While recent studies show high accuracy in the classification of Alzheimer's disease using deep neural networks, the underlying learned concepts have not been investigated.
Goals. To systematically identify changes in brain regions through concepts learned by the deep neural network for model validation.
Approach. Using quantitative R2* maps we separated Alzheimer's patients (n=117) from normal controls (n=219) by using a convolutional neural network and systematically investigated the learned concepts using Concept Relevance Propagation and compared these results to a conventional region of interest-based analysis.
Results. In line with established histological findings and the region of interest-based analyses, highly relevant concepts were primarily found in and adjacent to the basal ganglia.
Impact. The identification of concepts learned by deep neural networks for disease classification enables validation of the models and could potentially improve reliability.
Submitted 16 April, 2024;
originally announced April 2024.
-
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
Authors:
Reduan Achtibat,
Sayed Mohammad Vakilzadeh Hatefi,
Maximilian Dreyer,
Aakriti Jain,
Thomas Wiegand,
Sebastian Lapuschkin,
Wojciech Samek
Abstract:
Large Language Models are prone to biased predictions and hallucinations, underlining the paramount importance of understanding their model-internal reasoning process. However, achieving faithful attributions for the entirety of a black-box transformer model and maintaining computational efficiency is an unsolved challenge. By extending the Layer-wise Relevance Propagation attribution method to handle attention layers, we address these challenges effectively. While partial solutions exist, our method is the first to faithfully and holistically attribute not only input but also latent representations of transformer models with the computational efficiency similar to a single backward pass. Through extensive evaluations against existing methods on LLaMa 2, Mixtral 8x7b, Flan-T5 and vision transformer architectures, we demonstrate that our proposed approach surpasses alternative methods in terms of faithfulness and enables the understanding of latent representations, opening up the door for concept-based explanations. We provide an LRP library at https://github.com/rachtibat/LRP-eXplains-Transformers.
Submitted 10 June, 2024; v1 submitted 8 February, 2024;
originally announced February 2024.
-
Understanding the (Extra-)Ordinary: Validating Deep Model Decisions with Prototypical Concept-based Explanations
Authors:
Maximilian Dreyer,
Reduan Achtibat,
Wojciech Samek,
Sebastian Lapuschkin
Abstract:
Ensuring both transparency and safety is critical when deploying Deep Neural Networks (DNNs) in high-risk applications, such as medicine. The field of explainable AI (XAI) has proposed various methods to comprehend the decision-making processes of opaque DNNs. However, only a few XAI methods are suitable for ensuring safety in practice, as they rely heavily on repeated, labor-intensive and possibly biased human assessment. In this work, we present a novel post-hoc concept-based XAI framework that conveys not only instance-wise (local) but also class-wise (global) decision-making strategies via prototypes. What sets our approach apart is the combination of local and global strategies, enabling a clearer understanding of the (dis-)similarities in model decisions compared to the expected (prototypical) concept use, ultimately reducing the dependence on human long-term assessment. Quantifying the deviation from prototypical behavior makes it possible not only to associate predictions with specific model sub-strategies but also to detect outlier behavior. As such, our approach constitutes an intuitive and explainable tool for model validation. We demonstrate the effectiveness of our approach in identifying out-of-distribution samples, spurious model behavior and data quality issues across three datasets (ImageNet, CUB-200, and CIFAR-10) utilizing VGG, ResNet, and EfficientNet architectures. Code is available at https://github.com/maxdreyer/pcx.
Submitted 29 April, 2024; v1 submitted 28 November, 2023;
originally announced November 2023.
-
Revealing Hidden Context Bias in Segmentation and Object Detection through Concept-specific Explanations
Authors:
Maximilian Dreyer,
Reduan Achtibat,
Thomas Wiegand,
Wojciech Samek,
Sebastian Lapuschkin
Abstract:
Applying traditional post-hoc attribution methods to segmentation or object detection predictors offers only limited insights, as the obtained feature attribution maps at input level typically resemble the models' predicted segmentation mask or bounding box. In this work, we address the need for more informative explanations for these predictors by proposing the post-hoc eXplainable Artificial Intelligence method L-CRP to generate explanations that automatically identify and visualize relevant concepts learned, recognized and used by the model during inference as well as precisely locate them in input space. Our method therefore goes beyond singular input-level attribution maps and, as an approach based on the recently published Concept Relevance Propagation technique, is efficiently applicable to state-of-the-art black-box architectures in segmentation and object detection, such as DeepLabV3+ and YOLOv6, among others. We verify the faithfulness of our proposed technique by quantitatively comparing different concept attribution methods, and discuss the effect on explanation complexity on popular datasets such as CityScapes, Pascal VOC and MS COCO 2017. The ability to precisely locate and communicate concepts is used to reveal and verify the use of background features, thereby highlighting possible biases of the model.
Submitted 21 November, 2022;
originally announced November 2022.
-
From Attribution Maps to Human-Understandable Explanations through Concept Relevance Propagation
Authors:
Reduan Achtibat,
Maximilian Dreyer,
Ilona Eisenbraun,
Sebastian Bosse,
Thomas Wiegand,
Wojciech Samek,
Sebastian Lapuschkin
Abstract:
The field of eXplainable Artificial Intelligence (XAI) aims to bring transparency to today's powerful but opaque deep learning models. While local XAI methods explain individual predictions in the form of attribution maps, thereby identifying where important features occur (but not providing information about what they represent), global explanation techniques visualize what concepts a model has generally learned to encode. Both types of methods thus only provide partial insights and leave the burden of interpreting the model's reasoning to the user. In this work we introduce the Concept Relevance Propagation (CRP) approach, which combines the local and global perspectives and thus allows answering both the "where" and "what" questions for individual predictions. We demonstrate the capability of our method in various settings, showcasing that CRP leads to more human-interpretable explanations and provides deep insights into the model's representation and reasoning through concept atlases, concept composition analyses, and quantitative investigations of concept subspaces and their role in fine-grained decision making.
Submitted 6 January, 2024; v1 submitted 7 June, 2022;
originally announced June 2022.