-
Physics-Informed Bayesian Optimization of Variational Quantum Circuits
Authors:
Kim A. Nicoli,
Christopher J. Anders,
Lena Funcke,
Tobias Hartung,
Karl Jansen,
Stefan Kühn,
Klaus-Robert Müller,
Paolo Stornati,
Pan Kessel,
Shinichi Nakajima
Abstract:
In this paper, we propose a novel and powerful method to harness Bayesian optimization for Variational Quantum Eigensolvers (VQEs) -- a hybrid quantum-classical protocol used to approximate the ground state of a quantum Hamiltonian. Specifically, we derive a VQE-kernel which incorporates important prior information about quantum circuits: the kernel feature map of the VQE-kernel exactly matches th…
▽ More
In this paper, we propose a novel and powerful method to harness Bayesian optimization for Variational Quantum Eigensolvers (VQEs) -- a hybrid quantum-classical protocol used to approximate the ground state of a quantum Hamiltonian. Specifically, we derive a VQE-kernel which incorporates important prior information about quantum circuits: the kernel feature map of the VQE-kernel exactly matches the known functional form of the VQE's objective function and thereby significantly reduces the posterior uncertainty. Moreover, we propose a novel acquisition function for Bayesian optimization called Expected Maximum Improvement over Confident Regions (EMICoRe) which can actively exploit the inductive bias of the VQE-kernel by treating regions with low predictive uncertainty as indirectly ``observed''. As a result, observations at as few as three points in the search domain are sufficient to determine the complete objective function along an entire one-dimensional subspace of the optimization landscape. Our numerical experiments demonstrate that our approach improves over state-of-the-art baselines.
△ Less
Submitted 10 June, 2024;
originally announced June 2024.
-
Towards Fixing Clever-Hans Predictors with Counterfactual Knowledge Distillation
Authors:
Sidney Bender,
Christopher J. Anders,
Pattarawatt Chormai,
Heike Marxfeld,
Jan Herrmann,
Grégoire Montavon
Abstract:
This paper introduces a novel technique called counterfactual knowledge distillation (CFKD) to detect and remove reliance on confounders in deep learning models with the help of human expert feedback. Confounders are spurious features that models tend to rely on, which can result in unexpected errors in regulated or safety-critical domains. The paper highlights the benefit of CFKD in such domains…
▽ More
This paper introduces a novel technique called counterfactual knowledge distillation (CFKD) to detect and remove reliance on confounders in deep learning models with the help of human expert feedback. Confounders are spurious features that models tend to rely on, which can result in unexpected errors in regulated or safety-critical domains. The paper highlights the benefit of CFKD in such domains and shows some advantages of counterfactual explanations over other types of explanations. We propose an experiment scheme to quantitatively evaluate the success of CFKD and different teachers that can give feedback to the model. We also introduce a new metric that is better correlated with true test performance than validation accuracy. The paper demonstrates the effectiveness of CFKD on synthetically augmented datasets and on real-world histopathological datasets.
△ Less
Submitted 3 October, 2023; v1 submitted 2 October, 2023;
originally announced October 2023.
-
From Hope to Safety: Unlearning Biases of Deep Models via Gradient Penalization in Latent Space
Authors:
Maximilian Dreyer,
Frederik Pahde,
Christopher J. Anders,
Wojciech Samek,
Sebastian Lapuschkin
Abstract:
Deep Neural Networks are prone to learning spurious correlations embedded in the training data, leading to potentially biased predictions. This poses risks when deploying these models for high-stake decision-making, such as in medical applications. Current methods for post-hoc model correction either require input-level annotations which are only possible for spatially localized biases, or augment…
▽ More
Deep Neural Networks are prone to learning spurious correlations embedded in the training data, leading to potentially biased predictions. This poses risks when deploying these models for high-stake decision-making, such as in medical applications. Current methods for post-hoc model correction either require input-level annotations which are only possible for spatially localized biases, or augment the latent feature space, thereby ho** to enforce the right reasons. We present a novel method for model correction on the concept level that explicitly reduces model sensitivity towards biases via gradient penalization. When modeling biases via Concept Activation Vectors, we highlight the importance of choosing robust directions, as traditional regression-based approaches such as Support Vector Machines tend to result in diverging directions. We effectively mitigate biases in controlled and real-world settings on the ISIC, Bone Age, ImageNet and CelebA datasets using VGG, ResNet and EfficientNet architectures. Code is available on https://github.com/frederikpahde/rrclarc.
△ Less
Submitted 18 December, 2023; v1 submitted 18 August, 2023;
originally announced August 2023.
-
Detecting and Mitigating Mode-Collapse for Flow-based Sampling of Lattice Field Theories
Authors:
Kim A. Nicoli,
Christopher J. Anders,
Tobias Hartung,
Karl Jansen,
Pan Kessel,
Shinichi Nakajima
Abstract:
We study the consequences of mode-collapse of normalizing flows in the context of lattice field theory. Normalizing flows allow for independent sampling. For this reason, it is hoped that they can avoid the tunneling problem of local-update MCMC algorithms for multi-modal distributions. In this work, we first point out that the tunneling problem is also present for normalizing flows but is shifted…
▽ More
We study the consequences of mode-collapse of normalizing flows in the context of lattice field theory. Normalizing flows allow for independent sampling. For this reason, it is hoped that they can avoid the tunneling problem of local-update MCMC algorithms for multi-modal distributions. In this work, we first point out that the tunneling problem is also present for normalizing flows but is shifted from the sampling to the training phase of the algorithm. Specifically, normalizing flows often suffer from mode-collapse for which the training process assigns vanishingly low probability mass to relevant modes of the physical distribution. This may result in a significant bias when the flow is used as a sampler in a Markov-Chain or with Importance Sampling. We propose a metric to quantify the degree of mode-collapse and derive a bound on the resulting bias. Furthermore, we propose various mitigation strategies in particular in the context of estimating thermodynamic observables, such as the free energy.
△ Less
Submitted 3 November, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
-
Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence
Authors:
Frederik Pahde,
Maximilian Dreyer,
Leander Weber,
Moritz Weckbecker,
Christopher J. Anders,
Thomas Wiegand,
Wojciech Samek,
Sebastian Lapuschkin
Abstract:
With a growing interest in understanding neural network prediction strategies, Concept Activation Vectors (CAVs) have emerged as a popular tool for modeling human-understandable concepts in the latent space. Commonly, CAVs are computed by leveraging linear classifiers optimizing the separability of latent representations of samples with and without a given concept. However, in this paper we show t…
▽ More
With a growing interest in understanding neural network prediction strategies, Concept Activation Vectors (CAVs) have emerged as a popular tool for modeling human-understandable concepts in the latent space. Commonly, CAVs are computed by leveraging linear classifiers optimizing the separability of latent representations of samples with and without a given concept. However, in this paper we show that such a separability-oriented computation leads to solutions, which may diverge from the actual goal of precisely modeling the concept direction. This discrepancy can be attributed to the significant influence of distractor directions, i.e., signals unrelated to the concept, which are picked up by filters (i.e., weights) of linear models to optimize class-separability. To address this, we introduce pattern-based CAVs, solely focussing on concept signals, thereby providing more accurate concept directions. We evaluate various CAV methods in terms of their alignment with the true concept direction and their impact on CAV applications, including concept sensitivity testing and model correction for shortcut behavior caused by data artifacts. We demonstrate the benefits of pattern-based CAVs using the Pediatric Bone Age, ISIC2019, and FunnyBirds datasets with VGG, ResNet, and EfficientNet model architectures.
△ Less
Submitted 5 February, 2024; v1 submitted 7 February, 2022;
originally announced February 2022.
-
Software for Dataset-wide XAI: From Local Explanations to Global Insights with Zennit, CoRelAy, and ViRelAy
Authors:
Christopher J. Anders,
David Neumann,
Wojciech Samek,
Klaus-Robert Müller,
Sebastian Lapuschkin
Abstract:
Deep Neural Networks (DNNs) are known to be strong predictors, but their prediction strategies can rarely be understood. With recent advances in Explainable Artificial Intelligence (XAI), approaches are available to explore the reasoning behind those complex models' predictions. Among post-hoc attribution methods, Layer-wise Relevance Propagation (LRP) shows high performance. For deeper quantitati…
▽ More
Deep Neural Networks (DNNs) are known to be strong predictors, but their prediction strategies can rarely be understood. With recent advances in Explainable Artificial Intelligence (XAI), approaches are available to explore the reasoning behind those complex models' predictions. Among post-hoc attribution methods, Layer-wise Relevance Propagation (LRP) shows high performance. For deeper quantitative analysis, manual approaches exist, but without the right tools they are unnecessarily labor intensive. In this software paper, we introduce three software packages targeted at scientists to explore model reasoning using attribution approaches and beyond: (1) Zennit - a highly customizable and intuitive attribution framework implementing LRP and related approaches in PyTorch, (2) CoRelAy - a framework to easily and quickly construct quantitative analysis pipelines for dataset-wide analyses of explanations, and (3) ViRelAy - a web-application to interactively explore data, attributions, and analysis results. With this, we provide a standardized implementation solution for XAI, to contribute towards more reproducibility in our field.
△ Less
Submitted 28 February, 2023; v1 submitted 24 June, 2021;
originally announced June 2021.
-
Towards Robust Explanations for Deep Neural Networks
Authors:
Ann-Kathrin Dombrowski,
Christopher J. Anders,
Klaus-Robert Müller,
Pan Kessel
Abstract:
Explanation methods shed light on the decision process of black-box classifiers such as deep neural networks. But their usefulness can be compromised because they are susceptible to manipulations. With this work, we aim to enhance the resilience of explanations. We develop a unified theoretical framework for deriving bounds on the maximal manipulability of a model. Based on these theoretical insig…
▽ More
Explanation methods shed light on the decision process of black-box classifiers such as deep neural networks. But their usefulness can be compromised because they are susceptible to manipulations. With this work, we aim to enhance the resilience of explanations. We develop a unified theoretical framework for deriving bounds on the maximal manipulability of a model. Based on these theoretical insights, we present three different techniques to boost robustness against manipulation: training with weight decay, smoothing activation functions, and minimizing the Hessian of the network. Our experimental results confirm the effectiveness of these approaches.
△ Less
Submitted 18 December, 2020;
originally announced December 2020.
-
Fairwashing Explanations with Off-Manifold Detergent
Authors:
Christopher J. Anders,
Plamen Pasliev,
Ann-Kathrin Dombrowski,
Klaus-Robert Müller,
Pan Kessel
Abstract:
Explanation methods promise to make black-box classifiers more transparent. As a result, it is hoped that they can act as proof for a sensible, fair and trustworthy decision-making process of the algorithm and thereby increase its acceptance by the end-users. In this paper, we show both theoretically and experimentally that these hopes are presently unfounded. Specifically, we show that, for any c…
▽ More
Explanation methods promise to make black-box classifiers more transparent. As a result, it is hoped that they can act as proof for a sensible, fair and trustworthy decision-making process of the algorithm and thereby increase its acceptance by the end-users. In this paper, we show both theoretically and experimentally that these hopes are presently unfounded. Specifically, we show that, for any classifier $g$, one can always construct another classifier $\tilde{g}$ which has the same behavior on the data (same train, validation, and test error) but has arbitrarily manipulated explanation maps. We derive this statement theoretically using differential geometry and demonstrate it experimentally for various explanation methods, architectures, and datasets. Motivated by our theoretical insights, we then propose a modification of existing explanation methods which makes them significantly more robust.
△ Less
Submitted 20 July, 2020;
originally announced July 2020.
-
Estimation of Thermodynamic Observables in Lattice Field Theories with Deep Generative Models
Authors:
Kim A. Nicoli,
Christopher J. Anders,
Lena Funcke,
Tobias Hartung,
Karl Jansen,
Pan Kessel,
Shinichi Nakajima,
Paolo Stornati
Abstract:
In this work, we demonstrate that applying deep generative machine learning models for lattice field theory is a promising route for solving problems where Markov Chain Monte Carlo (MCMC) methods are problematic. More specifically, we show that generative models can be used to estimate the absolute value of the free energy, which is in contrast to existing MCMC-based methods which are limited to o…
▽ More
In this work, we demonstrate that applying deep generative machine learning models for lattice field theory is a promising route for solving problems where Markov Chain Monte Carlo (MCMC) methods are problematic. More specifically, we show that generative models can be used to estimate the absolute value of the free energy, which is in contrast to existing MCMC-based methods which are limited to only estimate free energy differences. We demonstrate the effectiveness of the proposed method for two-dimensional $φ^4$ theory and compare it to MCMC-based methods in detailed numerical experiments.
△ Less
Submitted 5 January, 2021; v1 submitted 14 July, 2020;
originally announced July 2020.
-
Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications
Authors:
Wojciech Samek,
Grégoire Montavon,
Sebastian Lapuschkin,
Christopher J. Anders,
Klaus-Robert Müller
Abstract:
With the broader and highly successful usage of machine learning in industry and the sciences, there has been a growing demand for Explainable AI. Interpretability and explanation methods for gaining a better understanding about the problem solving abilities and strategies of nonlinear Machine Learning, in particular, deep neural networks, are therefore receiving increased attention. In this work…
▽ More
With the broader and highly successful usage of machine learning in industry and the sciences, there has been a growing demand for Explainable AI. Interpretability and explanation methods for gaining a better understanding about the problem solving abilities and strategies of nonlinear Machine Learning, in particular, deep neural networks, are therefore receiving increased attention. In this work we aim to (1) provide a timely overview of this active emerging field, with a focus on 'post-hoc' explanations, and explain its theoretical foundations, (2) put interpretability algorithms to a test both from a theory and comparative evaluation perspective using extensive simulations, (3) outline best practice aspects i.e. how to best include interpretation methods into the standard usage of machine learning and (4) demonstrate successful usage of explainable AI in a representative selection of application scenarios. Finally, we discuss challenges and possible future directions of this exciting foundational field of machine learning.
△ Less
Submitted 25 February, 2021; v1 submitted 17 March, 2020;
originally announced March 2020.
-
Finding and Removing Clever Hans: Using Explanation Methods to Debug and Improve Deep Models
Authors:
Christopher J. Anders,
Leander Weber,
David Neumann,
Wojciech Samek,
Klaus-Robert Müller,
Sebastian Lapuschkin
Abstract:
Contemporary learning models for computer vision are typically trained on very large (benchmark) datasets with millions of samples. These may, however, contain biases, artifacts, or errors that have gone unnoticed and are exploitable by the model. In the worst case, the trained model does not learn a valid and generalizable strategy to solve the problem it was trained for, and becomes a 'Clever-Ha…
▽ More
Contemporary learning models for computer vision are typically trained on very large (benchmark) datasets with millions of samples. These may, however, contain biases, artifacts, or errors that have gone unnoticed and are exploitable by the model. In the worst case, the trained model does not learn a valid and generalizable strategy to solve the problem it was trained for, and becomes a 'Clever-Hans' (CH) predictor that bases its decisions on spurious correlations in the training data, potentially yielding an unrepresentative or unfair, and possibly even hazardous predictor. In this paper, we contribute by providing a comprehensive analysis framework based on a scalable statistical analysis of attributions from explanation methods for large data corpora. Based on a recent technique - Spectral Relevance Analysis - we propose the following technical contributions and resulting findings: (a) a scalable quantification of artifactual and poisoned classes where the machine learning models under study exhibit CH behavior, (b) several approaches denoted as Class Artifact Compensation (ClArC), which are able to effectively and significantly reduce a model's CH behavior. I.e., we are able to un-Hans models trained on (poisoned) datasets, such as the popular ImageNet data corpus. We demonstrate that ClArC, defined in a simple theoretical framework, may be implemented as part of a Neural Network's training or fine-tuning process, or in a post-hoc manner by injecting additional layers, preventing any further propagation of undesired CH features, into the network architecture. Using our proposed methods, we provide qualitative and quantitative analyses of the biases and artifacts in various datasets. We demonstrate that these insights can give rise to improved, more representative and fairer models operating on implicitly cleaned data corpora.
△ Less
Submitted 18 December, 2020; v1 submitted 22 December, 2019;
originally announced December 2019.
-
Explanations can be manipulated and geometry is to blame
Authors:
Ann-Kathrin Dombrowski,
Maximilian Alber,
Christopher J. Anders,
Marcel Ackermann,
Klaus-Robert Müller,
Pan Kessel
Abstract:
Explanation methods aim to make neural networks more trustworthy and interpretable. In this paper, we demonstrate a property of explanation methods which is disconcerting for both of these purposes. Namely, we show that explanations can be manipulated arbitrarily by applying visually hardly perceptible perturbations to the input that keep the network's output approximately constant. We establish t…
▽ More
Explanation methods aim to make neural networks more trustworthy and interpretable. In this paper, we demonstrate a property of explanation methods which is disconcerting for both of these purposes. Namely, we show that explanations can be manipulated arbitrarily by applying visually hardly perceptible perturbations to the input that keep the network's output approximately constant. We establish theoretically that this phenomenon can be related to certain geometrical properties of neural networks. This allows us to derive an upper bound on the susceptibility of explanations to manipulations. Based on this result, we propose effective mechanisms to enhance the robustness of explanations.
△ Less
Submitted 25 September, 2019; v1 submitted 19 June, 2019;
originally announced June 2019.