Showing 1–40 of 40 results for author: Montavon, G

Searching in archive cs.
  1. arXiv:2406.07592  [pdf, other]

    cs.LG cs.AI stat.ML

    MambaLRP: Explaining Selective State Space Sequence Models

    Authors: Farnoush Rezaei Jafari, Grégoire Montavon, Klaus-Robert Müller, Oliver Eberle

    Abstract: Recent sequence modeling approaches using Selective State Space Sequence Models, referred to as Mamba models, have seen a surge of interest. These models allow efficient processing of long sequences in linear time and are rapidly being adopted in a wide range of applications such as language modeling, demonstrating promising performance. To foster their reliable use in real-world scenarios, it is…

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2403.07486  [pdf, other]

    cs.LG

    XpertAI: uncovering model strategies for sub-manifolds

    Authors: Simon Letzgus, Klaus-Robert Müller, Grégoire Montavon

    Abstract: In recent years, Explainable AI (XAI) methods have facilitated profound validation and knowledge extraction from ML models. While extensively studied for classification, few XAI solutions have addressed the challenges specific to regression models. In regression, explanations need to be precisely formulated to address specific user queries (e.g. distinguishing between 'Why is the output above 0?'…

    Submitted 12 March, 2024; originally announced March 2024.

  3. arXiv:2401.17441  [pdf, other]

    cs.LG cs.AI stat.ML

    Explaining Predictive Uncertainty by Exposing Second-Order Effects

    Authors: Florian Bley, Sebastian Lapuschkin, Wojciech Samek, Grégoire Montavon

    Abstract: Explainable AI has brought transparency into complex ML blackboxes, enabling, in particular, to identify which features these models use for their predictions. So far, the question of explaining predictive uncertainty, i.e. why a model 'doubts', has been scarcely studied. Our investigation reveals that predictive uncertainty is dominated by second-order effects, involving single features or produc…

    Submitted 30 January, 2024; originally announced January 2024.

    Comments: 12 pages + supplement

  4. arXiv:2310.09091  [pdf, other]

    cs.LG cs.AI cs.CY cs.DL

    Insightful analysis of historical sources at scales beyond human capabilities using unsupervised Machine Learning and XAI

    Authors: Oliver Eberle, Jochen Büttner, Hassan El-Hajj, Grégoire Montavon, Klaus-Robert Müller, Matteo Valleriani

    Abstract: Historical materials are abundant. Yet, piecing together how human knowledge has evolved and spread both diachronically and synchronically remains a challenge that can so far only be very selectively addressed. The vast volume of materials precludes comprehensive studies, given the restricted number of human specialists. However, as large amounts of historical materials are now available in digita…

    Submitted 13 October, 2023; originally announced October 2023.

  5. arXiv:2310.01011  [pdf, other]

    cs.AI

    Towards Fixing Clever-Hans Predictors with Counterfactual Knowledge Distillation

    Authors: Sidney Bender, Christopher J. Anders, Pattarawat Chormai, Heike Marxfeld, Jan Herrmann, Grégoire Montavon

    Abstract: This paper introduces a novel technique called counterfactual knowledge distillation (CFKD) to detect and remove reliance on confounders in deep learning models with the help of human expert feedback. Confounders are spurious features that models tend to rely on, which can result in unexpected errors in regulated or safety-critical domains. The paper highlights the benefit of CFKD in such domains…

    Submitted 3 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

  6. Preemptively Pruning Clever-Hans Strategies in Deep Neural Networks

    Authors: Lorenz Linhardt, Klaus-Robert Müller, Grégoire Montavon

    Abstract: Robustness has become an important consideration in deep learning. With the help of explainable AI, mismatches between an explained model's decision strategy and the user's domain knowledge (e.g. Clever Hans effects) have been identified as a starting point for improving faulty models. However, it is less clear what to do when the user and the explanation agree. In this paper, we demonstrate that…

    Submitted 10 November, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: 18 pages + supplement

  7. arXiv:2303.06365  [pdf, other]

    cs.LG cs.AI cs.CV

    Explainable AI for Time Series via Virtual Inspection Layers

    Authors: Johanna Vielhaben, Sebastian Lapuschkin, Grégoire Montavon, Wojciech Samek

    Abstract: The field of eXplainable Artificial Intelligence (XAI) has greatly advanced in recent years, but progress has mainly been made in computer vision and natural language processing. For time series, where the input is often not interpretable, only limited research on XAI is available. In this work, we put forward a virtual inspection layer, that transforms the time series to an interpretable represen…

    Submitted 11 March, 2023; originally announced March 2023.

    Comments: 13 pages, 7 figures

  8. Disentangled Explanations of Neural Network Predictions by Finding Relevant Subspaces

    Authors: Pattarawat Chormai, Jan Herrmann, Klaus-Robert Müller, Grégoire Montavon

    Abstract: Explainable AI aims to overcome the black-box nature of complex ML models like neural networks by generating explanations for their predictions. Explanations often take the form of a heatmap identifying input features (e.g. pixels) that are relevant to the model's decision. These explanations, however, entangle the potentially multiple factors that enter into the overall complex decision strategy.…

    Submitted 15 April, 2024; v1 submitted 30 December, 2022; originally announced December 2022.

    Comments: 17 pages + supplement

  9. arXiv:2211.12486  [pdf, other]

    cs.LG cs.CV

    Shortcomings of Top-Down Randomization-Based Sanity Checks for Evaluations of Deep Neural Network Explanations

    Authors: Alexander Binder, Leander Weber, Sebastian Lapuschkin, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

    Abstract: While the evaluation of explanations is an important step towards trustworthy models, it needs to be done carefully, and the employed metrics need to be well-understood. Specifically, model randomization testing is often overestimated and regarded as a sole criterion for selecting or discarding certain explanation methods. To address shortcomings of this test, we start by observing an experimental…

    Submitted 22 November, 2022; originally announced November 2022.

    Comments: 23 pages

  10. arXiv:2202.07304  [pdf, other]

    cs.LG

    XAI for Transformers: Better Explanations through Conservative Propagation

    Authors: Ameen Ali, Thomas Schnake, Oliver Eberle, Grégoire Montavon, Klaus-Robert Müller, Lior Wolf

    Abstract: Transformers have become an important workhorse of machine learning, with numerous applications. This necessitates the development of reliable methods for increasing their transparency. Multiple interpretability methods, often based on gradient information, have been proposed. We show that the gradient in a Transformer reflects the function only locally, and thus fails to reliably identify the con…

    Submitted 23 June, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

  11. arXiv:2112.11407  [pdf, other]

    cs.LG cs.AI stat.ML

    Toward Explainable AI for Regression Models

    Authors: Simon Letzgus, Patrick Wagner, Jonas Lederer, Wojciech Samek, Klaus-Robert Müller, Gregoire Montavon

    Abstract: In addition to the impressive predictive power of machine learning (ML) models, more recently, explanation methods have emerged that enable an interpretation of complex non-linear learning models such as deep neural networks. Gaining a better understanding is especially important e.g. for safety-critical ML applications or medical diagnostics etc. While such Explainable AI (XAI) techniques have re…

    Submitted 17 January, 2023; v1 submitted 21 December, 2021; originally announced December 2021.

    Comments: 17 pages, 10 figures, published; changes: 1. references to code and xai-regression.org added (p. 1/2, end of introduction), 2. adjustment of sign-error in restructuring section (p. 8, just above Fig. 4)

    Journal ref: IEEE Signal Processing Magazine (Volume: 39, Issue: 4, July 2022) 40-58

  12. arXiv:2108.10105  [pdf, other]

    astro-ph.EP cs.LG physics.flu-dyn physics.geo-ph

    Deep learning for surrogate modelling of 2D mantle convection

    Authors: Siddhant Agarwal, Nicola Tosi, Pan Kessel, Doris Breuer, Grégoire Montavon

    Abstract: Traditionally, 1D models based on scaling laws have been used to parameterize the convective heat transfer of rocks in the interior of terrestrial planets like Earth, Mars, Mercury and Venus to tackle the computational bottleneck of high-fidelity forward runs in 2D or 3D. However, these are limited in the amount of physics they can model (e.g. depth-dependent material properties) and predict only mean q…

    Submitted 5 November, 2021; v1 submitted 23 August, 2021; originally announced August 2021.

    Journal ref: Physical Review Fluids, vol. 6, no. 11, 2021

  13. Learning Domain Invariant Representations by Joint Wasserstein Distance Minimization

    Authors: Léo Andeol, Yusei Kawakami, Yuichiro Wada, Takafumi Kanamori, Klaus-Robert Müller, Grégoire Montavon

    Abstract: Domain shifts in the training data are common in practical applications of machine learning; they occur for instance when the data is coming from different sources. Ideally, a ML model should work well independently of these shifts, for example, by learning a domain-invariant representation. However, common ML losses do not give strong guarantees on how consistently the ML model performs for diffe…

    Submitted 21 August, 2023; v1 submitted 9 June, 2021; originally announced June 2021.

    Comments: 23 pages + supplement

  14. arXiv:2009.11732  [pdf, other]

    cs.LG cs.AI stat.ML

    A Unifying Review of Deep and Shallow Anomaly Detection

    Authors: Lukas Ruff, Jacob R. Kauffmann, Robert A. Vandermeulen, Grégoire Montavon, Wojciech Samek, Marius Kloft, Thomas G. Dietterich, Klaus-Robert Müller

    Abstract: Deep learning approaches to anomaly detection have recently improved the state of the art in detection performance on complex datasets such as large collections of images or text. These results have sparked a renewed interest in the anomaly detection problem and led to the introduction of a great variety of new methods. With the emergence of numerous such methods, including approaches based on gen…

    Submitted 8 February, 2021; v1 submitted 24 September, 2020; originally announced September 2020.

    Comments: 40 pages; accepted for publication in the Proceedings of the IEEE

    Journal ref: Proceedings of the IEEE (2021) 1-40

  15. arXiv:2008.05903  [pdf, other]

    q-bio.QM cs.LG q-bio.GN stat.ML

    GraphKKE: Graph Kernel Koopman Embedding for Human Microbiome Analysis

    Authors: Kateryna Melnyk, Stefan Klus, Grégoire Montavon, Tim Conrad

    Abstract: More and more diseases have been found to be strongly correlated with disturbances in the microbiome constitution, e.g., obesity, diabetes, or some cancer types. Thanks to modern high-throughput omics technologies, it becomes possible to directly analyze human microbiome and its influence on the health status. Microbial communities are monitored over long periods of time and the associations betwe…

    Submitted 19 November, 2020; v1 submitted 12 August, 2020; originally announced August 2020.

  16. arXiv:2006.10609  [pdf, other]

    cs.LG cs.AI stat.ML

    The Clever Hans Effect in Anomaly Detection

    Authors: Jacob Kauffmann, Lukas Ruff, Grégoire Montavon, Klaus-Robert Müller

    Abstract: The 'Clever Hans' effect occurs when the learned model produces correct predictions based on the 'wrong' features. This effect which undermines the generalization capability of an ML model and goes undetected by standard validation techniques has been frequently observed for supervised learning where the training algorithm leverages spurious correlations in the data. The question whether Clever Ha…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: 17 pages, preprint

  17. arXiv:2006.03589  [pdf, other]

    cs.LG cs.AI stat.ML

    Higher-Order Explanations of Graph Neural Networks via Relevant Walks

    Authors: Thomas Schnake, Oliver Eberle, Jonas Lederer, Shinichi Nakajima, Kristof T. Schütt, Klaus-Robert Müller, Grégoire Montavon

    Abstract: Graph Neural Networks (GNNs) are a popular approach for predicting graph structured data. As GNNs tightly entangle the input graph into the neural network structure, common explainable AI approaches are not applicable. To a large extent, GNNs have remained black-boxes for the user so far. In this paper, we show that GNNs can in fact be naturally explained using higher-order expansions, i.e. by ide…

    Submitted 26 November, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

    Comments: 14 pages + 6 pages supplement

  18. arXiv:2003.07631  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE stat.ML

    Explaining Deep Neural Networks and Beyond: A Review of Methods and Applications

    Authors: Wojciech Samek, Grégoire Montavon, Sebastian Lapuschkin, Christopher J. Anders, Klaus-Robert Müller

    Abstract: With the broader and highly successful usage of machine learning in industry and the sciences, there has been a growing demand for Explainable AI. Interpretability and explanation methods for gaining a better understanding about the problem solving abilities and strategies of nonlinear Machine Learning, in particular, deep neural networks, are therefore receiving increased attention. In this work…

    Submitted 25 February, 2021; v1 submitted 17 March, 2020; originally announced March 2020.

    Comments: 30 pages, 20 figures

  19. Building and Interpreting Deep Similarity Models

    Authors: Oliver Eberle, Jochen Büttner, Florian Kräutli, Klaus-Robert Müller, Matteo Valleriani, Grégoire Montavon

    Abstract: Many learning algorithms such as kernel machines, nearest neighbors, clustering, or anomaly detection, are based on the concept of 'distance' or 'similarity'. Before similarities are used for training an actual machine learning model, we would like to verify that they are bound to meaningful patterns in the data. In this paper, we propose to make similarities interpretable by augmenting them with…

    Submitted 11 March, 2020; originally announced March 2020.

    Comments: 12 pages, 10 figures

  20. Explaining and Interpreting LSTMs

    Authors: Leila Arras, Jose A. Arjona-Medina, Michael Widrich, Grégoire Montavon, Michael Gillhofer, Klaus-Robert Müller, Sepp Hochreiter, Wojciech Samek

    Abstract: While neural networks have acted as a strong unifying force in the design of modern AI systems, the neural network architectures themselves remain highly heterogeneous due to the variety of tasks to be solved. In this chapter, we explore how to adapt the Layer-wise Relevance Propagation (LRP) technique used for explaining the predictions of feed-forward networks to the LSTM architecture used for s…

    Submitted 25 September, 2019; originally announced September 2019.

    Comments: 28 pages, 7 figures, book chapter, In: Explainable AI: Interpreting, Explaining and Visualizing Deep Learning, LNCS volume 11700, Springer 2019. arXiv admin note: text overlap with arXiv:1806.07857

  21. From Clustering to Cluster Explanations via Neural Networks

    Authors: Jacob Kauffmann, Malte Esders, Lukas Ruff, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller

    Abstract: A recent trend in machine learning has been to enrich learned models with the ability to explain their own predictions. The emerging field of Explainable AI (XAI) has so far mainly focused on supervised learning, in particular, deep neural network classifiers. In many practical problems however, label information is not given and the goal is instead to discover the underlying structure of the data…

    Submitted 16 December, 2021; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: 15 pages + supplement

  22. arXiv:1902.10178  [pdf, other]

    cs.AI cs.CV cs.LG cs.NE stat.ML

    Unmasking Clever Hans Predictors and Assessing What Machines Really Learn

    Authors: Sebastian Lapuschkin, Stephan Wäldchen, Alexander Binder, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller

    Abstract: Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighte…

    Submitted 26 February, 2019; originally announced February 2019.

    Comments: Accepted for publication in Nature Communications

  23. arXiv:1808.04260  [pdf, other]

    cs.LG stat.ML

    iNNvestigate neural networks!

    Authors: Maximilian Alber, Sebastian Lapuschkin, Philipp Seegerer, Miriam Hägele, Kristof T. Schütt, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller, Sven Dähne, Pieter-Jan Kindermans

    Abstract: In recent years, deep neural networks have revolutionized many application domains of machine learning and are key components of many critical decision or predictive processes. Therefore, it is crucial that domain specialists can understand and analyze actions and predictions, even of the most complex neural network architectures. Despite these arguments neural networks are often treated as blac…

    Submitted 13 August, 2018; originally announced August 2018.

  24. arXiv:1806.11326  [pdf, other]

    stat.ML cs.LG

    Unsupervised Detection and Explanation of Latent-class Contextual Anomalies

    Authors: Jacob Kauffmann, Grégoire Montavon, Luiz Alberto Lima, Shinichi Nakajima, Klaus-Robert Müller, Nico Görnitz

    Abstract: Detecting and explaining anomalies is a challenging effort. This holds especially true when data exhibits strong dependencies and single measurements need to be assessed and analyzed in their respective context. In this work, we consider scenarios where measurements are non-i.i.d, i.e. where samples are dependent on corresponding discrete latent variables which are connected through some given dep…

    Submitted 29 June, 2018; originally announced June 2018.

  25. arXiv:1806.06926  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Understanding Patch-Based Learning by Explaining Predictions

    Authors: Christopher Anders, Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller

    Abstract: Deep networks are able to learn highly predictive models of video data. Due to video length, a common strategy is to train them on small video snippets. We apply the deep Taylor / LRP technique to understand the deep network's classification decisions, and identify a "border effect": a tendency of the classifier to look mainly at the bordering frames of the input. This effect relates to the step s…

    Submitted 11 June, 2018; originally announced June 2018.

    Comments: 7 pages, 6 figures

  26. Towards Explaining Anomalies: A Deep Taylor Decomposition of One-Class Models

    Authors: Jacob Kauffmann, Klaus-Robert Müller, Grégoire Montavon

    Abstract: A common machine learning task is to discriminate between normal and anomalous data points. In practice, it is not always sufficient to reach high accuracy at this task, one also would like to understand why a given data point has been predicted in a certain way. We present a new principled approach for one-class SVMs that decomposes outlier predictions in terms of input variables. The method firs…

    Submitted 16 May, 2018; originally announced May 2018.

  27. arXiv:1707.06100  [pdf, other]

    cs.CL

    Discovering topics in text datasets by visualizing relevant words

    Authors: Franziska Horn, Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

    Abstract: When dealing with large collections of documents, it is imperative to quickly get an overview of the texts' contents. In this paper we show how this can be achieved by using a clustering algorithm to identify topics in the dataset and then selecting and visualizing relevant words, which distinguish a group of documents from the rest of the texts, to summarize the contents of the documents belongin…

    Submitted 18 July, 2017; originally announced July 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1707.05261

  28. arXiv:1707.05261  [pdf, other]

    cs.CL

    Exploring text datasets by visualizing relevant words

    Authors: Franziska Horn, Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

    Abstract: When working with a new dataset, it is important to first explore and familiarize oneself with it, before applying any advanced machine learning algorithms. However, to the best of our knowledge, no tools exist that quickly and reliably give insight into the contents of a selection of documents with respect to what distinguishes them from other documents belonging to different categories. In this…

    Submitted 17 July, 2017; originally announced July 2017.

  29. Methods for Interpreting and Understanding Deep Neural Networks

    Authors: Grégoire Montavon, Wojciech Samek, Klaus-Robert Müller

    Abstract: This paper provides an entry point to the problem of interpreting a deep neural network model and explaining its predictions. It is based on a tutorial given at ICASSP 2017. It introduces some recently proposed techniques of interpretation, along with theory, tricks and recommendations, to make most efficient use of these techniques on real data. It also discusses a number of practical application…

    Submitted 24 June, 2017; originally announced June 2017.

    Comments: 14 pages, 10 figures

  30. arXiv:1706.07206  [pdf, other]

    cs.CL cs.AI cs.NE stat.ML

    Explaining Recurrent Neural Network Predictions in Sentiment Analysis

    Authors: Leila Arras, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

    Abstract: Recently, a technique called Layer-wise Relevance Propagation (LRP) was shown to deliver insightful explanations in the form of input space relevances for understanding feed-forward neural network classification decisions. In the present work, we extend the usage of LRP to recurrent neural networks. We propose a specific propagation rule applicable to multiplicative connections as they arise in re…

    Submitted 4 August, 2017; v1 submitted 22 June, 2017; originally announced June 2017.

    Comments: 9 pages, 4 figures, accepted for EMNLP'17 Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis (WASSA)

  31. arXiv:1612.07843  [pdf, other]

    cs.CL cs.IR cs.LG stat.ML

    "What is Relevant in a Text Document?": An Interpretable Machine Learning Approach

    Authors: Leila Arras, Franziska Horn, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

    Abstract: Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, allowing to annotate very large text collections, more than could be processed by a human in a lifetime. Besides predicting the text's category very accurately, it is also h…

    Submitted 22 December, 2016; originally announced December 2016.

    Comments: 19 pages, 7 figures

  32. arXiv:1611.08191  [pdf, other]

    stat.ML cs.LG

    Interpreting the Predictions of Complex ML Models by Layer-wise Relevance Propagation

    Authors: Wojciech Samek, Grégoire Montavon, Alexander Binder, Sebastian Lapuschkin, Klaus-Robert Müller

    Abstract: Complex nonlinear models such as deep neural network (DNNs) have become an important tool for image classification, speech recognition, natural language processing, and many other fields of application. These models however lack transparency due to their complex nonlinear structure and to the complex data distributions to which they typically apply. As a result, it is difficult to fully characteri…

    Submitted 24 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  33. arXiv:1606.07298  [pdf, other]

    cs.CL cs.IR cs.LG cs.NE stat.ML

    Explaining Predictions of Non-Linear Classifiers in NLP

    Authors: Leila Arras, Franziska Horn, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

    Abstract: Layer-wise relevance propagation (LRP) is a recently proposed technique for explaining predictions of complex non-linear classifiers in terms of input variables. In this paper, we apply LRP for the first time to natural language processing (NLP). More precisely, we use it to explain the predictions of a convolutional neural network (CNN) trained on a topic categorization task. Our analysis highlig…

    Submitted 23 June, 2016; originally announced June 2016.

    Comments: 7 pages, 3 figures, Paper accepted for 1st Workshop on Representation Learning for NLP at ACL 2016

  34. arXiv:1606.07285  [pdf, other]

    cs.CV cs.NE stat.ML

    Identifying individual facial expressions by deconstructing a neural network

    Authors: Farhad Arbabzadah, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

    Abstract: This paper focuses on the problem of explaining predictions of psychological attributes such as attractiveness, happiness, confidence and intelligence from face photographs using deep neural networks. Since psychological attribute datasets typically suffer from small sample sizes, we apply transfer learning with two base models to avoid overfitting. These models were trained on an age and gender p…

    Submitted 25 June, 2016; v1 submitted 23 June, 2016; originally announced June 2016.

    Comments: 12 pages, 7 figures, Paper accepted for GCPR 2016

  35. arXiv:1604.00825  [pdf, other]

    cs.CV

    Layer-wise Relevance Propagation for Neural Networks with Local Renormalization Layers

    Authors: Alexander Binder, Grégoire Montavon, Sebastian Bach, Klaus-Robert Müller, Wojciech Samek

    Abstract: Layer-wise relevance propagation is a framework which allows to decompose the prediction of a deep neural network computed over a sample, e.g. an image, down to relevance scores for the single input dimensions of the sample such as subpixels of an image. While this approach can be applied directly to generalized linear mappings, product type non-linearities are not covered. This paper proposes an…

    Submitted 4 April, 2016; originally announced April 2016.

  36. Explaining NonLinear Classification Decisions with Deep Taylor Decomposition

    Authors: Grégoire Montavon, Sebastian Bach, Alexander Binder, Wojciech Samek, Klaus-Robert Müller

    Abstract: Nonlinear methods such as Deep Neural Networks (DNNs) are the gold standard for various challenging machine learning problems, e.g., image classification, natural language processing or human action recognition. Although these methods perform impressively well, they have a significant disadvantage, the lack of transparency, limiting the interpretability of the solution and thus the scope of applic…

    Submitted 8 December, 2015; originally announced December 2015.

    Comments: 20 pages, 15 figures

  37. arXiv:1512.00172  [pdf, other]

    cs.CV

    Analyzing Classifiers: Fisher Vectors and Deep Neural Networks

    Authors: Sebastian Bach, Alexander Binder, Grégoire Montavon, Klaus-Robert Müller, Wojciech Samek

    Abstract: Fisher Vector classifiers and Deep Neural Networks (DNNs) are popular and successful algorithms for solving image classification problems. However, both are generally considered 'black box' predictors as the non-linear transformations involved have so far prevented transparent and interpretable reasoning. Recently, a principled technique, Layer-wise Relevance Propagation (LRP), has been developed…

    Submitted 1 December, 2015; originally announced December 2015.

    Comments: 17 pages (10 main document + references, 7 appendix), 1 table, 7 figures, 1 algorithm; submitted to CVPR on 06/11/2015

  38. arXiv:1509.06321  [pdf, other]

    cs.CV

    Evaluating the visualization of what a Deep Neural Network has learned

    Authors: Wojciech Samek, Alexander Binder, Grégoire Montavon, Sebastian Bach, Klaus-Robert Müller

    Abstract: Deep Neural Networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multi-layer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision given a new unseen data sample. Recently, several approaches…

    Submitted 21 September, 2015; originally announced September 2015.

    Comments: 13 pages, 8 Figures

  39. arXiv:1507.01972  [pdf, other]

    stat.ML cs.LG

    Wasserstein Training of Boltzmann Machines

    Authors: Grégoire Montavon, Klaus-Robert Müller, Marco Cuturi

    Abstract: The Boltzmann machine provides a useful framework to learn highly complex, multimodal and multiscale data distributions that occur in the real world. The default method to learn its parameters consists of minimizing the Kullback-Leibler (KL) divergence from training samples to the Boltzmann model. We propose in this work a novel approach for Boltzmann training which assumes that a meaningful metri…

    Submitted 7 July, 2015; originally announced July 2015.

    Comments: 9 pages, 6 figures

  40. arXiv:1203.3783  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Learning Feature Hierarchies with Centered Deep Boltzmann Machines

    Authors: Grégoire Montavon, Klaus-Robert Müller

    Abstract: Deep Boltzmann machines are in principle powerful models for extracting the hierarchical structure of data. Unfortunately, attempts to train layers jointly (without greedy layer-wise pretraining) have been largely unsuccessful. We propose a modification of the learning algorithm that initially recenters the output of the activation functions to zero. This modification leads to a better conditioned…

    Submitted 16 March, 2012; originally announced March 2012.