-
Shaping Up SHAP: Enhancing Stability through Layer-Wise Neighbor Selection
Authors:
Gwladys Kelodjou,
Laurence Rozé,
Véronique Masson,
Luis Galárraga,
Romaric Gaudel,
Maurice Tchuente,
Alexandre Termier
Abstract:
Machine learning techniques, such as deep learning and ensemble methods, are widely used in various domains due to their ability to handle complex real-world tasks. However, their black-box nature has raised multiple concerns about the fairness, trustworthiness, and transparency of computer-assisted decision-making. This has led to the emergence of local post-hoc explainability methods, which offer explanations for individual decisions made by black-box algorithms. Among these methods, Kernel SHAP is widely used due to its model-agnostic nature and its well-founded theoretical framework. Despite these strengths, Kernel SHAP suffers from high instability: different executions of the method with the same inputs can lead to significantly different explanations, which diminishes the relevance of the explanations. The contribution of this paper is two-fold. On the one hand, we show that Kernel SHAP's instability is caused by its stochastic neighbor selection procedure, which we adapt to achieve full stability without compromising explanation fidelity. On the other hand, we show that by restricting the neighbors generation to perturbations of size 1 -- which we call the coalitions of Layer 1 -- we obtain a novel feature-attribution method that is fully stable, computationally efficient, and still meaningful.
Submitted 17 June, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
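The "Layer 1" idea in the abstract above, restricting neighbors to perturbations of size 1, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `layer1_neighbors` and `layer1_scores`, the use of a single background vector to stand in for a "removed" feature, and the simple prediction-difference score are all assumptions made for illustration. The sketch only shows why enumerating size-1 perturbations is deterministic and needs one model call per feature.

```python
import numpy as np

def layer1_neighbors(x, background):
    """Build the M neighbors of x that differ from it in exactly one feature.

    Assumption: a feature is "perturbed" by replacing it with the value
    from a fixed background vector (hypothetical choice for this sketch).
    """
    x = np.asarray(x, dtype=float)
    background = np.asarray(background, dtype=float)
    neighbors = np.tile(x, (x.size, 1))   # one copy of x per feature
    idx = np.arange(x.size)
    neighbors[idx, idx] = background      # row i: only feature i replaced
    return neighbors

def layer1_scores(predict, x, background):
    """Illustrative score: drop in prediction when one feature is perturbed.

    No sampling is involved, so repeated calls give identical results.
    """
    x = np.asarray(x, dtype=float)
    base = predict(x[None, :])[0]
    return base - predict(layer1_neighbors(x, background))

# Toy black box (hypothetical): a linear model with fixed weights.
def linear_model(X):
    return X @ np.array([2.0, -1.0, 0.5])

x = np.array([1.0, 2.0, 3.0])
background = np.zeros(3)
scores = layer1_scores(linear_model, x, background)
print(scores)
```

For a linear model, each score reduces to the weight times the distance of the feature from its background value, which makes the toy output easy to check by hand.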
-
LCE: An Augmented Combination of Bagging and Boosting in Python
Authors:
Kevin Fauvel,
Élisa Fromont,
Véronique Masson,
Philippe Faverdin,
Alexandre Termier
Abstract:
lcensemble is a high-performing, scalable and user-friendly Python package for the general tasks of classification and regression. The package implements Local Cascade Ensemble (LCE), a machine learning method that further enhances the prediction performance of the current state-of-the-art methods Random Forest and XGBoost. LCE combines their strengths and adopts a complementary diversification approach to obtain a better generalizing predictor. The package is compatible with scikit-learn, therefore it can interact with scikit-learn pipelines and model selection tools. It is distributed under the Apache 2.0 license, and its source code is available at https://github.com/LocalCascadeEnsemble/LCE.
Submitted 15 August, 2023; v1 submitted 14 August, 2023;
originally announced August 2023.
-
XCM: An Explainable Convolutional Neural Network for Multivariate Time Series Classification
Authors:
Kevin Fauvel,
Tao Lin,
Véronique Masson,
Élisa Fromont,
Alexandre Termier
Abstract:
Multivariate Time Series (MTS) classification has gained importance over the past decade with the increase in the number of temporal datasets in multiple domains. The current state-of-the-art MTS classifier is a heavyweight deep learning approach, which outperforms the second-best MTS classifier only on large datasets. Moreover, this deep learning approach cannot provide faithful explanations as it relies on post hoc model-agnostic explainability methods, which could prevent its use in numerous applications. In this paper, we present XCM, an eXplainable Convolutional neural network for MTS classification. XCM is a new compact convolutional neural network which extracts information relative to the observed variables and time directly from the input data. Thus, XCM architecture enables a good generalization ability on both large and small datasets, while allowing the full exploitation of a faithful post hoc model-specific explainability method (Gradient-weighted Class Activation Mapping) by precisely identifying the observed variables and timestamps of the input data that are important for predictions. We first show that XCM outperforms the state-of-the-art MTS classifiers on both the large and small public UEA datasets. Then, we illustrate how XCM reconciles performance and explainability on a synthetic dataset and show that XCM enables a more precise identification of the regions of the input data that are important for predictions compared to the current deep learning MTS classifier also providing faithful explainability. Finally, we present how XCM can outperform the current most accurate state-of-the-art algorithm on a real-world application while enhancing explainability by providing faithful and more informative explanations.
Submitted 7 December, 2021; v1 submitted 10 September, 2020;
originally announced September 2020.
-
A Performance-Explainability Framework to Benchmark Machine Learning Methods: Application to Multivariate Time Series Classifiers
Authors:
Kevin Fauvel,
Véronique Masson,
Élisa Fromont
Abstract:
Our research aims to propose a new performance-explainability analytical framework to assess and benchmark machine learning methods. The framework details a set of characteristics that systematize the performance-explainability assessment of existing machine learning methods. In order to illustrate the use of the framework, we apply it to benchmark the current state-of-the-art multivariate time series classifiers.
Submitted 19 November, 2021; v1 submitted 29 May, 2020;
originally announced May 2020.
-
XEM: An Explainable-by-Design Ensemble Method for Multivariate Time Series Classification
Authors:
Kevin Fauvel,
Élisa Fromont,
Véronique Masson,
Philippe Faverdin,
Alexandre Termier
Abstract:
We present XEM, an eXplainable-by-design Ensemble method for Multivariate time series classification. XEM relies on a new hybrid ensemble method that combines an explicit boosting-bagging approach to handle the bias-variance trade-off faced by machine learning models and an implicit divide-and-conquer approach to individualize classifier errors on different parts of the training data. Our evaluation shows that XEM outperforms the state-of-the-art MTS classifiers on the public UEA datasets. Furthermore, XEM provides faithful explainability-by-design and manifests robust performance when faced with challenges arising from continuous data collection (different MTS length, missing data and noise).
Submitted 15 February, 2022; v1 submitted 7 May, 2020;
originally announced May 2020.
-
Kidnapping Model: An Extension of Selten's Game
Authors:
Azhar Iqbal,
Virginie Masson,
Derek Abbott
Abstract:
Selten's game is a kidnapping model where the probability of capturing the kidnapper is independent of whether the hostage has been released or executed. Most often, in view of the elevated sensitivities involved, authorities put greater effort and resources into capturing the kidnapper if the hostage has been executed, in contrast to the case when a ransom is paid to secure the hostage's release. In this paper, we study the asymmetric game when the probability of capturing the kidnapper depends on whether the hostage has been executed or not and find a new uniquely determined perfect equilibrium point in Selten's game.
Submitted 22 September, 2017;
originally announced September 2017.
-
Mining relevant interval rules
Authors:
Thomas Guyet,
René Quiniou,
Véronique Masson
Abstract:
This article extends the method of Garriga et al. for mining relevant rules to numerical attributes by extracting interval-based pattern rules. We propose an algorithm that extracts such rules from numerical datasets using the interval-pattern approach from Kaytoue et al. This algorithm has been implemented and evaluated on real datasets.
Submitted 11 September, 2017;
originally announced September 2017.