Search | arXiv e-print repository

LCE: An Augmented Combination of Bagging and Boosting in Python

Authors: Kevin Fauvel, Élisa Fromont, Véronique Masson, Philippe Faverdin, Alexandre Termier

Abstract: lcensemble is a high-performing, scalable and user-friendly Python package for the general tasks of classification and regression. The package implements Local Cascade Ensemble (LCE), a machine learning method that further enhances the prediction performance of the current state-of-the-art methods Random Forest and XGBoost. LCE combines their strengths and adopts a complementary diversification ap… ▽ More lcensemble is a high-performing, scalable and user-friendly Python package for the general tasks of classification and regression. The package implements Local Cascade Ensemble (LCE), a machine learning method that further enhances the prediction performance of the current state-of-the-art methods Random Forest and XGBoost. LCE combines their strengths and adopts a complementary diversification approach to obtain a better generalizing predictor. The package is compatible with scikit-learn, therefore it can interact with scikit-learn pipelines and model selection tools. It is distributed under the Apache 2.0 license, and its source code is available at https://github.com/LocalCascadeEnsemble/LCE. △ Less

Submitted 15 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

arXiv:2202.05535 [pdf, other]

A Lightweight, Efficient and Explainable-by-Design Convolutional Neural Network for Internet Traffic Classification

Authors: Kevin Fauvel, Fuxing Chen, Dario Rossi

Abstract: Traffic classification, i.e. the identification of the type of applications flowing in a network, is a strategic task for numerous activities (e.g., intrusion detection, routing). This task faces some critical challenges that current deep learning approaches do not address. The design of current approaches do not take into consideration the fact that networking hardware (e.g., routers) often runs… ▽ More Traffic classification, i.e. the identification of the type of applications flowing in a network, is a strategic task for numerous activities (e.g., intrusion detection, routing). This task faces some critical challenges that current deep learning approaches do not address. The design of current approaches do not take into consideration the fact that networking hardware (e.g., routers) often runs with limited computational resources. Further, they do not meet the need for faithful explainability highlighted by regulatory bodies. Finally, these traffic classifiers are evaluated on small datasets which fail to reflect the diversity of applications in real-world settings. Therefore, this paper introduces a new Lightweight, Efficient and eXplainable-by-design convolutional neural network (LEXNet) for Internet traffic classification, which relies on a new residual block (for lightweight and efficiency purposes) and prototype layer (for explainability). Based on a commercial-grade dataset, our evaluation shows that LEXNet succeeds to maintain the same accuracy as the best performing state-of-the-art neural network, while providing the additional features previously mentioned. Moreover, we illustrate the explainability feature of our approach, which stems from the communication of detected application prototypes to the end-user, and we highlight the faithfulness of LEXNet explanations through a comparison with post hoc methods. △ Less

Submitted 5 June, 2023; v1 submitted 11 February, 2022; originally announced February 2022.

Comments: Accepted for publication in the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, USA, 2023

arXiv:2009.04796 [pdf, other]

doi 10.3390/math9233137

XCM: An Explainable Convolutional Neural Network for Multivariate Time Series Classification

Authors: Kevin Fauvel, Tao Lin, Véronique Masson, Élisa Fromont, Alexandre Termier

Abstract: Multivariate Time Series (MTS) classification has gained importance over the past decade with the increase in the number of temporal datasets in multiple domains. The current state-of-the-art MTS classifier is a heavyweight deep learning approach, which outperforms the second-best MTS classifier only on large datasets. Moreover, this deep learning approach cannot provide faithful explanations as i… ▽ More Multivariate Time Series (MTS) classification has gained importance over the past decade with the increase in the number of temporal datasets in multiple domains. The current state-of-the-art MTS classifier is a heavyweight deep learning approach, which outperforms the second-best MTS classifier only on large datasets. Moreover, this deep learning approach cannot provide faithful explanations as it relies on post hoc model-agnostic explainability methods, which could prevent its use in numerous applications. In this paper, we present XCM, an eXplainable Convolutional neural network for MTS classification. XCM is a new compact convolutional neural network which extracts information relative to the observed variables and time directly from the input data. Thus, XCM architecture enables a good generalization ability on both large and small datasets, while allowing the full exploitation of a faithful post hoc model-specific explainability method (Gradient-weighted Class Activation Map**) by precisely identifying the observed variables and timestamps of the input data that are important for predictions. We first show that XCM outperforms the state-of-the-art MTS classifiers on both the large and small public UEA datasets. Then, we illustrate how XCM reconciles performance and explainability on a synthetic dataset and show that XCM enables a more precise identification of the regions of the input data that are important for predictions compared to the current deep learning MTS classifier also providing faithful explainability. Finally, we present how XCM can outperform the current most accurate state-of-the-art algorithm on a real-world application while enhancing explainability by providing faithful and more informative explanations. △ Less

Submitted 7 December, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: Accepted for publication in Mathematics. Another machine learning method for multivariate time series classification providing faithful explanations is presented in arXiv:2005.03645

arXiv:2005.14501 [pdf, other]

A Performance-Explainability Framework to Benchmark Machine Learning Methods: Application to Multivariate Time Series Classifiers

Authors: Kevin Fauvel, Véronique Masson, Élisa Fromont

Abstract: Our research aims to propose a new performance-explainability analytical framework to assess and benchmark machine learning methods. The framework details a set of characteristics that systematize the performance-explainability assessment of existing machine learning methods. In order to illustrate the use of the framework, we apply it to benchmark the current state-of-the-art multivariate time se… ▽ More Our research aims to propose a new performance-explainability analytical framework to assess and benchmark machine learning methods. The framework details a set of characteristics that systematize the performance-explainability assessment of existing machine learning methods. In order to illustrate the use of the framework, we apply it to benchmark the current state-of-the-art multivariate time series classifiers. △ Less

Submitted 19 November, 2021; v1 submitted 29 May, 2020; originally announced May 2020.

Comments: In Proceedings of the IJCAI-PRICAI 2020 Workshop on Explainable Artificial Intelligence. An example of this framework in use is available in arXiv:2005.03645

arXiv:2005.03645 [pdf, other]

doi 10.1007/s10618-022-00823-6

XEM: An Explainable-by-Design Ensemble Method for Multivariate Time Series Classification

Authors: Kevin Fauvel, Élisa Fromont, Véronique Masson, Philippe Faverdin, Alexandre Termier

Abstract: We present XEM, an eXplainable-by-design Ensemble method for Multivariate time series classification. XEM relies on a new hybrid ensemble method that combines an explicit boosting-bagging approach to handle the bias-variance trade-off faced by machine learning models and an implicit divide-and-conquer approach to individualize classifier errors on different parts of the training data. Our evaluati… ▽ More We present XEM, an eXplainable-by-design Ensemble method for Multivariate time series classification. XEM relies on a new hybrid ensemble method that combines an explicit boosting-bagging approach to handle the bias-variance trade-off faced by machine learning models and an implicit divide-and-conquer approach to individualize classifier errors on different parts of the training data. Our evaluation shows that XEM outperforms the state-of-the-art MTS classifiers on the public UEA datasets. Furthermore, XEM provides faithful explainability-by-design and manifests robust performance when faced with challenges arising from continuous data collection (different MTS length, missing data and noise). △ Less

Submitted 15 February, 2022; v1 submitted 7 May, 2020; originally announced May 2020.

Comments: Accepted for publication in Data Mining and Knowledge Discovery

Showing 1–5 of 5 results for author: Fauvel, K