Search | arXiv e-print repository

Interpretable Prognostics with Concept Bottleneck Models

Authors: Florent Forest, Katharina Rombach, Olga Fink

Abstract: Deep learning approaches have recently been extensively explored for the prognostics of industrial assets. However, they still suffer from a lack of interpretability, which hinders their adoption in safety-critical applications. To improve their trustworthiness, explainable AI (XAI) techniques have been applied in prognostics, primarily to quantify the importance of input variables for predicting… ▽ More Deep learning approaches have recently been extensively explored for the prognostics of industrial assets. However, they still suffer from a lack of interpretability, which hinders their adoption in safety-critical applications. To improve their trustworthiness, explainable AI (XAI) techniques have been applied in prognostics, primarily to quantify the importance of input variables for predicting the remaining useful life (RUL) using post-hoc attribution methods. In this work, we propose the application of Concept Bottleneck Models (CBMs), a family of inherently interpretable neural network architectures based on concept explanations, to the task of RUL prediction. Unlike attribution methods, which explain decisions in terms of low-level input features, concepts represent high-level information that is easily understandable by users. Moreover, once verified in actual applications, CBMs enable domain experts to intervene on the concept activations at test-time. We propose using the different degradation modes of an asset as intermediate concepts. Our case studies on the New Commercial Modular AeroPropulsion System Simulation (N-CMAPSS) aircraft engine dataset for RUL prediction demonstrate that the performance of CBMs can be on par or superior to black-box models, while being more interpretable, even when the available labeled concepts are limited. Code available at \href{https://github.com/EPFL-IMOS/concept-prognostics/}{\url{github.com/EPFL-IMOS/concept-prognostics/}}. △ Less

Submitted 27 May, 2024; originally announced May 2024.

Comments: Under review

arXiv:2312.02867 [pdf, other]

Semi-Supervised Health Index Monitoring with Feature Generation and Fusion

Authors: Gaëtan Frusque, Ismail Nejjar, Majid Nabavi, Olga Fink

Abstract: The Health Index (HI) is crucial for evaluating system health, aiding tasks like anomaly detection and predicting remaining useful life for systems demanding high safety and reliability. Tight monitoring is crucial for achieving high precision at a lower cost. Obtaining HI labels in real-world applications is often cost-prohibitive, requiring continuous, precise health measurements. Therefore, it… ▽ More The Health Index (HI) is crucial for evaluating system health, aiding tasks like anomaly detection and predicting remaining useful life for systems demanding high safety and reliability. Tight monitoring is crucial for achieving high precision at a lower cost. Obtaining HI labels in real-world applications is often cost-prohibitive, requiring continuous, precise health measurements. Therefore, it is more convenient to leverage run-to failure datasets that may provide potential indications of machine wear condition, making it necessary to apply semi-supervised tools for HI construction. In this study, we adapt the Deep Semi-supervised Anomaly Detection (DeepSAD) method for HI construction. We use the DeepSAD embedding as a condition indicators to address interpretability challenges and sensitivity to system-specific factors. Then, we introduce a diversity loss to enrich condition indicators. We employ an alternating projection algorithm with isotonic constraints to transform the DeepSAD embedding into a normalized HI with an increasing trend. Validation on the PHME 2010 milling dataset, a recognized benchmark with ground truth HIs demonstrates meaningful HIs estimations. Our contributions create opportunities for more accessible and reliable HI estimation, particularly in cases where obtaining ground truth HI labels is unfeasible. △ Less

Submitted 16 February, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

Comments: 13 pages, 8 figures

arXiv:2312.02826 [pdf, other]

Calibrated Adaptive Teacher for Domain Adaptive Intelligent Fault Diagnosis

Authors: Florent Forest, Olga Fink

Abstract: Intelligent Fault Diagnosis (IFD) based on deep learning has proven to be an effective and flexible solution, attracting extensive research. Deep neural networks can learn rich representations from vast amounts of representative labeled data for various applications. In IFD, they achieve high classification performance from signals in an end-to-end manner, without requiring extensive domain knowle… ▽ More Intelligent Fault Diagnosis (IFD) based on deep learning has proven to be an effective and flexible solution, attracting extensive research. Deep neural networks can learn rich representations from vast amounts of representative labeled data for various applications. In IFD, they achieve high classification performance from signals in an end-to-end manner, without requiring extensive domain knowledge. However, deep learning models usually only perform well on the data distribution they have been trained on. When applied to a different distribution, they may experience performance drops. This is also observed in IFD, where assets are often operated in working conditions different from those in which labeled data have been collected. Unsupervised domain adaptation (UDA) deals with the scenario where labeled data are available in a source domain, and only unlabeled data are available in a target domain, where domains may correspond to operating conditions. Recent methods rely on training with confident pseudo-labels for target samples. However, the confidence-based selection of pseudo-labels is hindered by poorly calibrated confidence estimates in the target domain, primarily due to over-confident predictions, which limits the quality of pseudo-labels and leads to error accumulation. In this paper, we propose a novel UDA method called Calibrated Adaptive Teacher (CAT), where we propose to calibrate the predictions of the teacher network throughout the self-training process, leveraging post-hoc calibration techniques. We evaluate CAT on domain-adaptive IFD and perform extensive experiments on the Paderborn benchmark for bearing fault diagnosis under varying operating conditions. Our proposed method achieves state-of-the-art performance on most transfer tasks. △ Less

Submitted 5 December, 2023; originally announced December 2023.

Comments: 23 pages. Under review

MSC Class: 68T07; 62H30 ACM Class: I.2.6; J.2

arXiv:2201.11069 [pdf, other]

doi 10.1109/ICASSP43922.2022.9747491

Learnable Wavelet Packet Transform for Data-Adapted Spectrograms

Authors: Gaetan Frusque, Olga Fink

Abstract: Capturing high-frequency data concerning the condition of complex systems, e.g. by acoustic monitoring, has become increasingly prevalent. Such high-frequency signals typically contain time dependencies ranging over different time scales and different types of cyclic behaviors. Processing such signals requires careful feature engineering, particularly the extraction of meaningful time-frequency fe… ▽ More Capturing high-frequency data concerning the condition of complex systems, e.g. by acoustic monitoring, has become increasingly prevalent. Such high-frequency signals typically contain time dependencies ranging over different time scales and different types of cyclic behaviors. Processing such signals requires careful feature engineering, particularly the extraction of meaningful time-frequency features. This can be time-consuming and the performance is often dependent on the choice of parameters. To address these limitations, we propose a deep learning framework for learnable wavelet packet transforms, enabling to learn features automatically from data and optimise them with respect to the defined objective function. The learned features can be represented as a spectrogram, containing the important time-frequency information of the dataset. We evaluate the properties and performance of the proposed approach by evaluating its improved spectral leakage and by applying it to an anomaly detection task for acoustic monitoring. △ Less

Submitted 26 January, 2022; originally announced January 2022.

Comments: 4 pages, 3 figures, accepted to ICASSP 2022 conference

Journal ref: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

arXiv:2107.09519 [pdf, other]

Canonical Polyadic Decomposition and Deep Learning for Machine Fault Detection

Authors: Gaetan Frusque, Gabriel Michau, Olga Fink

Abstract: Acoustic monitoring for machine fault detection is a recent and expanding research path that has already provided promising results for industries. However, it is impossible to collect enough data to learn all types of faults from a machine. Thus, new algorithms, trained using data from healthy conditions only, were developed to perform unsupervised anomaly detection. A key issue in the developmen… ▽ More Acoustic monitoring for machine fault detection is a recent and expanding research path that has already provided promising results for industries. However, it is impossible to collect enough data to learn all types of faults from a machine. Thus, new algorithms, trained using data from healthy conditions only, were developed to perform unsupervised anomaly detection. A key issue in the development of these algorithms is the noise in the signals, as it impacts the anomaly detection performance. In this work, we propose a powerful data-driven and quasi non-parametric denoising strategy for spectral data based on a tensor decomposition: the Non-negative Canonical Polyadic (CP) decomposition. This method is particularly adapted for machine emitting stationary sound. We demonstrate in a case study, the Malfunctioning Industrial Machine Investigation and Inspection (MIMII) baseline, how the use of our denoising strategy leads to a sensible improvement of the unsupervised anomaly detection. Such approaches are capable to make sound-based monitoring of industrial processes more reliable. △ Less

Submitted 20 July, 2021; originally announced July 2021.

Comments: 9 pages, 5 figures, conference paper from PHM Society European Conference 2021 (Vol. 6, No. 1)

Journal ref: In PHM Society European Conference (Vol. 6, No. 1, pp. 9-9) 2021, June

arXiv:2104.03613 [pdf, other]

Uncertainty-aware Remaining Useful Life predictor

Authors: Luca Biggio, Alexander Wieland, Manuel Arias Chao, Iason Kastanis, Olga Fink

Abstract: Remaining Useful Life (RUL) estimation is the problem of inferring how long a certain industrial asset can be expected to operate within its defined specifications. Deploying successful RUL prediction methods in real-life applications is a prerequisite for the design of intelligent maintenance strategies with the potential of drastically reducing maintenance costs and machine downtimes. In light o… ▽ More Remaining Useful Life (RUL) estimation is the problem of inferring how long a certain industrial asset can be expected to operate within its defined specifications. Deploying successful RUL prediction methods in real-life applications is a prerequisite for the design of intelligent maintenance strategies with the potential of drastically reducing maintenance costs and machine downtimes. In light of their superior performance in a wide range of engineering fields, Machine Learning (ML) algorithms are natural candidates to tackle the challenges involved in the design of intelligent maintenance systems. In particular, given the potentially catastrophic consequences or substantial costs associated with maintenance decisions that are either too late or too early, it is desirable that ML algorithms provide uncertainty estimates alongside their predictions. However, standard data-driven methods used for uncertainty estimation in RUL problems do not scale well to large datasets or are not sufficiently expressive to model the high-dimensional map** from raw sensor data to RUL estimates. In this work, we consider Deep Gaussian Processes (DGPs) as possible solutions to the aforementioned limitations. We perform a thorough evaluation and comparison of several variants of DGPs applied to RUL predictions. The performance of the algorithms is evaluated on the N-CMAPSS (New Commercial Modular Aero-Propulsion System Simulation) dataset from NASA for aircraft engines. The results show that the proposed methods are able to provide very accurate RUL predictions along with sensible uncertainty estimates, providing more reliable solutions for (safety-critical) real-life industrial applications. △ Less

Submitted 8 April, 2021; originally announced April 2021.

arXiv:2009.14606 [pdf, other]

doi 10.1109/SMC42975.2020.9283002

Improving Generalization of Deep Fault Detection Models in the Presence of Mislabeled Data

Authors: Katharina Rombach, Gabriel Michau, Olga Fink

Abstract: Mislabeled samples are ubiquitous in real-world datasets as rule-based or expert labeling is usually based on incorrect assumptions or subject to biased opinions. Neural networks can "memorize" these mislabeled samples and, as a result, exhibit poor generalization. This poses a critical issue in fault detection applications, where not only the training but also the validation datasets are prone to… ▽ More Mislabeled samples are ubiquitous in real-world datasets as rule-based or expert labeling is usually based on incorrect assumptions or subject to biased opinions. Neural networks can "memorize" these mislabeled samples and, as a result, exhibit poor generalization. This poses a critical issue in fault detection applications, where not only the training but also the validation datasets are prone to contain mislabeled samples. In this work, we propose a novel two-step framework for robust training with label noise. In the first step, we identify outliers (including the mislabeled samples) based on the update in the hypothesis space. In the second step, we propose different approaches to modifying the training data based on the identified outliers and a data augmentation technique. Contrary to previous approaches, we aim at finding a robust solution that is suitable for real-world applications, such as fault detection, where no clean, "noise-free" validation dataset is available. Under an approximate assumption about the upper limit of the label noise, we significantly improve the generalization ability of the model trained under massive label noise. △ Less

Submitted 30 September, 2020; originally announced September 2020.

Comments: 12 pages, 3 figures, 5 tables

ACM Class: I.5.2; I.2.1

arXiv:2008.07815 [pdf, other]

doi 10.1016/j.knosys.2021.106816

Unsupervised Transfer Learning for Anomaly Detection: Application to Complementary Operating Condition Transfer

Authors: Gabriel Michau, Olga Fink

Abstract: Anomaly Detectors are trained on healthy operating condition data and raise an alarm when the measured samples deviate from the training data distribution. This means that the samples used to train the model should be sufficient in quantity and representative of the healthy operating conditions. But for industrial systems subject to changing operating conditions, acquiring such comprehensive sets… ▽ More Anomaly Detectors are trained on healthy operating condition data and raise an alarm when the measured samples deviate from the training data distribution. This means that the samples used to train the model should be sufficient in quantity and representative of the healthy operating conditions. But for industrial systems subject to changing operating conditions, acquiring such comprehensive sets of samples requires a long collection period and delay the point at which the anomaly detector can be trained and put in operation. A solution to this problem is to perform unsupervised transfer learning (UTL), to transfer complementary data between different units. In the literature however, UTL aims at finding common structure between the datasets, to perform clustering or dimensionality reduction. Yet, the task of transferring and combining complementary training data has not been studied. Our proposed framework is designed to transfer complementary operating conditions between different units in a completely unsupervised way to train more robust anomaly detectors. It differs, thereby, from other unsupervised transfer learning works as it focuses on a one-class classification problem. The proposed methodology enables to detect anomalies in operating conditions only experienced by other units. The proposed end-to-end framework uses adversarial deep learning to ensure alignment of the different units' distributions. The framework introduces a new loss, inspired by a dimensionality reduction tool, to enforce the conservation of the inherent variability of each dataset, and uses state-of-the art once-class approach to detect anomalies. We demonstrate the benefit of the proposed framework using three open source datasets. △ Less

Submitted 24 November, 2020; v1 submitted 18 August, 2020; originally announced August 2020.

Comments: 14 pages, 7 figures, 3 tables

arXiv:2005.07078 [pdf, other]

Anomaly Detection And Classification In Time Series With Kervolutional Neural Networks

Authors: Oliver Ammann, Gabriel Michau, Olga Fink

Abstract: Recently, with the development of deep learning, end-to-end neural network architectures have been increasingly applied to condition monitoring signals. They have demonstrated superior performance for fault detection and classification, in particular using convolutional neural networks. Even more recently, an extension of the concept of convolution to the concept of kervolution has been proposed w… ▽ More Recently, with the development of deep learning, end-to-end neural network architectures have been increasingly applied to condition monitoring signals. They have demonstrated superior performance for fault detection and classification, in particular using convolutional neural networks. Even more recently, an extension of the concept of convolution to the concept of kervolution has been proposed with some promising results in image classification tasks. In this paper, we explore the potential of kervolutional neural networks applied to time series data. We demonstrate that using a mixture of convolutional and kervolutional layers improves the model performance. The mixed model is first applied to a classification task in time series, as a benchmark dataset. Subsequently, the proposed mixed architecture is used to detect anomalies in time series data recorded by accelerometers on helicopters. We propose a residual-based anomaly detection approach using a temporal auto-encoder. We demonstrate that mixing kervolutional with convolutional layers in the encoder is more sensitive to variations in the input data and is able to detect anomalous time series in a better way. △ Less

Submitted 14 May, 2020; originally announced May 2020.

Comments: 9 pages, 1 figure, 4 tables

Journal ref: Proceedings of the 30th European Safety and Reliability Conference and 15th Probabilistic Safety Assessment and Management Conference (ESREL2020 PSAM15)

arXiv:2005.07031 [pdf, other]

doi 10.1177/1748006X21994446

Temporal signals to images: Monitoring the condition of industrial assets with deep learning image processing algorithms

Authors: Gabriel Rodriguez Garcia, Gabriel Michau, Mélanie Ducoffe, Jayant Sen Gupta, Olga Fink

Abstract: The ability to detect anomalies in time series is considered highly valuable in numerous application domains. The sequential nature of time series objects is responsible for an additional feature complexity, ultimately requiring specialized approaches in order to solve the task. Essential characteristics of time series, situated outside the time domain, are often difficult to capture with state-of… ▽ More The ability to detect anomalies in time series is considered highly valuable in numerous application domains. The sequential nature of time series objects is responsible for an additional feature complexity, ultimately requiring specialized approaches in order to solve the task. Essential characteristics of time series, situated outside the time domain, are often difficult to capture with state-of-the-art anomaly detection methods when no transformations have been applied to the time series. Inspired by the success of deep learning methods in computer vision, several studies have proposed transforming time series into image-like representations, used as inputs for deep learning models, and have led to very promising results in classification tasks. In this paper, we first review the signal to image encoding approaches found in the literature. Second, we propose modifications to some of their original formulations to make them more robust to the variability in large datasets. Third, we compare them on the basis of a common unsupervised task to demonstrate how the choice of the encoding can impact the results when used in the same deep learning architecture. We thus provide a comparison between six encoding algorithms with and without the proposed modifications. The selected encoding methods are Gramian Angular Field, Markov Transition Field, recurrence plot, grey scale encoding, spectrogram, and scalogram. We also compare the results achieved with the raw signal used as input for another deep learning model. We demonstrate that some encodings have a competitive advantage and might be worth considering within a deep learning framework. The comparison is performed on a dataset collected and released by Airbus SAS, containing highly complex vibration measurements from real helicopter flight tests. The different encodings provide competitive results for anomaly detection. △ Less

Submitted 26 February, 2021; v1 submitted 14 May, 2020; originally announced May 2020.

Comments: 13 pages, 5 figures, 2 tables

arXiv:1912.12502 [pdf, other]

Implicit supervision for fault detection and segmentation of emerging fault types with Deep Variational Autoencoders

Authors: Manuel Arias Chao, Bryan T. Adey, Olga Fink

Abstract: Data-driven fault diagnostics of safety-critical systems often faces the challenge of a complete lack of labeled data associated with faulty system conditions (i.e., fault types) at training time. Since an unknown number and nature of fault types can arise during deployment, data-driven fault diagnostics in this scenario is an open-set learning problem. Most of the algorithms for open-set diagnost… ▽ More Data-driven fault diagnostics of safety-critical systems often faces the challenge of a complete lack of labeled data associated with faulty system conditions (i.e., fault types) at training time. Since an unknown number and nature of fault types can arise during deployment, data-driven fault diagnostics in this scenario is an open-set learning problem. Most of the algorithms for open-set diagnostics are one-class classification and unsupervised algorithms that do not leverage all the available labeled and unlabeled data in the learning algorithm. As a result, their fault detection and segmentation performance (i.e., identifying and separating faults of different types) are sub-optimal. With this work, we propose training a variational autoencoder (VAE) with labeled and unlabeled samples while inducing implicit supervision on the latent representation of the healthy conditions. This, together with a modified sampling process of VAE, creates a compact and informative latent representation that allows good detection and segmentation of unseen fault types using existing one-class and clustering algorithms. We refer to the proposed methodology as "knowledge induced variational autoencoder with adaptive sampling" (KIL-AdaVAE). The fault detection and segmentation capabilities of the proposed methodology are demonstrated in a new simulated case study using the Advanced Geared Turbofan 30000 (AGTF30) dynamical model under real flight conditions. In an extensive comparison, we demonstrate that the proposed method outperforms other learning strategies (supervised learning, supervised learning with embedding and semi-supervised learning) and deep learning algorithms, yielding significant performance improvements on fault detection and fault segmentation. △ Less

Submitted 29 September, 2020; v1 submitted 28 December, 2019; originally announced December 2019.

Comments: 22 pages, 9 Figures

arXiv:1907.09204 [pdf, other]

doi 10.36001/ijphm.2019.v10i4.2613

Domain Adaptation for One-Class Classification: Monitoring the Health of Critical Systems Under Limited Information

Authors: Gabriel Michau, Olga Fink

Abstract: The failure of a complex and safety critical industrial asset can have extremely high consequences. Close monitoring for early detection of abnormal system conditions is therefore required. Data-driven solutions to this problem have been limited for two reasons: First, safety critical assets are designed and maintained to be highly reliable and faults are rare. Fault detection can thus not be solv… ▽ More The failure of a complex and safety critical industrial asset can have extremely high consequences. Close monitoring for early detection of abnormal system conditions is therefore required. Data-driven solutions to this problem have been limited for two reasons: First, safety critical assets are designed and maintained to be highly reliable and faults are rare. Fault detection can thus not be solved with supervised learning. Second, complex industrial systems usually have long lifetime during which they face very different operating conditions. In the early life of the system, the collected data is probably not representative of future operating conditions, making it challenging to train a robust model. In this paper, we propose a methodology to monitor the systems in their early life. To do so, we enhance the training dataset with other units from a fleet, for which longer observations are available. Since each unit has its own specificity, we propose to extract features made independent of their origin by three unsupervised feature alignment techniques. First, using a variational encoder, we impose a shared probabilistic encoder/decoder for both units. Second, we introduce a new loss designed to conserve inter-point spacial relationships between the input and the learned features. Last, we propose to train in an adversarial manner a discriminator on the origin of the features. Once aligned, the features are fed to a one-class classifier to monitor the health of the system. By exploring the different combinations of the proposed alignment strategies, and by testing them on a real case study, a fleet composed of 112 power plants operated in different geographical locations and under very different operating regimes, we demonstrate that this alignment is necessary and beneficial. △ Less

Submitted 30 September, 2019; v1 submitted 22 July, 2019; originally announced July 2019.

arXiv:1907.06481 [pdf, other]

doi 10.1109/ICPHM.2019.8819383

Unsupervised Fault Detection in Varying Operating Conditions

Authors: Gabriel Michau, Olga Fink

Abstract: Training data-driven approaches for complex industrial system health monitoring is challenging. When data on faulty conditions are rare or not available, the training has to be performed in a unsupervised manner. In addition, when the observation period, used for training, is kept short, to be able to monitor the system in its early life, the training data might not be representative of all the sy… ▽ More Training data-driven approaches for complex industrial system health monitoring is challenging. When data on faulty conditions are rare or not available, the training has to be performed in a unsupervised manner. In addition, when the observation period, used for training, is kept short, to be able to monitor the system in its early life, the training data might not be representative of all the system normal operating conditions. In this paper, we propose five approaches to perform fault detection in such context. Two approaches rely on the data from the unit to be monitored only: the baseline is trained on the early life of the unit. An incremental learning procedure tries to learn new operating conditions as they arise. Three other approaches take advantage of data from other similar units within a fleet. In two cases, units are directly compared to each other with similarity measures, and the data from similar units are combined in the training set. We propose, in the third case, a new deep-learning methodology to perform, first, a feature alignment of different units with an Unsupervised Feature Alignment Network (UFAN). Then, features of both units are combined in the training set of the fault detection neural network. The approaches are tested on a fleet comprising 112 units, observed over one year of data. All approaches proposed here are an improvement to the baseline, trained with two months of data only. As units in the fleet are found to be very dissimilar, the new architecture UFAN, that aligns units in the feature space, is outperforming others. △ Less

Submitted 15 July, 2019; originally announced July 2019.

Journal ref: Proceedings of the 2019 IEEE International Conference on Prognostics and Health Management, San Francisco

arXiv:1905.06004 [pdf, other]

Domain Adaptive Transfer Learning for Fault Diagnosis

Authors: Qin Wang, Gabriel Michau, Olga Fink

Abstract: Thanks to digitization of industrial assets in fleets, the ambitious goal of transferring fault diagnosis models fromone machine to the other has raised great interest. Solving these domain adaptive transfer learning tasks has the potential to save large efforts on manually labeling data and modifying models for new machines in the same fleet. Although data-driven methods have shown great potentia… ▽ More Thanks to digitization of industrial assets in fleets, the ambitious goal of transferring fault diagnosis models fromone machine to the other has raised great interest. Solving these domain adaptive transfer learning tasks has the potential to save large efforts on manually labeling data and modifying models for new machines in the same fleet. Although data-driven methods have shown great potential in fault diagnosis applications, their ability to generalize on new machines and new working conditions are limited because of their tendency to overfit to the training set in reality. One promising solution to this problem is to use domain adaptation techniques. It aims to improve model performance on the target new machine. Inspired by its successful implementation in computer vision, we introduced Domain-Adversarial Neural Networks (DANN) to our context, along with two other popular methods existing in previous fault diagnosis research. We then carefully justify the applicability of these methods in realistic fault diagnosis settings, and offer a unified experimental protocol for a fair comparison between domain adaptation methods for fault diagnosis problems. △ Less

Submitted 15 May, 2019; originally announced May 2019.

Comments: Presented at 2019 Prognostics and System Health Management Conference (PHM 2019) in Paris, France

Showing 1–14 of 14 results for author: Fink, O