-
Online Time Series Anomaly Detection with State Space Gaussian Processes
Authors:
Christian Bock,
François-Xavier Aubet,
Jan Gasthaus,
Andrey Kan,
Ming Chen,
Laurent Callot
Abstract:
We propose r-ssGPFA, an unsupervised online anomaly detection model for uni- and multivariate time series building on the efficient state space formulation of Gaussian processes. For high-dimensional time series, we propose an extension of Gaussian process factor analysis to identify the common latent processes of the time series, allowing us to detect anomalies efficiently in an interpretable manner. We gain explainability while speeding up computations by imposing an orthogonality constraint on the mapping from the latent to the observed. Our model's robustness is improved by using a simple heuristic to skip Kalman updates when encountering anomalous observations. We investigate the behaviour of our model on synthetic data and show on standard benchmark datasets that our method is competitive with state-of-the-art methods while being computationally cheaper.
Submitted 18 January, 2022;
originally announced January 2022.
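The idea of skipping Kalman updates on anomalous observations, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not the r-ssGPFA model: it assumes a one-dimensional random-walk state, and the function name, the noise variances `q` and `r`, and the `threshold` on the standardised innovation are all hypothetical choices made for illustration.

```python
import math


def robust_local_level_filter(ys, q=0.01, r=0.1, threshold=3.0):
    """1-D Kalman filter (random-walk state) that skips updates on outliers.

    q, r and threshold are illustrative values, not taken from the paper.
    Returns the filtered means and a list of anomaly flags.
    """
    mean, var = ys[0], 1.0  # crude initialisation from the first observation
    means, flags = [], []
    for y in ys:
        var_pred = var + q                 # predict step: variance grows by q
        s = var_pred + r                   # innovation variance
        z = abs(y - mean) / math.sqrt(s)   # standardised innovation
        is_anomaly = z > threshold
        if is_anomaly:
            var = var_pred                 # skip the update, keep the prediction
        else:
            k = var_pred / s               # Kalman gain
            mean = mean + k * (y - mean)
            var = (1.0 - k) * var_pred
        means.append(mean)
        flags.append(is_anomaly)
    return means, flags


series = [0.0, 0.1, -0.1, 0.05, 8.0, 0.0, 0.1]
means, flags = robust_local_level_filter(series)
print(flags)
```

Because the flagged observation never enters the update step, the filtered mean is not dragged towards the outlier, which is the robustness effect the abstract describes.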
-
Uncovering the Topology of Time-Varying fMRI Data using Cubical Persistence
Authors:
Bastian Rieck,
Tristan Yates,
Christian Bock,
Karsten Borgwardt,
Guy Wolf,
Nicholas Turk-Browne,
Smita Krishnaswamy
Abstract:
Functional magnetic resonance imaging (fMRI) is a crucial technology for gaining insights into cognitive processes in humans. Data amassed from fMRI measurements result in volumetric data sets that vary over time. However, analysing such data presents a challenge due to the large degree of noise and person-to-person variation in how information is represented in the brain. To address this challenge, we present a novel topological approach that encodes each time point in an fMRI data set as a persistence diagram of topological features, i.e. high-dimensional voids present in the data. This representation naturally does not rely on voxel-by-voxel correspondence and is robust to noise. We show that these time-varying persistence diagrams can be clustered to find meaningful groupings between participants, and that they are also useful in studying within-subject brain state trajectories of subjects performing a particular task. Here, we apply both clustering and trajectory analysis techniques to a group of participants watching the movie 'Partly Cloudy'. We observe significant differences in both brain state trajectories and overall topological activity between adults and children watching the same movie.
Submitted 22 October, 2020; v1 submitted 14 June, 2020;
originally announced June 2020.
-
Path Imputation Strategies for Signature Models of Irregular Time Series
Authors:
Michael Moor,
Max Horn,
Christian Bock,
Karsten Borgwardt,
Bastian Rieck
Abstract:
The signature transform is a 'universal nonlinearity' on the space of continuous vector-valued paths, and has received attention for use in machine learning on time series. However, real-world temporal data is typically observed at discrete points in time, and must first be transformed into a continuous path before signature techniques can be applied. We make this step explicit by characterising it as an imputation problem, and empirically assess the impact of various imputation strategies when applying signature-based neural nets to irregular time series data. For one of these strategies, Gaussian process (GP) adapters, we propose an extension (GP-PoM) that makes uncertainty information directly available to the subsequent classifier while at the same time preventing costly Monte-Carlo (MC) sampling. In our experiments, we find that the choice of imputation drastically affects shallow signature models, whereas deeper architectures are more robust. Next, we observe that uncertainty-aware predictions (based on GP-PoM or indicator imputations) are beneficial for predictive performance, even compared to the uncertainty-aware training of conventional GP adapters. In conclusion, we have demonstrated that the path construction is indeed crucial for signature models and that our proposed strategy leads to competitive performance in general, while improving robustness of signature models in particular.
Submitted 6 June, 2020; v1 submitted 25 May, 2020;
originally announced May 2020.
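To make the path-construction step in the abstract above concrete, the sketch below computes the depth-2 signature of a path obtained by linearly interpolating irregular observations. Linear interpolation is only one possible choice and is assumed here for illustration; it is not claimed to be one of the strategies the paper evaluates. The function name and the toy data are hypothetical.

```python
def signature_depth2(points):
    """Depth-2 signature of the piecewise-linear path through `points`.

    Returns (level1, level2): level1[i] is the total increment in channel i,
    level2[i][j] is the iterated integral over channel i, then channel j.
    """
    d = len(points[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(points, points[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one linear segment to the path:
        # the second level uses the first level accumulated so far.
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


# Irregularly spaced observations as (time, value) pairs; time is kept as a channel.
observations = [(0.0, 1.0), (0.5, 3.0), (2.0, 2.0)]
level1, level2 = signature_depth2(observations)
print(level1)
print(level2)
```

A different interpolation between the same observations generally changes the second-level terms, which is one way to see why the choice of path construction can matter for signature-based models.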
-
Set Functions for Time Series
Authors:
Max Horn,
Michael Moor,
Christian Bock,
Bastian Rieck,
Karsten Borgwardt
Abstract:
Despite the eminent successes of deep neural networks, many architectures are often hard to transfer to irregularly-sampled and asynchronous time series that commonly occur in real-world datasets, especially in healthcare applications. This paper proposes a novel approach for classifying irregularly-sampled time series with unaligned measurements, focusing on high scalability and data efficiency. Our method SeFT (Set Functions for Time Series) is based on recent advances in differentiable set function learning and is extremely parallelizable with a beneficial memory footprint, thus scaling well to large datasets of long time series and online monitoring scenarios. Furthermore, our approach permits quantifying per-observation contributions to the classification outcome. We extensively compare our method with existing algorithms on multiple healthcare time series datasets and demonstrate that it performs competitively whilst significantly reducing runtime.
Submitted 14 September, 2020; v1 submitted 26 September, 2019;
originally announced September 2019.
-
Machine learning for early prediction of circulatory failure in the intensive care unit
Authors:
Stephanie L. Hyland,
Martin Faltys,
Matthias Hüser,
Xinrui Lyu,
Thomas Gumbsch,
Cristóbal Esteban,
Christian Bock,
Max Horn,
Michael Moor,
Bastian Rieck,
Marc Zimmermann,
Dean Bodenham,
Karsten Borgwardt,
Gunnar Rätsch,
Tobias M. Merz
Abstract:
Intensive care clinicians are presented with large quantities of patient information and measurements from a multitude of monitoring systems. The limited ability of humans to process such complex information hinders physicians from readily recognizing and acting on early signs of patient deterioration. We used machine learning to develop an early warning system for circulatory failure based on a high-resolution ICU database with 240 patient years of data. This automatic system predicts 90.0% of circulatory failure events (prevalence 3.1%), with 81.8% identified more than two hours in advance, resulting in an area under the receiver operating characteristic curve of 94.0% and area under the precision-recall curve of 63.0%. The model was externally validated in a large independent patient cohort.
Submitted 19 April, 2019; v1 submitted 16 April, 2019;
originally announced April 2019.
-
Neural Persistence: A Complexity Measure for Deep Neural Networks Using Algebraic Topology
Authors:
Bastian Rieck,
Matteo Togninalli,
Christian Bock,
Michael Moor,
Max Horn,
Thomas Gumbsch,
Karsten Borgwardt
Abstract:
While many approaches to make neural networks more fathomable have been proposed, they are restricted to interrogating the network with input data. Measures for characterizing and monitoring structural properties, however, have not been developed. In this work, we propose neural persistence, a complexity measure for neural network architectures based on topological data analysis on weighted stratified graphs. To demonstrate the usefulness of our approach, we show that neural persistence reflects best practices developed in the deep learning community such as dropout and batch normalization. Moreover, we derive a neural persistence-based stopping criterion that shortens the training process while achieving accuracies comparable to early stopping based on validation loss.
Submitted 27 September, 2019; v1 submitted 23 December, 2018;
originally announced December 2018.