-
New directions in the applications of rough path theory
Authors:
Adeline Fermanian,
Terry Lyons,
James Morrill,
Cristopher Salvi
Abstract:
This article provides a concise overview of some of the recent advances in the application of rough path theory to machine learning. Controlled differential equations (CDEs) are discussed as the key mathematical model to describe the interaction of a stream with a physical control system. A collection of iterated integrals known as the signature naturally arises in the description of the response…
▽ More
This article provides a concise overview of some of the recent advances in the application of rough path theory to machine learning. Controlled differential equations (CDEs) are discussed as the key mathematical model to describe the interaction of a stream with a physical control system. A collection of iterated integrals known as the signature naturally arises in the description of the response produced by such interactions. The signature comes equipped with a variety of powerful properties rendering it an ideal feature map for streamed data. We summarise recent advances in the symbiosis between deep learning and CDEs, studying the link with RNNs and culminating with the Neural CDE model. We concluded with a discussion on signature kernel methods.
△ Less
Submitted 9 February, 2023;
originally announced February 2023.
-
Neural Controlled Differential Equations for Online Prediction Tasks
Authors:
James Morrill,
Patrick Kidger,
Lingyi Yang,
Terry Lyons
Abstract:
Neural controlled differential equations (Neural CDEs) are a continuous-time extension of recurrent neural networks (RNNs), achieving state-of-the-art (SOTA) performance at modelling functions of irregular time series. In order to interpret discrete data in continuous time, current implementations rely on non-causal interpolations of the data. This is fine when the whole time series is observed in…
▽ More
Neural controlled differential equations (Neural CDEs) are a continuous-time extension of recurrent neural networks (RNNs), achieving state-of-the-art (SOTA) performance at modelling functions of irregular time series. In order to interpret discrete data in continuous time, current implementations rely on non-causal interpolations of the data. This is fine when the whole time series is observed in advance, but means that Neural CDEs are not suitable for use in \textit{online prediction tasks}, where predictions need to be made in real-time: a major use case for recurrent networks. Here, we show how this limitation may be rectified. First, we identify several theoretical conditions that interpolation schemes for Neural CDEs should satisfy, such as boundedness and uniqueness. Second, we use these to motivate the introduction of new schemes that address these conditions, offering in particular measurability (for online prediction), and smoothness (for speed). Third, we empirically benchmark our online Neural CDE model on three continuous monitoring tasks from the MIMIC-IV medical database: we demonstrate improved performance on all tasks against ODE benchmarks, and on two of the three tasks against SOTA non-ODE benchmarks.
△ Less
Submitted 21 June, 2021;
originally announced June 2021.
-
Neural Rough Differential Equations for Long Time Series
Authors:
James Morrill,
Cristopher Salvi,
Patrick Kidger,
James Foster,
Terry Lyons
Abstract:
Neural controlled differential equations (CDEs) are the continuous-time analogue of recurrent neural networks, as Neural ODEs are to residual networks, and offer a memory-efficient continuous-time way to model functions of potentially irregular time series. Existing methods for computing the forward pass of a Neural CDE involve embedding the incoming time series into path space, often via interpol…
▽ More
Neural controlled differential equations (CDEs) are the continuous-time analogue of recurrent neural networks, as Neural ODEs are to residual networks, and offer a memory-efficient continuous-time way to model functions of potentially irregular time series. Existing methods for computing the forward pass of a Neural CDE involve embedding the incoming time series into path space, often via interpolation, and using evaluations of this path to drive the hidden state. Here, we use rough path theory to extend this formulation. Instead of directly embedding into path space, we instead represent the input signal over small time intervals through its \textit{log-signature}, which are statistics describing how the signal drives a CDE. This is the approach for solving \textit{rough differential equations} (RDEs), and correspondingly we describe our main contribution as the introduction of Neural RDEs. This extension has a purpose: by generalising the Neural CDE approach to a broader class of driving signals, we demonstrate particular advantages for tackling long time series. In this regime, we demonstrate efficacy on problems of length up to 17k observations and observe significant training speed-ups, improvements in model performance, and reduced memory requirements compared to existing approaches.
△ Less
Submitted 21 June, 2021; v1 submitted 17 September, 2020;
originally announced September 2020.
-
A Generalised Signature Method for Multivariate Time Series Feature Extraction
Authors:
James Morrill,
Adeline Fermanian,
Patrick Kidger,
Terry Lyons
Abstract:
The 'signature method' refers to a collection of feature extraction techniques for multivariate time series, derived from the theory of controlled differential equations. There is a great deal of flexibility as to how this method can be applied. On the one hand, this flexibility allows the method to be tailored to specific problems, but on the other hand, can make precise application challenging.…
▽ More
The 'signature method' refers to a collection of feature extraction techniques for multivariate time series, derived from the theory of controlled differential equations. There is a great deal of flexibility as to how this method can be applied. On the one hand, this flexibility allows the method to be tailored to specific problems, but on the other hand, can make precise application challenging. This paper makes two contributions. First, the variations on the signature method are unified into a general approach, the \emph{generalised signature method}, of which previous variations are special cases. A primary aim of this unifying framework is to make the signature method more accessible to any machine learning practitioner, whereas it is now mostly used by specialists. Second, and within this framework, we derive a canonical collection of choices that provide a domain-agnostic starting point. We derive these choices as a result of an extensive empirical study on 26 datasets and go on to show competitive performance against current benchmarks for multivariate time series classification. Finally, to ease practical application, we make our techniques available as part of the open-source [redacted] project.
△ Less
Submitted 6 February, 2021; v1 submitted 1 June, 2020;
originally announced June 2020.
-
Generalised Interpretable Shapelets for Irregular Time Series
Authors:
Patrick Kidger,
James Morrill,
Terry Lyons
Abstract:
The shapelet transform is a form of feature extraction for time series, in which a time series is described by its similarity to each of a collection of `shapelets'. However it has previously suffered from a number of limitations, such as being limited to regularly-spaced fully-observed time series, and having to choose between efficient training and interpretability. Here, we extend the method to…
▽ More
The shapelet transform is a form of feature extraction for time series, in which a time series is described by its similarity to each of a collection of `shapelets'. However it has previously suffered from a number of limitations, such as being limited to regularly-spaced fully-observed time series, and having to choose between efficient training and interpretability. Here, we extend the method to continuous time, and in doing so handle the general case of irregularly-sampled partially-observed multivariate time series. Furthermore, we show that a simple regularisation penalty may be used to train efficiently without sacrificing interpretability. The continuous-time formulation additionally allows for learning the length of each shapelet (previously a discrete object) in a differentiable manner. Finally, we demonstrate that the measure of similarity between time series may be generalised to a learnt pseudometric. We validate our method by demonstrating its performance and interpretability on several datasets; for example we discover (purely from data) that the digits 5 and 6 may be distinguished by the chirality of their bottom loop, and that a kind of spectral gap exists in spoken audio classification.
△ Less
Submitted 29 May, 2020; v1 submitted 28 May, 2020;
originally announced May 2020.
-
Neural Controlled Differential Equations for Irregular Time Series
Authors:
Patrick Kidger,
James Morrill,
James Foster,
Terry Lyons
Abstract:
Neural ordinary differential equations are an attractive option for modelling temporal dynamics. However, a fundamental issue is that the solution to an ordinary differential equation is determined by its initial condition, and there is no mechanism for adjusting the trajectory based on subsequent observations. Here, we demonstrate how this may be resolved through the well-understood mathematics o…
▽ More
Neural ordinary differential equations are an attractive option for modelling temporal dynamics. However, a fundamental issue is that the solution to an ordinary differential equation is determined by its initial condition, and there is no mechanism for adjusting the trajectory based on subsequent observations. Here, we demonstrate how this may be resolved through the well-understood mathematics of \emph{controlled differential equations}. The resulting \emph{neural controlled differential equation} model is directly applicable to the general setting of partially-observed irregularly-sampled multivariate time series, and (unlike previous work on this problem) it may utilise memory-efficient adjoint-based backpropagation even across observations. We demonstrate that our model achieves state-of-the-art performance against similar (ODE or RNN based) models in empirical studies on a range of datasets. Finally we provide theoretical results demonstrating universal approximation, and that our model subsumes alternative ODE models.
△ Less
Submitted 5 November, 2020; v1 submitted 18 May, 2020;
originally announced May 2020.