-
Koopman Ensembles for Probabilistic Time Series Forecasting
Authors:
Anthony Frion,
Lucas Drumetz,
Guillaume Tochon,
Mauro Dalla Mura,
Abdeldjalil Aïssa El Bey
Abstract:
In the context of an increasing popularity of data-driven models to represent dynamical systems, many machine learning-based implementations of the Koopman operator have recently been proposed. However, the vast majority of those works are limited to deterministic predictions, while the knowledge of uncertainty is critical in fields like meteorology and climatology. In this work, we investigate the training of ensembles of models to produce stochastic outputs. We show through experiments on real remote sensing image time series that ensembles of independently trained models are highly overconfident and that using a training criterion that explicitly encourages the members to produce predictions with high inter-model variances greatly improves the uncertainty quantification of the ensembles.
Submitted 13 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds
Authors:
Clément Bonet,
Lucas Drumetz,
Nicolas Courty
Abstract:
While many Machine Learning methods were developed or transposed on Riemannian manifolds to tackle data with known non-Euclidean geometry, Optimal Transport (OT) methods on such spaces have not received much attention. The main OT tool on these spaces is the Wasserstein distance, which suffers from a heavy computational burden. On Euclidean spaces, a popular alternative is the Sliced-Wasserstein distance, which leverages a closed-form solution of the Wasserstein distance in one dimension, but which is not readily available on manifolds. In this work, we derive general constructions of Sliced-Wasserstein distances on Cartan-Hadamard manifolds, i.e. Riemannian manifolds with non-positive curvature, which include among others Hyperbolic spaces or the space of Symmetric Positive Definite matrices. Then, we propose different applications. Additionally, we derive non-parametric schemes to minimize these new distances by approximating their Wasserstein gradient flows.
Submitted 11 March, 2024;
originally announced March 2024.
-
On Transfer in Classification: How Well do Subsets of Classes Generalize?
Authors:
Raphael Baena,
Lucas Drumetz,
Vincent Gripon
Abstract:
In classification, it is usual to observe that models trained on a given set of classes can generalize to previously unseen ones, suggesting the ability to learn beyond the initial task. This ability is often leveraged in the context of transfer learning where a pretrained model can be used to process new classes, with or without fine tuning. Surprisingly, few papers look at the theoretical roots behind this phenomenon. In this work, we are interested in laying the foundations of such a theoretical framework for transferability between sets of classes. Namely, we establish a partially ordered set of subsets of classes. This tool makes it possible to represent which subsets of classes can generalize to others. In a more practical setting, we explore the ability of our framework to predict which subset of classes can lead to the best performance when testing on all of them. We also explore few-shot learning, where transfer is the gold standard. Our work contributes to a better understanding of transfer mechanics and model generalization.
Submitted 6 March, 2024;
originally announced March 2024.
-
Physics Informed and Data Driven Simulation of Underwater Images via Residual Learning
Authors:
Tanmoy Mondal,
Ricardo Mendoza,
Lucas Drumetz
Abstract:
In general, underwater images suffer from color distortion and low contrast, because light is attenuated and backscattered as it propagates through water (differently depending on wavelength and on the properties of the water body). An existing simple degradation model (similar to atmospheric image "hazing" effects), though helpful, is not sufficient to properly represent the underwater image degradation because there are unaccounted-for and non-measurable factors, e.g. scattering of light due to the turbidity of water, reflective characteristics of the turbid medium, etc. We propose a deep learning-based architecture to automatically simulate the underwater effects where only a dehazing-like image formation equation is known to the network, and the additional degradation due to the other unknown factors is inferred in a data-driven way. We only use RGB images (because in a real-time scenario the depth image is not available) to estimate the depth image. For testing, we have proposed (due to the lack of real underwater image datasets) a complex image formation model/equation to manually generate images that resemble real underwater images (used as ground truth). However, only the classical image formation equation (the one used for image dehazing) is provided to the network. This mimics the fact that in a real scenario, the physics are never completely known and only simplified models are known. Thanks to the ground truth, generated by a complex image formation equation, we could successfully perform a qualitative and quantitative evaluation of the proposed technique, compared to other purely data-driven approaches.
Submitted 7 February, 2024;
originally announced February 2024.
-
Time-changed normalizing flows for accurate SDE modeling
Authors:
Naoufal El Bekri,
Lucas Drumetz,
Franck Vermet
Abstract:
The generative paradigm has become increasingly important in machine learning and deep learning models. Among popular generative models are normalizing flows, which enable exact likelihood estimation by transforming a base distribution through diffeomorphic transformations. Extending the normalizing flow framework to handle time-indexed flows gave dynamic normalizing flows, a powerful tool to model time series, stochastic processes, and neural stochastic differential equations (SDEs). In this work, we propose a novel variant of dynamic normalizing flows, a Time Changed Normalizing Flow (TCNF), based on time deformation of a Brownian motion, which constitutes a versatile and extensive family of Gaussian processes. This approach enables us to effectively model some SDEs that cannot be modeled otherwise, including standard ones such as the well-known Ornstein-Uhlenbeck process, and generalizes prior methodologies, leading to improved results and better inference and prediction capability.
Submitted 15 January, 2024; v1 submitted 22 December, 2023;
originally announced December 2023.
-
MultiHU-TD: Multifeature Hyperspectral Unmixing Based on Tensor Decomposition
Authors:
Mohamad Jouni,
Mauro Dalla Mura,
Lucas Drumetz,
Pierre Comon
Abstract:
Hyperspectral unmixing allows representing mixed pixels as a set of pure materials weighted by their abundances. Spectral features alone are often insufficient, so it is common to rely on other features of the scene. Matrix models become insufficient when the hyperspectral image (HSI) is represented as a high-order tensor with additional features in a multimodal, multifeature framework. Tensor models such as canonical polyadic decomposition allow for this kind of unmixing but lack a general framework and interpretability of the results. In this article, we propose an interpretable methodological framework for low-rank multifeature hyperspectral unmixing based on tensor decomposition (MultiHU-TD) that incorporates the abundance sum-to-one constraint in the alternating optimization alternating direction method of multipliers (ADMM) algorithm and provide in-depth mathematical, physical, and graphical interpretation and connections with the extended linear mixing model. As additional features, we propose to incorporate mathematical morphology and reframe a previous work on neighborhood patches within MultiHU-TD. Experiments on real HSIs showcase the interpretability of the model and the analysis of the results. Python and MATLAB implementations are made available on GitHub.
Submitted 5 October, 2023;
originally announced October 2023.
-
Neural Koopman prior for data assimilation
Authors:
Anthony Frion,
Lucas Drumetz,
Mauro Dalla Mura,
Guillaume Tochon,
Abdeldjalil Aïssa El Bey
Abstract:
With the increasing availability of large-scale datasets, computational power and tools like automatic differentiation and expressive neural network architectures, sequential data are now often treated in a data-driven way, with a dynamical model trained from the observation data. While neural networks are often seen as uninterpretable black-box architectures, they can still benefit from physical priors on the data and from mathematical knowledge. In this paper, we use a neural network architecture which leverages the long-known Koopman operator theory to embed dynamical systems in latent spaces where their dynamics can be described linearly, enabling a number of appealing features. We introduce methods that enable training such a model for long-term continuous reconstruction, even in difficult contexts where the data comes in irregularly-sampled time series. The potential for self-supervised learning is also demonstrated, as we show the promising use of trained dynamical models as priors for variational data assimilation techniques, with applications to e.g. time series interpolation and forecasting.
Submitted 21 June, 2024; v1 submitted 11 September, 2023;
originally announced September 2023.
-
Learning Sentinel-2 reflectance dynamics for data-driven assimilation and forecasting
Authors:
Anthony Frion,
Lucas Drumetz,
Guillaume Tochon,
Mauro Dalla Mura,
Abdeldjalil Aïssa El Bey
Abstract:
Over the last few years, massive amounts of satellite multispectral and hyperspectral images covering the Earth's surface have been made publicly available for scientific purposes, for example through the European Copernicus project. Simultaneously, the development of self-supervised learning (SSL) methods has sparked great interest in the remote sensing community, making it possible to learn latent representations from unlabeled data to help with downstream tasks for which there are few annotated examples, such as interpolation, forecasting or unmixing. Following this line, we train a deep learning model inspired by the Koopman operator theory to model long-term reflectance dynamics in an unsupervised way. We show that this trained model, being differentiable, can be used as a prior for data assimilation in a straightforward way. Our datasets, which are composed of Sentinel-2 multispectral image time series, are publicly released with several levels of treatment.
Submitted 5 May, 2023;
originally announced May 2023.
-
Leveraging Neural Koopman Operators to Learn Continuous Representations of Dynamical Systems from Scarce Data
Authors:
Anthony Frion,
Lucas Drumetz,
Mauro Dalla Mura,
Guillaume Tochon,
Abdeldjalil Aïssa El Bey
Abstract:
Over the last few years, several works have proposed deep learning architectures to learn dynamical systems from observation data with little or no knowledge of the underlying physics. A line of work relies on learning representations where the dynamics of the underlying phenomenon can be described by a linear operator, based on the Koopman operator theory. However, despite being able to provide reliable long-term predictions for some dynamical systems in ideal situations, the methods proposed so far have limitations, such as requiring the discretization of intrinsically continuous dynamical systems, leading to data loss, especially when handling incomplete or sparsely sampled data. Here, we propose a new deep Koopman framework that represents dynamics in an intrinsically continuous way, leading to better performance on limited training data, as exemplified on several datasets arising from dynamical systems.
Submitted 13 March, 2023;
originally announced March 2023.
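As a rough illustration of what "intrinsically continuous" linear latent dynamics can mean, the sketch below evaluates a linear operator at real-valued times through its eigendecomposition. It is only a minimal sketch under stated assumptions: the encoder and decoder are omitted (the latent state is used directly), the matrix `K` is assumed diagonalizable, and the names `koopman_power` and `predict_latent` are hypothetical, not taken from the paper.

```python
import numpy as np

def koopman_power(K, t):
    """Return K**t for a real-valued time t, assuming K is diagonalizable.

    Uses K = V diag(lam) V^{-1}, so K**t = V diag(lam**t) V^{-1}
    (principal branch of the complex power).
    """
    eigvals, eigvecs = np.linalg.eig(K)
    eigvals = eigvals.astype(complex)  # allow fractional powers of negative/complex values
    Kt = eigvecs @ np.diag(eigvals ** t) @ np.linalg.inv(eigvecs)
    # For a real K the imaginary parts cancel up to round-off.
    return Kt.real

def predict_latent(z0, K, times):
    """Propagate an initial latent state z0 to arbitrary (possibly irregular) times."""
    return np.stack([koopman_power(K, t) @ z0 for t in times])

# Toy example: a rotation by 0.3 rad per unit time, queried at irregular times.
theta = 0.3
K = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
trajectory = predict_latent(np.array([1.0, 0.0]), K, [0.0, 0.5, 1.0, 2.7])
```

Because the state can be queried at any real-valued time, irregularly sampled observations can be compared with predictions without resampling them onto a fixed grid.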
-
Sliced-Wasserstein on Symmetric Positive Definite Matrices for M/EEG Signals
Authors:
Clément Bonet,
Benoît Malézieux,
Alain Rakotomamonjy,
Lucas Drumetz,
Thomas Moreau,
Matthieu Kowalski,
Nicolas Courty
Abstract:
When dealing with electro- or magnetoencephalography records, many supervised prediction tasks are solved by working with covariance matrices to summarize the signals. Learning with these matrices requires using Riemannian geometry to account for their structure. In this paper, we propose a new method to deal with distributions of covariance matrices and demonstrate its computational efficiency on M/EEG multivariate time series. More specifically, we define a Sliced-Wasserstein distance between measures of symmetric positive definite matrices that comes with strong theoretical guarantees. Then, we take advantage of its properties and kernel methods to apply this distance to brain-age prediction from MEG data and compare it to state-of-the-art algorithms based on Riemannian geometry. Finally, we show that it is an efficient surrogate to the Wasserstein distance in domain adaptation for Brain Computer Interface applications.
Submitted 24 May, 2023; v1 submitted 10 March, 2023;
originally announced March 2023.
-
Disambiguation of One-Shot Visual Classification Tasks: A Simplex-Based Approach
Authors:
Yassir Bendou,
Lucas Drumetz,
Vincent Gripon,
Giulia Lioi,
Bastien Pasdeloup
Abstract:
The field of visual few-shot classification aims at transferring the state-of-the-art performance of deep learning visual systems onto tasks where only a very limited number of training samples are available. The main solution consists in training a feature extractor using a large and diverse dataset to be applied to the considered few-shot task. Thanks to the encoded priors in the feature extractors, classification tasks with as few as one example (or "shot") for each class can be solved with high accuracy, even when the shots display individual features not representative of their classes. Yet, the problem becomes more complicated when some of the given shots display multiple objects. In this paper, we present a strategy which aims at detecting the presence of multiple and previously unseen objects in a given shot. This methodology is based on identifying the corners of a simplex in a high dimensional space. We introduce an optimization routine and showcase its ability to successfully detect multiple (previously unseen) objects in raw images. Then, we introduce a downstream classifier meant to exploit the presence of multiple objects to improve the performance of few-shot classification, in the case of extreme settings where only one shot is given for each class. Using standard benchmarks of the field, we show the ability of the proposed method to slightly, yet statistically significantly, improve accuracy in these settings.
Submitted 16 January, 2023;
originally announced January 2023.
-
Hyperbolic Sliced-Wasserstein via Geodesic and Horospherical Projections
Authors:
Clément Bonet,
Laetitia Chapel,
Lucas Drumetz,
Nicolas Courty
Abstract:
It has been shown to be beneficial for many types of data which present an underlying hierarchical structure to be embedded in hyperbolic spaces. Consequently, many tools of machine learning were extended to such spaces, but only a few discrepancies exist to compare probability distributions defined over those spaces. Among the possible candidates, optimal transport distances are well defined on such Riemannian manifolds and enjoy strong theoretical properties, but suffer from high computational cost. On Euclidean spaces, sliced-Wasserstein distances, which leverage a closed-form of the Wasserstein distance in one dimension, are more computationally efficient, but are not readily available on hyperbolic spaces. In this work, we propose to derive novel hyperbolic sliced-Wasserstein discrepancies. These constructions use projections on the underlying geodesics either along horospheres or geodesics. We study and compare them on different tasks where hyperbolic representations are relevant, such as sampling or image classification.
Submitted 26 June, 2023; v1 submitted 18 November, 2022;
originally announced November 2022.
-
Spatial Graph Signal Interpolation with an Application for Merging BCI Datasets with Various Dimensionalities
Authors:
Yassine El Ouahidi,
Lucas Drumetz,
Giulia Lioi,
Nicolas Farrugia,
Bastien Pasdeloup,
Vincent Gripon
Abstract:
BCI Motor Imagery datasets are usually small and have different electrode setups. When training a Deep Neural Network, one may want to capitalize on all these datasets to increase the amount of data available and hence obtain good generalization results. To this end, we introduce a spatial graph signal interpolation technique that allows efficient interpolation of multiple electrodes. We conduct a set of experiments with five BCI Motor Imagery datasets comparing the proposed interpolation with spherical spline interpolation. We believe that this work provides novel ideas on how to leverage graphs to interpolate electrodes and on how to homogenize multiple datasets.
Submitted 28 October, 2022;
originally announced November 2022.
-
Geometry-preserving Lie Group Integrators For Differential Equations On The Manifold Of Symmetric Positive Definite Matrices
Authors:
Lucas Drumetz,
Alexandre Reiffers-Masson,
Naoufal El Bekri,
Franck Vermet
Abstract:
In many applications, one encounters signals that lie on manifolds rather than a Euclidean space. In particular, covariance matrices are examples of ubiquitous mathematical objects that have a non-Euclidean structure. The application of Euclidean methods to integrate differential equations lying on such objects does not respect the geometry of the manifold, which can cause many numerical issues. In this paper, we propose to use Lie group methods to define geometry-preserving numerical integration schemes on the manifold of symmetric positive definite matrices. These can be applied to a number of differential equations on covariance matrices of practical interest. We show on an example that they are more stable and robust than other classical or naive integration schemes.
Submitted 15 May, 2023; v1 submitted 17 October, 2022;
originally announced October 2022.
-
Active Few-Shot Classification: a New Paradigm for Data-Scarce Learning Settings
Authors:
Aymane Abdali,
Vincent Gripon,
Lucas Drumetz,
Bartosz Boguslawski
Abstract:
We consider a novel formulation of the problem of Active Few-Shot Classification (AFSC) where the objective is to classify a small, initially unlabeled, dataset given a very restricted labeling budget. This problem can be seen as a rival paradigm to classical Transductive Few-Shot Classification (TFSC), as both these approaches are applicable in similar conditions. We first propose a methodology that combines statistical inference and an original two-tier active learning strategy that fits well into this framework. We then adapt several standard vision benchmarks from the field of TFSC. Our experiments show that the potential benefits of AFSC can be substantial, with gains in average weighted accuracy of up to 10% compared to state-of-the-art TFSC methods for the same labeling budget. We believe this new paradigm could lead to new developments and standards in data-scarce learning settings.
Submitted 23 September, 2022;
originally announced September 2022.
-
Turning Normalizing Flows into Monge Maps with Geodesic Gaussian Preserving Flows
Authors:
Guillaume Morel,
Lucas Drumetz,
Simon Benaïchouche,
Nicolas Courty,
François Rousseau
Abstract:
Normalizing Flows (NF) are powerful likelihood-based generative models that are able to trade off between expressivity and tractability to model complex densities. A now well established research avenue leverages optimal transport (OT) and looks for Monge maps, i.e. models with minimal effort between the source and target distributions. This paper introduces a method based on Brenier's polar factorization theorem to transform any trained NF into a more OT-efficient version without changing the final density. We do so by learning a rearrangement of the source (Gaussian) distribution that minimizes the OT cost between the source and the final density. We further constrain the path leading to the estimated Monge map to lie on a geodesic in the space of volume-preserving diffeomorphisms thanks to Euler's equations. The proposed method leads to smooth flows with reduced OT cost for several existing models without affecting the model performance.
Submitted 14 April, 2023; v1 submitted 22 September, 2022;
originally announced September 2022.
-
Preserving Fine-Grain Feature Information in Classification via Entropic Regularization
Authors:
Raphael Baena,
Lucas Drumetz,
Vincent Gripon
Abstract:
Labeling a classification dataset implies defining classes and associated coarse labels, which may approximate a smoother and more complicated ground truth. For example, natural images may contain multiple objects, only one of which is labeled in many vision datasets, or classes may result from the discretization of a regression problem. Using cross-entropy to train classification models on such coarse labels is likely to roughly cut through the feature space, potentially disregarding the most meaningful features, in particular losing information on the underlying fine-grain task. In this paper we are interested in the problem of solving fine-grain classification or regression, using a model trained on coarse-grain labels only. We show that standard cross-entropy can lead to overfitting to coarse-related features. We introduce an entropy-based regularization to promote more diversity in the feature space of trained models, and empirically demonstrate the efficacy of this methodology to reach better performance on the fine-grain problems. Our results are supported through theoretical developments and empirical validation.
Submitted 7 August, 2022;
originally announced August 2022.
-
Spherical Sliced-Wasserstein
Authors:
Clément Bonet,
Paul Berg,
Nicolas Courty,
François Septier,
Lucas Drumetz,
Minh-Tan Pham
Abstract:
Many variants of the Wasserstein distance have been introduced to reduce its original computational burden. In particular the Sliced-Wasserstein distance (SW), which leverages one-dimensional projections for which a closed-form solution of the Wasserstein distance is available, has received a lot of interest. Yet, it is restricted to data living in Euclidean spaces, while the Wasserstein distance has been studied and used recently on manifolds. We focus more specifically on the sphere, for which we define a novel SW discrepancy, which we call spherical Sliced-Wasserstein, making a first step towards defining SW discrepancies on manifolds. Our construction is notably based on closed-form solutions of the Wasserstein distance on the circle, together with a new spherical Radon transform. Along with efficient algorithms and the corresponding implementations, we illustrate its properties in several machine learning use cases where spherical representations of data are at stake: sampling on the sphere, density estimation on real earth data or hyperspherical auto-encoders.
Submitted 30 January, 2023; v1 submitted 17 June, 2022;
originally announced June 2022.
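For readers unfamiliar with the Euclidean Sliced-Wasserstein distance that this line of work generalizes, the following is a minimal Monte Carlo sketch, not the spherical construction of the paper. It assumes two point clouds with the same number of samples and uniform weights, in which case the one-dimensional Wasserstein distance has a closed form obtained by sorting. The function names and the default number of projections are illustrative choices.

```python
import numpy as np

def wasserstein_1d_pow(u, v, p=2):
    """W_p^p between two 1D empirical measures with equal sizes and uniform weights.

    In one dimension the optimal coupling matches sorted samples.
    """
    return np.mean(np.abs(np.sort(u) - np.sort(v)) ** p)

def sliced_wasserstein(x, y, n_projections=200, p=2, seed=0):
    """Monte Carlo estimate of the Euclidean Sliced-Wasserstein distance.

    x, y: arrays of shape (n, d) with the same n.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Directions drawn uniformly on the unit sphere of R^d.
    directions = rng.normal(size=(n_projections, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    x_proj = x @ directions.T  # shape (n, n_projections)
    y_proj = y @ directions.T
    costs = [wasserstein_1d_pow(x_proj[:, k], y_proj[:, k], p)
             for k in range(n_projections)]
    return np.mean(costs) ** (1.0 / p)

# Toy example: a Gaussian cloud compared with a shifted copy of itself.
rng = np.random.default_rng(1)
x = rng.normal(size=(50, 3))
distance = sliced_wasserstein(x, x + 1.0)
```

Each projection costs one sort, so the estimate scales as O(L n log n) for L projections. This is the computational advantage the abstracts refer to.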
-
Preventing Manifold Intrusion with Locality: Local Mixup
Authors:
Raphael Baena,
Lucas Drumetz,
Vincent Gripon
Abstract:
Mixup is a data-dependent regularization technique that consists in linearly interpolating input samples and associated outputs. It has been shown to improve accuracy when used to train on standard machine learning datasets. However, authors have pointed out that Mixup can produce out-of-distribution virtual samples and even contradictions in the augmented training set, potentially resulting in adversarial effects. In this paper, we introduce Local Mixup in which distant input samples are weighted down when computing the loss. In constrained settings we demonstrate that Local Mixup can create a trade-off between bias and variance, with the extreme cases reducing to vanilla training and classical Mixup. Using standardized computer vision benchmarks, we also show that Local Mixup can improve test accuracy.
Submitted 12 January, 2022;
originally announced January 2022.
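The interpolation and distance-based weighting described in the abstract above can be sketched as follows. This is only an illustration. The hard distance threshold in `locality_weight`, the parameter name `radius`, and the normalization in `weighted_mean_loss` are hypothetical choices for this example and are not taken from the paper, which may use a different weighting function.

```python
import math


def mixup_pair(x_i, y_i, x_j, y_j, lam):
    """Classical Mixup: convex combination of two inputs and their targets."""
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix


def locality_weight(x_i, x_j, radius):
    """Hypothetical weight: 1 for pairs closer than `radius`, 0 otherwise."""
    return 1.0 if math.dist(x_i, x_j) <= radius else 0.0


def weighted_mean_loss(losses, weights):
    """Average per-pair losses, down-weighting distant pairs."""
    total_w = sum(weights)
    if total_w == 0.0:
        return 0.0
    return sum(w * l for w, l in zip(weights, losses)) / total_w
```

With this particular weight, `radius = 0` keeps only pairs of identical inputs (no effective mixing), while `radius = math.inf` keeps every pair and recovers classical Mixup. This mirrors the two extreme cases mentioned in the abstract.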
-
Efficient Gradient Flows in Sliced-Wasserstein Space
Authors:
Clément Bonet,
Nicolas Courty,
François Septier,
Lucas Drumetz
Abstract:
Minimizing functionals in the space of probability distributions can be done with Wasserstein gradient flows. To solve them numerically, a possible approach is to rely on the Jordan-Kinderlehrer-Otto (JKO) scheme which is analogous to the proximal scheme in Euclidean spaces. However, it requires solving a nested optimization problem at each iteration, and is known for its computational challenges, especially in high dimension. To alleviate it, very recent works propose to approximate the JKO scheme leveraging Brenier's theorem, and using gradients of Input Convex Neural Networks to parameterize the density (JKO-ICNN). However, this method comes with a high computational cost and stability issues. Instead, this work proposes to use gradient flows in the space of probability measures endowed with the sliced-Wasserstein (SW) distance. We argue that this method is more flexible than JKO-ICNN, since SW enjoys a closed-form differentiable approximation. Thus, the density at each step can be parameterized by any generative model which alleviates the computational burden and makes it tractable in higher dimensions.
Submitted 15 November, 2022; v1 submitted 21 October, 2021;
originally announced October 2021.
-
Subspace Detours Meet Gromov-Wasserstein
Authors:
Clément Bonet,
Nicolas Courty,
François Septier,
Lucas Drumetz
Abstract:
In the context of optimal transport methods, the subspace detour approach was recently presented by Muzellec and Cuturi (2019). It consists in building a nearly optimal transport plan in the measures space from an optimal transport plan in a wisely chosen subspace, onto which the original measures are projected. The contribution of this paper is to extend this category of methods to the Gromov-Wasserstein problem, which is a particular type of transport distance involving the inner geometry of the compared distributions. After deriving the associated formalism and properties, we also discuss a specific cost for which we can show connections with the Knothe-Rosenblatt rearrangement. We finally give an experimental illustration on a shape matching problem.
Submitted 21 October, 2021;
originally announced October 2021.
-
Graphs as Tools to Improve Deep Learning Methods
Authors:
Carlos Lassance,
Myriam Bontonou,
Mounia Hamidouche,
Bastien Pasdeloup,
Lucas Drumetz,
Vincent Gripon
Abstract:
In recent years, deep neural networks (DNNs) have seen a significant rise in popularity. However, although they are state-of-the-art in many machine learning challenges, they still suffer from several limitations. For example, DNNs require a lot of training data, which might not be available in some practical applications. In addition, when small perturbations are added to the inputs, DNNs are prone to misclassification errors. DNNs are also viewed as black boxes, and as such their decisions are often criticized for their lack of interpretability.
In this chapter, we review recent works that aim at using graphs as tools to improve deep learning methods. These graphs are defined considering a specific layer in a deep learning architecture. Their vertices represent distinct samples, and their edges depend on the similarity of the corresponding intermediate representations. These graphs can then be leveraged using various methodologies, many of which are built on top of graph signal processing.
This chapter is composed of four main parts: tools for visualizing intermediate layers in a DNN, denoising data representations, optimizing graph objective functions and regularizing the learning process.
Submitted 8 October, 2021;
originally announced October 2021.
-
Learning stochastic dynamical systems with neural networks mimicking the Euler-Maruyama scheme
Authors:
Noura Dridi,
Lucas Drumetz,
Ronan Fablet
Abstract:
Stochastic differential equations (SDEs) are one of the most important representations of dynamical systems. They are notable for the ability to include a deterministic component of the system and a stochastic one to represent random unknown factors. However, this makes learning SDEs much more challenging than ordinary differential equations (ODEs). In this paper, we propose a data-driven approach where parameters of the SDE are represented by a neural network with a built-in SDE integration scheme. The loss function is based on a maximum likelihood criterion, under order-one Markov Gaussian assumptions. The algorithm is applied to the geometric Brownian motion and a stochastic version of the Lorenz-63 model. The latter is particularly hard to handle due to the presence of a stochastic component that depends on the state. The algorithm performance is attested using different simulation results. Besides, comparisons are performed with the reference gradient matching method used for nonlinear drift estimation, and a neural network-based method that does not consider the stochastic term.
Submitted 18 May, 2021;
originally announced May 2021.
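For readers unfamiliar with the integration scheme named in the title above, here is a minimal sketch of the Euler-Maruyama update applied to geometric Brownian motion. It uses known closed-form drift and diffusion functions rather than the learned neural networks of the paper. The function names and the scalar-state restriction are assumptions made for this example.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, dt, n_steps, seed=0):
    """Simulate dX = drift(X) dt + diffusion(X) dW for a scalar state."""
    rng = random.Random(seed)
    x = x0
    path = [x0]
    for _ in range(n_steps):
        # Brownian increment over a step of length dt: N(0, dt).
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


def gbm_path(mu, sigma, x0, dt, n_steps, seed=0):
    """Geometric Brownian motion: dX = mu X dt + sigma X dW."""
    return euler_maruyama(lambda x: mu * x, lambda x: sigma * x,
                          x0, dt, n_steps, seed)
```

Setting `sigma = 0` removes the noise term and reduces the update to the explicit Euler scheme for the ODE dX = mu X dt, which gives a quick sanity check. Under this scheme each transition is Gaussian given the previous state, which is consistent with the Markov Gaussian likelihood mentioned in the abstract.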
-
Inferring Graph Signal Translations as Invariant Transformations for Classification Tasks
Authors:
Raphael Baena,
Lucas Drumetz,
Vincent Gripon
Abstract:
The field of Graph Signal Processing (GSP) has proposed tools to generalize harmonic analysis to complex domains represented through graphs. Among these tools are translations, which are required to define many others. Most works propose to define translations using solely the graph structure (i.e. edges). Such a problem is ill-posed in general as a graph conveys information about neighborhood but not about directions. In this paper, we propose to infer translations as edge-constrained operations that make a supervised classification problem invariant using a deep learning framework. As such, our methodology uses both the graph structure and labeled signals to infer translations. We perform experiments with regular 2D images and abstract hyperlink networks to show the effectiveness of the proposed methodology in inferring meaningful translations for signals supported on graphs.
Submitted 18 February, 2021;
originally announced February 2021.
-
Improving Classification Accuracy with Graph Filtering
Authors:
Mounia Hamidouche,
Carlos Lassance,
Yuqing Hu,
Lucas Drumetz,
Bastien Pasdeloup,
Vincent Gripon
Abstract:
In machine learning, classifiers are typically susceptible to noise in the training data. In this work, we aim at reducing intra-class noise with the help of graph filtering to improve the classification performance. Considered graphs are obtained by connecting samples of the training set that belong to the same class depending on the similarity of their representation in a latent space. We show that the proposed graph filtering methodology has the effect of asymptotically reducing intra-class variance, while maintaining the mean. While our approach applies to all classification problems in general, it is particularly useful in few-shot settings, where intra-class noise can have a huge impact due to the small sample selection. Using standardized benchmarks in the field of vision, we empirically demonstrate the ability of the proposed method to slightly improve state-of-the-art results in both cases of few-shot and standard classification.
Submitted 25 January, 2021; v1 submitted 12 January, 2021;
originally announced January 2021.
-
Learning Sentinel-2 Spectral Dynamics for Long-Run Predictions Using Residual Neural Networks
Authors:
Joaquim Estopinan,
Guillaume Tochon,
Lucas Drumetz
Abstract:
Making the most of multispectral image time-series is a promising but still relatively under-explored research direction because of the complexity of jointly analyzing spatial, spectral and temporal information. Capturing and characterizing temporal dynamics is one of the important and challenging issues. Our new method paves the way to capture real data dynamics and should eventually benefit applications like unmixing or classification. Dealing with time-series dynamics classically requires the knowledge of a dynamical model and an observation model. The former may be incorrect or computationally hard to handle, thus motivating data-driven strategies aiming at learning dynamics directly from data. In this paper, we adapt neural network architectures to learn periodic dynamics of both simulated and real multispectral time-series. We emphasize the necessity of choosing the right state variable to capture periodic dynamics and show that our models can reproduce the average seasonal dynamics of vegetation using only one year of training data.
Submitted 19 March, 2021; v1 submitted 17 November, 2020;
originally announced November 2020.
-
Variational Deep Learning for the Identification and Reconstruction of Chaotic and Stochastic Dynamical Systems from Noisy and Partial Observations
Authors:
Duong Nguyen,
Said Ouala,
Lucas Drumetz,
Ronan Fablet
Abstract:
The data-driven recovery of the unknown governing equations of dynamical systems has recently received increasing interest. However, the identification of governing equations remains challenging when dealing with noisy and partial observations. Here, we address this challenge and investigate variational deep learning schemes. Within the proposed framework, we jointly learn an inference model to reconstruct the true states of the system and the governing laws of these states from series of noisy and partial data. In doing so, this framework bridges classical data assimilation and state-of-the-art machine learning techniques. We also demonstrate that it generalises state-of-the-art methods. Importantly, both the inference model and the governing model embed stochastic components to account for stochastic variabilities, model errors, and reconstruction uncertainties. Various experiments on chaotic and stochastic dynamical systems support the relevance of our scheme w.r.t. state-of-the-art approaches.
Submitted 16 February, 2021; v1 submitted 4 September, 2020;
originally announced September 2020.
-
Learning Variational Data Assimilation Models and Solvers
Authors:
Ronan Fablet,
Bertrand Chapron,
Lucas Drumetz,
Etienne Memin,
Olivier Pannekoucke,
Francois Rousseau
Abstract:
This paper addresses variational data assimilation from a learning point of view. Data assimilation aims to reconstruct the time evolution of some state given a series of observations, possibly noisy and irregularly-sampled. Using automatic differentiation tools embedded in deep learning frameworks, we introduce end-to-end neural network architectures for data assimilation. They comprise two key components: a variational model and a gradient-based solver, both implemented as neural networks. A key feature of the proposed end-to-end learning architecture is that we may train the NN models using both supervised and unsupervised strategies. Our numerical experiments on Lorenz-63 and Lorenz-96 systems report a significant gain w.r.t. a classic gradient-based minimization of the variational cost, both in terms of reconstruction performance and optimization complexity. Intriguingly, we also show that the variational models issued from the true Lorenz-63 and Lorenz-96 ODE representations may not lead to the best reconstruction performance. We believe these results may open new research avenues for the specification of assimilation models in geoscience.
Submitted 25 July, 2020;
originally announced July 2020.
-
Joint learning of variational representations and solvers for inverse problems with partially-observed data
Authors:
Ronan Fablet,
Lucas Drumetz,
Francois Rousseau
Abstract:
Designing appropriate variational regularization schemes is a crucial part of solving inverse problems, making them better-posed and guaranteeing that the solution of the associated optimization problem satisfies desirable properties. Recently, learning-based strategies have appeared to be very efficient for solving inverse problems, by learning direct inversion schemes or plug-and-play regularizers from available pairs of true states and observations. In this paper, we go a step further and design an end-to-end framework that makes it possible to learn actual variational frameworks for inverse problems in such a supervised setting. The variational cost and the gradient-based solver are both stated as neural networks, using automatic differentiation for the latter. We can jointly learn both components to minimize the data reconstruction error on the true states. This leads to a data-driven discovery of variational models. We consider an application to inverse problems with incomplete datasets (image inpainting and multivariate time series interpolation). We experimentally illustrate that this framework can lead to a significant gain in terms of reconstruction performance, including w.r.t. the direct minimization of the variational formulation derived from the known generative model.
Submitted 5 June, 2020;
originally announced June 2020.
-
Filtering Internal Tides From Wide-Swath Altimeter Data Using Convolutional Neural Networks
Authors:
Redouane Lguensat,
Ronan Fablet,
Julien Le Sommer,
Sammy Metref,
Emmanuel Cosme,
Kaouther Ouenniche,
Lucas Drumetz,
Jonathan Gula
Abstract:
The upcoming Surface Water Ocean Topography (SWOT) satellite altimetry mission is expected to yield two-dimensional high-resolution measurements of Sea Surface Height (SSH), thus allowing for a better characterization of the mesoscale and submesoscale eddy field. However, to fulfill the promises of this mission, filtering the tidal component of the SSH measurements is necessary. This challenging problem is crucial since the posterior studies done by physical oceanographers using SWOT data will depend heavily on the selected filtering schemes. In this paper, we cast this problem into a supervised learning framework and propose the use of convolutional neural networks (ConvNets) to estimate fields free of internal tide signals. Numerical experiments based on an advanced North Atlantic simulation of the ocean circulation (eNATL60) show that our ConvNet considerably reduces the imprint of the internal waves in SSH data even in regions unseen by the neural network. We also investigate the relevance of considering additional data from other sea surface variables such as sea surface temperature (SST).
Submitted 3 May, 2020;
originally announced May 2020.
-
Spectral Variability in Hyperspectral Data Unmixing: A Comprehensive Review
Authors:
Ricardo Augusto Borsoi,
Tales Imbiriba,
José Carlos Moreira Bermudez,
Cédric Richard,
Jocelyn Chanussot,
Lucas Drumetz,
Jean-Yves Tourneret,
Alina Zare,
Christian Jutten
Abstract:
The spectral signatures of the materials contained in hyperspectral images, also called endmembers (EM), can be significantly affected by variations in atmospheric, illumination or environmental conditions typically occurring within an image. Traditional spectral unmixing (SU) algorithms neglect the spectral variability of the endmembers, which propagates significant mismodeling errors throughout the whole unmixing process and compromises the quality of its results. Therefore, large efforts have been recently dedicated to mitigating the effects of spectral variability in SU. This resulted in the development of algorithms that incorporate different strategies to allow the EMs to vary within a hyperspectral image, using, for instance, sets of spectral signatures known a priori, Bayesian, parametric, or local EM models. Each of these approaches has different characteristics and underlying motivations. This paper presents a comprehensive literature review contextualizing both classic and recent approaches to solve this problem. We give a detailed evaluation of the sources of spectral variability and their effect on image spectra. Furthermore, we propose a new taxonomy that organizes existing works according to a practitioner's point of view, based on the necessary amount of supervision and on the computational cost they require. We also review methods used to construct spectral libraries (which are required by many SU techniques) based on the observed hyperspectral image, as well as algorithms for library augmentation and reduction. Finally, we conclude the paper with some discussions and an outline of possible future directions for the field.
Submitted 6 April, 2021; v1 submitted 20 January, 2020;
originally announced January 2020.
-
Learning Endmember Dynamics in Multitemporal Hyperspectral Data Using a State-Space Model Formulation
Authors:
Lucas Drumetz,
Mauro Dalla Mura,
Guillaume Tochon,
Ronan Fablet
Abstract:
Hyperspectral image unmixing is an inverse problem aiming at recovering the spectral signatures of pure materials of interest (called endmembers) and estimating their proportions (called abundances) in every pixel of the image. However, in spite of a tremendous applicative potential and the advent of new satellite sensors with high temporal resolution, multitemporal hyperspectral unmixing is still a relatively underexplored research avenue in the community, compared to standard image unmixing. In this paper, we propose a new framework for multitemporal unmixing and endmember extraction based on a state-space model, and present a proof of concept on simulated data to show how this representation can be used to inform multitemporal unmixing with external prior knowledge, or on the contrary to learn the dynamics of the quantities involved from data using neural network architectures adapted to the identification of dynamical systems.
Submitted 27 November, 2019;
originally announced November 2019.
-
End-to-end learning of energy-based representations for irregularly-sampled signals and images
Authors:
Ronan Fablet,
Lucas Drumetz,
François Rousseau
Abstract:
For numerous domains, including, for instance, earth observation, medical imaging and astrophysics, available image and signal datasets often involve irregular space-time sampling patterns and large missing data rates. These sampling properties may be critical to apply state-of-the-art learning-based approaches (e.g., auto-encoders, CNNs), fully benefit from the available large-scale observations and reach breakthroughs in the reconstruction and identification of processes of interest. In this paper, we address the end-to-end learning of representations of signals, images and image sequences from irregularly-sampled data, i.e. when the training data involve missing data. From an analogy to Bayesian formulation, we consider energy-based representations. Two energy forms are investigated: one derived from auto-encoders and one relating to Gibbs priors. The learning stage of these energy-based representations (or priors) involves a joint interpolation issue, which amounts to solving an energy minimization problem under observation constraints. Using a neural-network-based implementation of the considered energy forms, we can state an end-to-end learning scheme from irregularly-sampled data. We demonstrate the relevance of the proposed representations for different case-studies: namely, multivariate time series, 2D images and image sequences.
Submitted 1 October, 2019;
originally announced October 2019.
-
Learning Latent Dynamics for Partially-Observed Chaotic Systems
Authors:
Said Ouala,
Duong Nguyen,
Lucas Drumetz,
Bertrand Chapron,
Ananda Pascual,
Fabrice Collard,
Lucile Gaultier,
Ronan Fablet
Abstract:
This paper addresses the data-driven identification of latent dynamical representations of partially-observed systems, i.e., dynamical systems for which some components are never observed, with an emphasis on forecasting applications, including long-term asymptotic patterns. Whereas state-of-the-art data-driven approaches rely on delay embeddings and linear decompositions of the underlying operators, we introduce a framework based on the data-driven identification of an augmented state-space model using a neural-network-based representation. For a given training dataset, it amounts to jointly learning an ODE (Ordinary Differential Equation) representation in the latent space and reconstructing latent states. Through numerical experiments, we demonstrate the relevance of the proposed framework w.r.t. state-of-the-art approaches in terms of short-term forecasting performance and long-term behaviour. We further discuss how the proposed framework relates to Koopman operator theory and Takens' embedding theorem.
Submitted 4 July, 2019;
originally announced July 2019.
-
Spectral Variability Aware Blind Hyperspectral Image Unmixing Based on Convex Geometry
Authors:
Lucas Drumetz,
Jocelyn Chanussot,
Christian Jutten,
Wing-Kin Ma,
Akira Iwasaki
Abstract:
Hyperspectral image unmixing has proven to be a useful technique to interpret hyperspectral data, and is a prolific research topic in the community. Most of the approaches used to perform linear unmixing are based on convex geometry concepts, because of the strong geometrical structure of the linear mixing model. However, two main phenomena lead to question this model, namely nonlinearities and the spectral variability of the materials. Many algorithms based on convex geometry are still used when considering these two limitations of the linear model. A natural question is to wonder to what extent these concepts and tools (Intrinsic Dimensionality estimation, endmember extraction algorithms, pixel purity) can be safely used in these different scenarios. In this paper, we analyze them with a focus on endmember variability, assuming that the linear model holds. In the light of this analysis, we propose an integrated unmixing chain which tries to address the shortcomings of the classical tools used in the linear case, based on our previously proposed extended linear mixing model. We show the interest of the proposed approach on simulated and real datasets.
Submitted 8 April, 2019;
originally announced April 2019.
-
Spectral Unmixing: A Derivation of the Extended Linear Mixing Model from the Hapke Model
Authors:
Lucas Drumetz,
Jocelyn Chanussot,
Christian Jutten
Abstract:
In hyperspectral imaging, spectral unmixing aims at decomposing the image into a set of reference spectral signatures corresponding to the materials present in the observed scene and their relative proportions in every pixel. While a linear mixing model was used for a long time, the complex nature of the physical mixing processes has shifted the community's attention towards nonlinear models and algorithms accounting for the variability of the endmembers. Such intra-class variations are due to local changes in the physico-chemical composition of the materials, and to illumination changes. In the physical remote sensing community, a popular model accounting for illumination variability is the radiative transfer model proposed by Hapke. It is however too complex to be directly used in hyperspectral unmixing in a tractable way. Instead, the Extended Linear Mixing Model (ELMM) allows hyperspectral data to be unmixed easily while accounting for changing illumination conditions. In this letter, we show that the ELMM can be obtained from the Hapke model by successive simplifying physical assumptions, thus theoretically confirming its relevance to handle illumination-induced variability in the unmixing problem.
Submitted 24 July, 2019; v1 submitted 28 March, 2019;
originally announced March 2019.
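The abstract above does not spell out the mixing equations. As a rough illustration, the sketch below contrasts the linear mixing model with the ELMM in the form commonly stated in the unmixing literature, where each endmember spectrum is multiplied by a positive, pixel-dependent scaling factor before mixing. The function names, the list-based data layout, and the noise-free setting are assumptions made for this example.

```python
def linear_mixing(endmembers, abundances):
    """Linear mixing model for one pixel: x = sum_p a_p * s_p.

    `endmembers` is a list of spectra (one list of band values per
    material) and `abundances` holds one proportion per material.
    """
    n_bands = len(endmembers[0])
    return [sum(a * s[b] for a, s in zip(abundances, endmembers))
            for b in range(n_bands)]


def extended_linear_mixing(endmembers, abundances, scalings):
    """ELMM-style pixel: x = sum_p a_p * psi_p * s_p.

    `scalings` holds one positive factor psi_p per material, modelling
    a local brightening or darkening of that endmember.
    """
    scaled = [[psi * v for v in s] for psi, s in zip(scalings, endmembers)]
    return linear_mixing(scaled, abundances)
```

When every scaling factor equals 1, the second function reduces to the linear mixing model.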
-
EM-like Learning Chaotic Dynamics from Noisy and Partial Observations
Authors:
Duong Nguyen,
Said Ouala,
Lucas Drumetz,
Ronan Fablet
Abstract:
The identification of the governing equations of chaotic dynamical systems from data has recently emerged as a hot topic. While the seminal work by Brunton et al. reported proofs of concept for idealized observation settings for fully-observed systems, i.e., large signal-to-noise ratios and high-frequency sampling of all system variables, we here address the learning of data-driven representations of chaotic dynamics for partially-observed systems, including significant noise patterns and possibly lower and irregular sampling settings. Instead of considering training losses based on short-term prediction error like state-of-the-art learning-based schemes, we adopt a Bayesian formulation and state this issue as a data assimilation problem with unknown model parameters. To solve for the joint inference of the hidden dynamics and of model parameters, we combine neural-network representations and state-of-the-art assimilation schemes. Using iterative Expectation-Maximization (EM)-like procedures, the key feature of the proposed inference schemes is the derivation of the posterior of the hidden dynamics. Using a neural-network-based Ordinary Differential Equation (ODE) representation of these dynamics, we investigate two strategies: their combination with Ensemble Kalman Smoothers and with Long Short-Term Memory (LSTM)-based variational approximations of the posterior. Through numerical experiments on the Lorenz-63 system with different noise and time sampling settings, we demonstrate the ability of the proposed schemes to recover and reproduce the hidden chaotic dynamics, including their Lyapunov characteristic exponents, when classic machine learning approaches fail.
Submitted 25 March, 2019;
originally announced March 2019.
-
Hyperspectral Image Unmixing with Endmember Bundles and Group Sparsity Inducing Mixed Norms
Authors:
Lucas Drumetz,
Travis R. Meyer,
Jocelyn Chanussot,
Andrea L. Bertozzi,
Christian Jutten
Abstract:
Hyperspectral images provide much more information than conventional imaging techniques, allowing a precise identification of the materials in the observed scene, but because of the limited spatial resolution, the observations are usually mixtures of the contributions of several materials. The spectral unmixing problem aims at recovering the spectra of the pure materials of the scene (endmembers), along with their proportions (abundances) in each pixel. In order to deal with the intra-class variability of the materials and the induced spectral variability of the endmembers, several spectra per material, constituting endmember bundles, can be considered. However, the usual abundance estimation techniques do not take advantage of the particular structure of these bundles, which are organized into groups of spectra. In this paper, we propose to use group sparsity by introducing mixed norms in the abundance estimation optimization problem. In particular, we propose a new penalty which simultaneously enforces group and within-group sparsity, at the cost of being nonconvex. All the proposed penalties are compatible with the abundance sum-to-one constraint, which is not the case with traditional sparse regression. We show on simulated and real datasets that well-chosen penalties can significantly improve the unmixing performance compared to the naive bundle approach.
Submitted 28 March, 2019; v1 submitted 25 May, 2018;
originally announced May 2018.