-
Tree-based variational inference for Poisson log-normal models
Authors:
Alexandre Chaussard,
Anna Bonnet,
Elisabeth Gassiat,
Sylvain Le Corff
Abstract:
When studying ecosystems, hierarchical trees are often used to organize entities based on proximity criteria, such as the taxonomy in microbiology, social classes in geography, or product types in retail businesses, offering valuable insights into entity relationships. Despite their significance, current count-data models do not leverage this structured information. In particular, the widely used Poisson log-normal (PLN) model, known for its ability to model interactions between entities from count data, lacks the possibility to incorporate such hierarchical tree structures, limiting its applicability in domains characterized by such complexities. To address this matter, we introduce the PLN-Tree model as an extension of the PLN model, specifically designed for modeling hierarchical count data. By integrating structured variational inference techniques, we propose an adapted training procedure and establish identifiability results, enhancing both theoretical foundations and practical interpretability. Additionally, we extend our framework to classification tasks as a preprocessing pipeline, showcasing its versatility. Experimental evaluations on synthetic datasets as well as real-world microbiome data demonstrate the superior performance of the PLN-Tree model in capturing hierarchical dependencies and providing valuable insights into complex data structures, showing the practical interest of knowledge graphs like the taxonomy in ecosystems modeling.
Submitted 25 June, 2024;
originally announced June 2024.
-
Variational quantization for state space models
Authors:
Etienne David,
Jean Bellot,
Sylvain Le Corff
Abstract:
Forecasting tasks using large datasets gathering thousands of heterogeneous time series is a crucial statistical problem in numerous sectors. The main challenge is to model a rich variety of time series, leverage any available external signals and provide sharp predictions with statistical guarantees. In this work, we propose a new forecasting model that combines discrete state space hidden Markov models with recent neural network architectures and training procedures inspired by vector quantized variational autoencoders. We introduce a variational discrete posterior distribution of the latent states given the observations and a two-stage training procedure to alternately train the parameters of the latent states and of the emission distributions. By learning a collection of emission laws and temporarily activating them depending on the hidden process dynamics, the proposed method makes it possible to explore large datasets and leverage available external signals. We assess the performance of the proposed method using several datasets and show that it outperforms other state-of-the-art solutions.
Submitted 17 April, 2024;
originally announced April 2024.
-
Diffusion posterior sampling for simulation-based inference in tall data settings
Authors:
Julia Linhart,
Gabriel Victorino Cardoso,
Alexandre Gramfort,
Sylvain Le Corff,
Pedro L. C. Rodrigues
Abstract:
Determining which parameters of a non-linear model best describe a set of experimental data is a fundamental problem in science and it has gained much traction lately with the rise of complex large-scale simulators. The likelihood of such models is typically intractable, which is why classical MCMC methods cannot be used. Simulation-based inference (SBI) stands out in this context by only requiring a dataset of simulations to train deep generative models capable of approximating the posterior distribution that relates input parameters to a given observation. In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model. The proposed method is built upon recent developments from the flourishing score-based diffusion literature and makes it possible to estimate the tall data posterior distribution, while simply using information from a score network trained for a single context observation. We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
Submitted 7 June, 2024; v1 submitted 11 April, 2024;
originally announced April 2024.
-
An analysis of the noise schedule for score-based generative models
Authors:
Stanislas Strasman,
Antonio Ocello,
Claire Boyer,
Sylvain Le Corff,
Vincent Lemaire
Abstract:
Score-based generative models (SGMs) aim at estimating a target data distribution by learning score functions using only noise-perturbed samples from the target. Recent literature has focused extensively on assessing the error between the target and estimated distributions, gauging the generative quality through the Kullback-Leibler (KL) divergence and Wasserstein distances. Under mild assumptions on the data distribution, we establish an upper bound for the KL divergence between the target and the estimated distributions, explicitly depending on any time-dependent noise schedule. Under additional regularity assumptions, taking advantage of favorable underlying contraction mechanisms, we provide a tighter error bound in Wasserstein distance compared to state-of-the-art results. In addition to being tractable, this upper bound jointly incorporates properties of the target distribution and SGM hyperparameters that need to be tuned during training.
Submitted 24 May, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Importance sampling for online variational learning
Authors:
Mathis Chagneux,
Pierre Gloaguen,
Sylvain Le Corff,
Jimmy Olsson
Abstract:
This article addresses online variational estimation in state-space models. We focus on learning the smoothing distribution, i.e. the joint distribution of the latent states given the observations, using a variational approach together with Monte Carlo importance sampling. We propose an efficient algorithm for computing the gradient of the evidence lower bound (ELBO) in the context of streaming data, where observations arrive sequentially. Our contributions include a computationally efficient online ELBO estimator, demonstrated performance in offline and true online settings, and adaptability for computing general expectations under joint smoothing distributions.
Submitted 5 February, 2024;
originally announced February 2024.
-
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
Authors:
Sobihan Surendran,
Antoine Godichon-Baggioni,
Adeline Fermanian,
Sylvain Le Corff
Abstract:
Stochastic Gradient Descent (SGD) with adaptive steps is now widely used for training deep neural networks. Most theoretical results assume access to unbiased gradient estimators, which is not the case in several recent deep learning and reinforcement learning applications that use Monte Carlo methods. This paper provides a comprehensive non-asymptotic analysis of SGD with biased gradients and adaptive steps for convex and non-convex smooth functions. Our study incorporates time-dependent bias and emphasizes the importance of controlling the bias and Mean Squared Error (MSE) of the gradient estimator. In particular, we establish that Adagrad and RMSProp with biased gradients converge to critical points for smooth non-convex functions at a rate similar to existing results in the literature for the unbiased case. Finally, we provide experimental results using Variational Autoencoders (VAE) that illustrate our convergence results and show how the effect of bias can be reduced by appropriate hyperparameter tuning.
Submitted 5 February, 2024;
originally announced February 2024.
-
Variational excess risk bound for general state space models
Authors:
Élisabeth Gassiat,
Sylvain Le Corff
Abstract:
In this paper, we consider variational autoencoders (VAE) for general state space models. We consider a backward factorization of the variational distributions to analyze the excess risk associated with VAE. Such backward factorizations were recently proposed to perform online variational learning and to obtain upper bounds on the variational estimation error. When independent trajectories of sequences are observed and under strong mixing assumptions on the state space model and on the variational distribution, we provide an oracle inequality explicit in the number of samples and in the length of the observation sequences. We then derive consequences of this theoretical result. In particular, when the data distribution is given by a state space model, we provide an upper bound for the Kullback-Leibler divergence between the data distribution and its estimator and between the variational posterior and the estimated state space posterior distributions. Under classical assumptions, we prove that our results can be applied to Gaussian backward kernels built with dense and recurrent neural networks.
Submitted 15 December, 2023;
originally announced December 2023.
-
Monte Carlo guided Diffusion for Bayesian linear inverse problems
Authors:
Gabriel Cardoso,
Yazid Janati El Idrissi,
Sylvain Le Corff,
Eric Moulines
Abstract:
Ill-posed linear inverse problems arise frequently in various applications, from computational photography to medical imaging. A recent line of research exploits Bayesian inference with informative priors to handle the ill-posedness of such problems. Amongst such priors, score-based generative models (SGM) have recently been successfully applied to several different inverse problems. In this study, we exploit the particular structure of the prior defined by the SGM to define a sequence of intermediate linear inverse problems. As the noise level decreases, the posteriors of these inverse problems get closer to the target posterior of the original inverse problem. To sample from this sequence of posteriors, we propose the use of Sequential Monte Carlo (SMC) methods. The proposed algorithm, MCGDiff, is shown to be theoretically grounded and we provide numerical simulations showing that it outperforms competing baselines when dealing with ill-posed inverse problems in a Bayesian setting.
Submitted 25 October, 2023; v1 submitted 15 August, 2023;
originally announced August 2023.
-
Last layer state space model for representation learning and uncertainty quantification
Authors:
Max Cohen,
Maurice Charbit,
Sylvain Le Corff
Abstract:
As sequential neural architectures become deeper and more complex, uncertainty estimation is more and more challenging. Efforts in quantifying uncertainty often rely on specific training procedures, and bear additional computational costs due to the dimensionality of such models. In this paper, we propose to decompose a classification or regression task in two steps: a representation learning stage to learn low-dimensional states, and a state space model for uncertainty estimation. This approach allows to separate representation learning and design of generative models. We demonstrate how predictive distributions can be estimated on top of an existing and trained neural network, by adding a state space-based last layer whose parameters are estimated with Sequential Monte Carlo methods. We apply our proposed methodology to the hourly estimation of Electricity Transformer Oil temperature, a publicly benchmarked dataset. Our model accounts for the noisy data structure, due to unknown or unavailable variables, and is able to provide confidence intervals on predictions.
Submitted 4 July, 2023;
originally announced July 2023.
-
Variational latent discrete representation for time series modelling
Authors:
Max Cohen,
Maurice Charbit,
Sylvain Le Corff
Abstract:
Discrete latent space models have recently achieved performance on par with their continuous counterparts in deep variational inference. While they still face various implementation challenges, these models offer the opportunity for a better interpretation of latent spaces, as well as a more direct representation of naturally discrete phenomena. Most recent approaches propose to train separately very high-dimensional prior models on the discrete latent data which is a challenging task on its own. In this paper, we introduce a latent data model where the discrete state is a Markov chain, which allows fast end-to-end training. The performance of our generative model is assessed on a building management dataset and on the publicly available Electricity Transformer Dataset.
Submitted 16 August, 2023; v1 submitted 27 June, 2023;
originally announced June 2023.
-
Asymptotic convergence of iterative optimization algorithms
Authors:
Randal Douc,
Sylvain Le Corff
Abstract:
This paper introduces a general framework for iterative optimization algorithms and establishes under general assumptions that their convergence is asymptotically geometric. We also prove that under appropriate assumptions, the rate of convergence can be lower bounded. The convergence is then only geometric, and we provide the exact asymptotic convergence rate. This framework allows to deal with constrained optimization and encompasses the Expectation Maximization algorithm and the mirror descent algorithm, as well as some variants such as the alpha-Expectation Maximization or the Mirror Prox algorithm. Furthermore, we establish sufficient conditions for the convergence of the Mirror Prox algorithm, under which the method converges systematically to the unique minimizer of a convex function on a convex compact set.
Submitted 24 February, 2023;
originally announced February 2023.
-
State and parameter learning with PaRIS particle Gibbs
Authors:
Gabriel Cardoso,
Yazid Janati El Idrissi,
Sylvain Le Corff,
Eric Moulines,
Jimmy Olsson
Abstract:
Non-linear state-space models, also known as general hidden Markov models, are ubiquitous in statistical machine learning, being the most classical generative models for serial data and sequences in general. The particle-based, rapid incremental smoother PaRIS is a sequential Monte Carlo (SMC) technique allowing for efficient online approximation of expectations of additive functionals under the smoothing distribution in these models. Such expectations appear naturally in several learning contexts, such as likelihood estimation (MLE) and Markov score climbing (MSC). PaRIS has linear computational complexity, limited memory requirements and comes with non-asymptotic bounds, convergence results and stability guarantees. Still, being based on self-normalised importance sampling, the PaRIS estimator is biased. Our first contribution is to design a novel additive smoothing algorithm, the Parisian particle Gibbs (PPG) sampler, which can be viewed as a PaRIS algorithm driven by conditional SMC moves, resulting in bias-reduced estimates of the targeted quantities. We substantiate the PPG algorithm with theoretical results, including new bounds on bias and variance as well as deviation inequalities. Our second contribution is to apply PPG in a learning framework, covering MLE and MSC as special examples. In this context, we establish, under standard assumptions, non-asymptotic bounds highlighting the value of bias reduction and the implicit Rao-Blackwellization of PPG. These are the first non-asymptotic results of this kind in this setting. We illustrate our theoretical results with numerical experiments supporting our claims.
Submitted 2 January, 2023;
originally announced January 2023.
-
Amortized backward variational inference in nonlinear state-space models
Authors:
Mathis Chagneux,
Élisabeth Gassiat,
Pierre Gloaguen,
Sylvain Le Corff
Abstract:
We consider the problem of state estimation in general state-space models using variational inference. For a generic variational family defined using the same backward decomposition as the actual joint smoothing distribution, we establish for the first time that, under mixing assumptions, the variational approximation of expectations of additive state functionals induces an error which grows at most linearly in the number of observations. This guarantee is consistent with the known upper bounds for the approximation of smoothing distributions using standard Monte Carlo methods. Moreover, we propose an amortized inference framework where a neural network shared over all time steps outputs the parameters of the variational kernels. We also study empirically parametrizations which allow analytical marginalization of the variational distributions, and therefore lead to efficient smoothing algorithms. Significant improvements are made over state-of-the-art variational solutions, especially when the generative model depends on a strongly nonlinear and noninjective mixing function.
Submitted 1 June, 2022;
originally announced June 2022.
-
Variance estimation for Sequential Monte Carlo Algorithms: a backward sampling approach
Authors:
Yazid Janati El idrissi,
Sylvain Le Corff,
Yohan Petetin
Abstract:
In this paper, we consider the problem of online asymptotic variance estimation for particle filtering and smoothing. Current solutions for the particle filter rely on the particle genealogy and are either unstable or hard to tune in practice. We propose to mitigate these limitations by introducing a new estimator of the asymptotic variance based on the so-called backward weights. The resulting estimator is weakly consistent and trades computational cost for more stability and reduced variance. We also propose a more computationally efficient estimator inspired by the PaRIS algorithm of Olsson & Westerborn. As an application, particle smoothing is considered and an estimator of the asymptotic variance of the Forward Filtering Backward Smoothing estimator applied to additive functionals is provided.
Submitted 2 January, 2023; v1 submitted 4 April, 2022;
originally announced April 2022.
-
Diffusion bridges vector quantized Variational AutoEncoders
Authors:
Max Cohen,
Guillaume Quispe,
Sylvain Le Corff,
Charles Ollion,
Eric Moulines
Abstract:
Vector Quantized-Variational AutoEncoders (VQ-VAE) are generative models based on discrete latent representations of the data, where inputs are mapped to a finite set of learned embeddings. To generate new samples, an autoregressive prior distribution over the discrete states must be trained separately. This prior is generally very complex and leads to slow generation. In this work, we propose a new model to train the prior and the encoder/decoder networks simultaneously. We build a diffusion bridge between a continuous coded vector and a non-informative prior distribution. The latent discrete states are then given as random functions of these continuous vectors. We show that our model is competitive with the autoregressive prior on the mini-Imagenet and CIFAR datasets and is efficient in both optimization and sampling. Our framework also extends the standard VQ-VAE and enables end-to-end training.
Submitted 3 August, 2022; v1 submitted 10 February, 2022;
originally announced February 2022.
-
HERMES: Hybrid Error-corrector Model with inclusion of External Signals for nonstationary fashion time series
Authors:
Etienne David,
Jean Bellot,
Sylvain Le Corff
Abstract:
Developing models and algorithms to predict nonstationary time series is a long standing statistical problem. It is crucial for many applications, in particular for fashion or retail industries, to make optimal inventory decisions and avoid massive wastes. By tracking thousands of fashion trends on social media with state-of-the-art computer vision approaches, we propose a new model for fashion time series forecasting. Our contribution is twofold. We first provide publicly a dataset gathering 10000 weekly fashion time series. As influence dynamics are the key of emerging trend detection, we associate with each time series an external weak signal representing behaviours of influencers. Secondly, to leverage such a dataset, we propose a new hybrid forecasting model. Our approach combines per-time-series parametric models with seasonal components and a global recurrent neural network to include sporadic external signals. This hybrid model provides state-of-the-art results on the proposed fashion dataset, on the weekly time series of the M4 competition, and illustrates the benefit of the contribution of external weak signals.
Submitted 11 September, 2023; v1 submitted 7 February, 2022;
originally announced February 2022.
-
Learning Natural Language Generation from Scratch
Authors:
Alice Martin Donati,
Guillaume Quispe,
Charles Ollion,
Sylvain Le Corff,
Florian Strub,
Olivier Pietquin
Abstract:
This paper introduces TRUncated ReinForcement Learning for Language (TrufLL), an original approach to train conditional language models from scratch by only using reinforcement learning (RL). As RL methods unsuccessfully scale to large action spaces, we dynamically truncate the vocabulary space using a generic language model. TrufLL thus enables to train a language agent by solely interacting with its environment without any task-specific prior knowledge; it is only guided with a task-agnostic language model. Interestingly, this approach avoids the dependency to labelled datasets and inherently reduces pre-trained policy flaws such as language or exposure biases. We evaluate TrufLL on two visual question generation tasks, for which we report positive results over performance and language metrics, which we then corroborate with a human evaluation. To our knowledge, it is the first approach that successfully learns a language generation policy (almost) from scratch.
Submitted 20 September, 2021;
originally announced September 2021.
-
Disentangling Identifiable Features from Noisy Data with Structured Nonlinear ICA
Authors:
Hermanni Hälvä,
Sylvain Le Corff,
Luc Lehéricy,
Jonathan So,
Yongjie Zhu,
Elisabeth Gassiat,
Aapo Hyvarinen
Abstract:
We introduce a new general identifiable framework for principled disentanglement referred to as Structured Nonlinear Independent Component Analysis (SNICA). Our contribution is to extend the identifiability theory of deep generative models for a very broad class of structured models. While previous works have shown identifiability for specific classes of time-series models, our theorems extend this to more general temporal structures as well as to models with more complex structures such as spatial dependencies. In particular, we establish the major result that identifiability for this framework holds even in the presence of noise of unknown distribution. Finally, as an example of our framework's flexibility, we introduce the first nonlinear ICA model for time-series that combines the following very useful properties: it accounts for both nonstationarity and autocorrelation in a fully unsupervised setting; performs dimensionality reduction; models hidden states; and enables principled estimation and inference by variational maximum-likelihood.
Submitted 27 October, 2021; v1 submitted 17 June, 2021;
originally announced June 2021.
-
End-to-end deep meta modelling to calibrate and optimize energy consumption and comfort
Authors:
Max Cohen,
Sylvain Le Corff,
Maurice Charbit,
Marius Preda,
Gilles Nozière
Abstract:
In this paper, we propose a new end-to-end methodology to optimize the energy performance as well as comfort and air quality in large buildings without any renovation work. We introduce a metamodel based on recurrent neural networks and trained to predict the behavior of a general class of buildings using a database sampled from a simulation program. This metamodel is then deployed in different frameworks and its parameters are calibrated using the specific data of two real buildings. Parameters are estimated by comparing the predictions of the metamodel with real data obtained from sensors using the CMA-ES algorithm, a derivative free optimization procedure. Then, energy consumptions are optimized while maintaining a target thermal comfort and air quality, using the NSGA-II multi-objective optimization procedure. The numerical experiments illustrate how this metamodel ensures a significant gain in energy efficiency, up to almost 10%, while being computationally much more appealing than numerical models and flexible enough to be adapted to several types of buildings.
Submitted 5 November, 2021; v1 submitted 1 February, 2021;
originally announced May 2021.
-
NEO: Non Equilibrium Sampling on the Orbit of a Deterministic Transform
Authors:
Achille Thin,
Yazid Janati,
Sylvain Le Corff,
Charles Ollion,
Arnaud Doucet,
Alain Durmus,
Eric Moulines,
Christian Robert
Abstract:
Sampling from a complex distribution $π$ and approximating its intractable normalizing constant Z are challenging problems. In this paper, a novel family of importance samplers (IS) and Markov chain Monte Carlo (MCMC) samplers is derived. Given an invertible map T, these schemes combine (with weights) elements from the forward and backward orbits through points sampled from a proposal distribution $ρ$. The map T does not leave the target $π$ invariant, hence the name NEO, standing for Non-Equilibrium Orbits. NEO-IS provides unbiased estimators of the normalizing constant and self-normalized IS estimators of expectations under $π$, while NEO-MCMC combines multiple NEO-IS estimates of the normalizing constant and an iterated sampling-importance resampling mechanism to sample from $π$. For T chosen as a discrete-time integrator of a conformal Hamiltonian system, NEO-IS achieves state-of-the-art performance on difficult benchmarks and NEO-MCMC is able to explore highly multimodal targets. Additionally, we provide detailed theoretical results for both methods. In particular, we show that NEO-MCMC is uniformly geometrically ergodic and establish explicit mixing time estimates under mild conditions.
Submitted 23 August, 2021; v1 submitted 17 March, 2021;
originally announced March 2021.
-
Joint self-supervised blind denoising and noise estimation
Authors:
Jean Ollion,
Charles Ollion,
Elisabeth Gassiat,
Luc Lehéricy,
Sylvain Le Corff
Abstract:
We propose a novel self-supervised blind image denoising approach in which two neural networks jointly predict the clean signal and infer the noise distribution. Assuming that the noisy observations are conditionally independent given the signal, the networks can be jointly trained without clean training data. Therefore, our approach is particularly relevant for biomedical image denoising, where the noise is difficult to model precisely and clean training data are usually unavailable. Our method significantly outperforms current state-of-the-art self-supervised blind denoising algorithms on six publicly available biomedical image datasets. We also show empirically with synthetic noisy data that our model captures the noise distribution efficiently. Finally, the described framework is simple, lightweight and computationally efficient, making it useful in practical cases.
Submitted 16 February, 2021;
originally announced February 2021.
-
The Monte Carlo Transformer: a stochastic self-attention model for sequence prediction
Authors:
Alice Martin,
Charles Ollion,
Florian Strub,
Sylvain Le Corff,
Olivier Pietquin
Abstract:
This paper introduces the Sequential Monte Carlo Transformer, an original approach that naturally captures the distribution of the observations in a transformer architecture. The keys, queries, values and attention vectors of the network are considered as the unobserved stochastic states of its hidden structure. This generative model is such that, at each time step, the received observation is a random function of its past states in a given attention window. In this general state-space setting, we use Sequential Monte Carlo methods to approximate the posterior distributions of the states given the observations, and to estimate the gradient of the log-likelihood. We hence propose a generative model giving a predictive distribution instead of a single-point estimate.
Submitted 15 December, 2020; v1 submitted 15 July, 2020;
originally announced July 2020.
-
Deconvolution with unknown noise distribution is possible for multivariate signals
Authors:
Elisabeth Gassiat,
Sylvain Le Corff,
Luc Lehéricy
Abstract:
This paper considers the deconvolution problem in the case where the target signal is multidimensional and no information is known about the noise distribution. More precisely, no assumption is made on the noise distribution and no samples are available to estimate it: the deconvolution problem is solved based only on the corrupted signal observations. We establish the identifiability of the model up to translation when the signal has a Laplace transform with an exponential growth smaller than $2$ and when it can be decomposed into two dependent components. Then, we propose an estimator of the probability density function of the signal without any assumption on the noise distribution. As this estimator depends on the lightness of the tail of the signal distribution, which is usually unknown, a model selection procedure is proposed to obtain an estimator that is adaptive in this parameter, with the same rate of convergence as the estimator with a known tail parameter. Finally, we establish a lower bound on the minimax rate of convergence that matches the upper bound.
Submitted 17 February, 2021; v1 submitted 25 June, 2020;
originally announced June 2020.
-
End-to-end deep metamodeling to calibrate and optimize energy loads
Authors:
Max Cohen,
Maurice Charbit,
Sylvain Le Corff,
Marius Preda,
Gilles Nozière
Abstract:
In this paper, we propose a new end-to-end methodology to optimize the energy performance and the comfort, air quality and hygiene of large buildings. A metamodel based on a Transformer network is introduced and trained using a dataset sampled with a simulation program. Then, a few physical parameters and the building management system settings of this metamodel are calibrated using the CMA-ES optimization algorithm and real data obtained from sensors. Finally, the optimal settings to minimize the energy loads while maintaining a target thermal comfort and air quality are obtained using a multi-objective optimization procedure. The numerical experiments illustrate how this metamodel ensures a significant gain in energy efficiency while being computationally much more appealing than models requiring a huge number of physical parameters to be estimated.
Submitted 19 June, 2020;
originally announced June 2020.
-
Backward importance sampling for online estimation of state space models
Authors:
Alice Martin,
Marie-Pierre Etienne,
Pierre Gloaguen,
Sylvain Le Corff,
Jimmy Olsson
Abstract:
This paper proposes a new Sequential Monte Carlo algorithm to perform online estimation in the context of state space models when either the transition density of the latent state or the conditional likelihood of an observation given a state is intractable. In this setting, obtaining low variance estimators of expectations under the posterior distributions of the unobserved states given the observations is a challenging task. Following recent theoretical results for pseudo-marginal sequential Monte Carlo smoothers, a pseudo-marginal backward importance sampling step is introduced to estimate such expectations. This new step significantly reduces the computational time of existing numerical solutions based on an acceptance-rejection procedure, for similar performance, and broadens the class of eligible models for such methods. For instance, in the context of multivariate stochastic differential equations, the proposed algorithm makes use of unbiased estimates of the unknown transition densities under much weaker assumptions than standard alternatives. The performance of this estimator is assessed for high-dimensional discrete-time latent data models, for recursive maximum likelihood estimation in the context of partially observed diffusion processes, and in the case of a bidimensional partially observed stochastic Lotka-Volterra model.
Submitted 7 May, 2021; v1 submitted 13 February, 2020;
originally announced February 2020.
-
A pseudo-marginal sequential Monte Carlo online smoothing algorithm
Authors:
Pierre Gloaguen,
Sylvain Le Corff,
Jimmy Olsson
Abstract:
We consider online computation of expectations of additive state functionals under general path probability measures proportional to products of unnormalised transition densities. These transition densities are assumed to be intractable but possible to estimate, with or without bias. Using pseudo-marginalisation techniques we are able to extend the particle-based, rapid incremental smoother (PaRIS) algorithm proposed in [J. Olsson and J. Westerborn. Efficient particle-based online smoothing in general hidden Markov models: The PaRIS algorithm. Bernoulli, 23(3):1951-1996, 2017] to this setting. The resulting algorithm, which has a linear complexity in the number of particles and constant memory requirements, applies to a wide range of challenging path-space Monte Carlo problems, including smoothing in partially observed diffusion processes and models with intractable likelihood. The algorithm is furnished with several theoretical results, including a central limit theorem, establishing its convergence and numerical stability. Moreover, under strong mixing assumptions we establish a novel $O(n \varepsilon)$ bound on the asymptotic bias of the algorithm, where $n$ is the path length and $\varepsilon$ controls the bias of the density estimators.
Submitted 12 April, 2021; v1 submitted 20 August, 2019;
originally announced August 2019.
-
Identifiability and consistent estimation of nonparametric translation hidden Markov models with general state space
Authors:
Elisabeth Gassiat,
Sylvain Le Corff,
Luc Lehéricy
Abstract:
This paper considers hidden Markov models where the observations are given as the sum of a latent state which lies in a general state space and some independent noise with unknown distribution. It is shown that these fully nonparametric translation models are identifiable with respect to both the distribution of the latent variables and the distribution of the noise, under mostly a light tail assumption on the latent variables. Two nonparametric estimation methods are proposed and we prove that the corresponding estimators are consistent for the weak convergence topology. These results are illustrated with numerical experiments.
Submitted 29 January, 2020; v1 submitted 4 February, 2019;
originally announced February 2019.
-
A Bayesian nonparametric approach for generalized Bradley-Terry models in random environment
Authors:
Sylvain Le Corff,
Matthieu Lerasle,
Elodie Vernet
Abstract:
This paper deals with the estimation of the unknown distribution of hidden random variables from the observation of pairwise comparisons between these variables. This problem is inspired by recent developments on Bradley-Terry models in random environment, since this framework is relevant, for instance, to predicting the outcome of a championship from the observation of a few contests per team. This paper provides three contributions on a Bayesian nonparametric approach to solve this problem. First, we establish contraction rates of the posterior distribution. We also propose a Markov Chain Monte Carlo algorithm to approximately sample from this posterior distribution, inspired by a recent Bayesian nonparametric method for hidden Markov models. Finally, the performance of this algorithm is assessed by comparing predictions of the outcome of a championship based on the actual values of the teams with those obtained by sampling from the estimated posterior distribution.
Submitted 24 August, 2018;
originally announced August 2018.
-
Learning the distribution of latent variables in paired comparison models with round-robin scheduling
Authors:
Roland Diel,
Sylvain Le Corff,
Matthieu Lerasle
Abstract:
Paired comparison data considered in this paper originate from the comparison of a large number N of individuals in pairs. The dataset is a collection of results of contests between two individuals when each of them has faced n opponents, where n is much smaller than N. Individuals are represented by independent and identically distributed random parameters characterizing their abilities. The paper studies the maximum likelihood estimator of the distribution of these parameters. The analysis relies on the construction of a graphical model encoding conditional dependencies of the observations, which are the outcomes of the first n contests each individual is involved in. This graphical model makes it possible to prove geometric loss of memory properties and to deduce the asymptotic behavior of the likelihood function. This paper focuses on graphical models obtained from round-robin scheduling of these contests. Following a classical construction in learning theory, the asymptotic likelihood is used to measure the performance of the maximum likelihood estimator. Risk bounds for this estimator are finally obtained by sub-Gaussian deviation results for Markov chains applied to the graphical model.
Submitted 13 February, 2020; v1 submitted 5 July, 2017;
originally announced July 2017.
-
Particle rejuvenation of Rao-Blackwellized Sequential Monte Carlo smoothers for Conditionally Linear and Gaussian models
Authors:
Ngoc Minh Nguyen,
Sylvain Le Corff,
Eric Moulines
Abstract:
This paper focuses on Sequential Monte Carlo approximations of smoothing distributions in conditionally linear and Gaussian state spaces. To reduce Monte Carlo variance of smoothers, it is typical in these models to use Rao-Blackwellization: particle approximation is used to sample sequences of hidden regimes while the Gaussian states are explicitly integrated conditional on the sequence of regimes and observations, using variants of the Kalman filter / smoother. The first successful attempt to use Rao-Blackwellization for smoothing extends the Bryson-Frazier smoother for Gaussian linear state space models using the generalized two-filter formula together with Kalman filters / smoothers. More recently, a forward backward decomposition of smoothing distributions mimicking the Rauch-Tung-Striebel smoother for the regimes combined with backward Kalman updates has been introduced. This paper investigates the benefit of introducing additional rejuvenation steps in all these algorithms to sample at each time instant new regimes conditional on the forward and backward particles. This defines particle based approximations of the smoothing distributions whose support is not restricted to the set of particles sampled in the forward or backward filter. These procedures are applied to commodity markets which are described using a two factor model based on the spot price and a convenience yield for crude oil data.
Submitted 5 July, 2017;
originally announced July 2017.
-
Online Sequential Monte Carlo smoother for partially observed stochastic differential equations
Authors:
Pierre Gloaguen,
Marie-Pierre Etienne,
Sylvain Le Corff
Abstract:
This paper introduces a new algorithm to approximate smoothed additive functionals for partially observed stochastic differential equations. This method relies on a recent procedure which makes it possible to compute such approximations online, i.e. as the observations are received, and with a computational complexity growing linearly with the number of Monte Carlo samples. This online smoother cannot be used directly in the case of partially observed stochastic differential equations since the transition density of the latent data is usually unknown. We prove that a similar algorithm may still be defined for partially observed continuous processes by replacing this unknown quantity with an unbiased estimator obtained, for instance, using general Poisson estimators. We prove that this estimator is consistent, and its performance is illustrated using data from two models.
Submitted 6 March, 2017;
originally announced March 2017.
-
On the two-filter approximations of marginal smoothing distributions in general state space models
Authors:
Thi Ngoc Minh Nguyen,
Sylvain Le Corff,
Eric Moulines
Abstract:
A prevalent problem in general state space models is the approximation of the smoothing distribution of a state conditional on the observations from the past, the present, and the future. The aim of this paper is to provide a rigorous analysis of such approximations of smoothed distributions provided by the two-filter algorithms. We extend the results available for the approximation of smoothing distributions to these two-filter approaches which combine a forward filter approximating the filtering distributions with a backward information filter approximating a quantity proportional to the posterior distribution of the state given future observations.
Submitted 27 May, 2016;
originally announced May 2016.
-
Optimal scaling of the Random Walk Metropolis algorithm under Lp mean differentiability
Authors:
Alain Durmus,
Sylvain Le Corff,
Eric Moulines,
Gareth O. Roberts
Abstract:
This paper considers the optimal scaling problem for high-dimensional random walk Metropolis algorithms for densities which are differentiable in Lp mean but which may be irregular at some points (like the Laplace density for example) and/or are supported on an interval. Our main result is the weak convergence of the Markov chain (appropriately rescaled in time and space) to a Langevin diffusion process as the dimension d goes to infinity. Because the log-density might be non-differentiable, the limiting diffusion could be singular. The scaling limit is established under assumptions which are much weaker than those used in the original derivation of [6]. This result has important practical implications for the use of random walk Metropolis algorithms in Bayesian frameworks based on sparsity-inducing priors.
Submitted 22 April, 2016;
originally announced April 2016.
-
Stochastic differential equation based on a multimodal potential to model movement data in ecology
Authors:
Pierre Gloaguen,
Marie-Pierre Etienne,
Sylvain Le Corff
Abstract:
This paper proposes a new model for individual movement in ecology. The movement process is defined as a solution to a stochastic differential equation whose drift is the gradient of a multimodal potential surface. This offers a new flexible approach among the popular potential-based movement models in ecology. To perform parameter inference, the widely used Euler method is compared with two other pseudo-likelihood procedures and with a Monte Carlo Expectation Maximization approach based on exact simulation of diffusions. The performance of all methods is assessed with simulated data and with a data set of fishing vessel trajectories. We show that the usual Euler method performs worse than the other procedures for all sampling schemes.
Submitted 21 September, 2017; v1 submitted 30 September, 2015;
originally announced September 2015.
-
Consistent estimation of the filtering and marginal smoothing distributions in nonparametric hidden Markov models
Authors:
Yohann De Castro,
Elisabeth Gassiat,
Sylvain Le Corff
Abstract:
In this paper, we consider the filtering and smoothing recursions in nonparametric finite state space hidden Markov models (HMMs) when the parameters of the model are unknown and replaced by estimators. We provide an explicit and time uniform control of the filtering and smoothing errors in total variation norm as a function of the parameter estimation errors. We prove that the risk for the filtering and smoothing errors may be uniformly upper bounded by the risk of the estimators. It has been proved very recently that statistical inference for finite state space nonparametric HMMs is possible. We study how the recent spectral methods developed in the parametric setting may be extended to the nonparametric framework and we give explicit upper bounds for the L2-risk of the nonparametric spectral estimators. When the observation space is compact, this provides explicit rates for the filtering and smoothing errors in total variation norm. The performance of the spectral method is assessed with simulated data for both the estimation of the (nonparametric) conditional distribution of the observations and the estimation of the marginal smoothing distributions.
Submitted 23 July, 2015;
originally announced July 2015.
-
Statistical Inference for Oscillation Processes
Authors:
Rainer Dahlhaus,
Thierry Dumont,
Sylvain Le Corff,
Jan C. Neddermeyer
Abstract:
A new model for time series with a specific oscillation pattern is proposed. The model consists of a hidden phase process controlling the speed of polling and a nonparametric curve characterizing the pattern, leading together to a generalized state space model. Identifiability of the model is proved and a method for statistical inference based on a particle smoother and a nonparametric EM algorithm is developed. In particular, the oscillation pattern and the unobserved phase process are estimated. The proposed algorithms are computationally efficient and their performance is assessed through simulations and an application to human electrocardiogram recordings.
Submitted 12 August, 2016; v1 submitted 16 December, 2014;
originally announced December 2014.
-
A shrinkage-thresholding Metropolis adjusted Langevin algorithm for Bayesian variable selection
Authors:
Amandine Schreck,
Gersende Fort,
Sylvain Le Corff,
Eric Moulines
Abstract:
This paper introduces a new Markov Chain Monte Carlo method for Bayesian variable selection in high-dimensional settings. The algorithm is a Hastings-Metropolis sampler with a proposal mechanism which combines a Metropolis Adjusted Langevin (MALA) step to propose local moves with a shrinkage-thresholding step to propose new models. The geometric ergodicity of this new trans-dimensional Markov Chain Monte Carlo sampler is established. An extensive numerical experiment, on simulated and real data, is presented to illustrate the performance of the proposed algorithm in comparison with some more classical trans-dimensional algorithms.
Submitted 11 September, 2015; v1 submitted 19 December, 2013;
originally announced December 2013.
-
Simultaneous Localization and Mapping Problem in Wireless Sensor Networks
Authors:
Thierry Dumont,
Sylvain Le Corff
Abstract:
Mobile device localization in wireless sensor networks is a challenging task. It has already been addressed when the WiFi propagation maps of the access points are modeled deterministically. However, this procedure does not take into account the environmental dynamics and also assumes an offline human training calibration. In this paper, the maps are made of an average indoor propagation model combined with a perturbation field which represents the influence of the environment. This perturbation field is embedded with a prior distribution. The device localization is dealt with using Sequential Monte Carlo methods and relies on the estimation of the propagation maps. This inference task is performed online, i.e. using the observations sequentially, with a recently proposed online Expectation Maximization based algorithm. The performance of the algorithm is illustrated through Monte Carlo experiments.
Submitted 26 September, 2012;
originally announced September 2012.
-
Nonparametric regression on hidden phi-mixing variables: identifiability and consistency of a pseudo-likelihood based estimation procedure
Authors:
Thierry Dumont,
Sylvain Le Corff
Abstract:
This paper outlines a new nonparametric estimation procedure for unobserved phi-mixing processes. It is assumed that the only information on the stationary hidden states (Xk) is given by the process (Yk), where Yk is a noisy observation of f(Xk). The paper introduces a maximum pseudo-likelihood procedure to estimate the function f and the distribution of the hidden states using blocks of observations of length b. The identifiability of the model is studied in the particular cases b=1 and b=2. The consistency of the estimators of f and of the distribution of the hidden states as the number of observations grows to infinity is established.
Submitted 26 August, 2015; v1 submitted 4 September, 2012;
originally announced September 2012.
-
Convergence of a Particle-based Approximation of the Block Online Expectation Maximization Algorithm
Authors:
Sylvain Le Corff,
Gersende Fort
Abstract:
Online variants of the Expectation Maximization (EM) algorithm have recently been proposed to perform parameter inference with large data sets or data streams, in independent latent models and in hidden Markov models. Nevertheless, the convergence properties of these algorithms remain an open problem at least in the hidden Markov case. This contribution deals with a new online EM algorithm which updates the parameter at some deterministic times. Some convergence results have been derived even in general latent models such as hidden Markov models. These properties rely on the assumption that some intermediate quantities are available in closed form or can be approximated by Monte Carlo methods when the Monte Carlo error vanishes rapidly enough. In this paper, we propose an algorithm which approximates these quantities using Sequential Monte Carlo methods. The convergence of this algorithm and of an averaged version is established and their performance is illustrated through Monte Carlo experiments.
Submitted 30 May, 2012; v1 submitted 5 November, 2011;
originally announced November 2011.
-
Supplement paper to "Online Expectation Maximization based algorithms for inference in hidden Markov models"
Authors:
Sylvain Le Corff,
Gersende Fort
Abstract:
This is supplementary material to the paper "Online Expectation Maximization based algorithms for inference in hidden Markov models". It contains further technical derivations and additional simulation results.
Submitted 16 October, 2012; v1 submitted 20 August, 2011;
originally announced August 2011.
-
Online Expectation Maximization based algorithms for inference in hidden Markov models
Authors:
Sylvain Le Corff,
Gersende Fort
Abstract:
The Expectation Maximization (EM) algorithm is a versatile tool for model parameter estimation in latent data models. When processing large data sets or data streams, however, EM becomes intractable since it requires the whole data set to be available at each iteration of the algorithm. In this contribution, a new generic online EM algorithm for model parameter inference in general hidden Markov models is proposed. This new algorithm updates the parameter estimate after a block of observations is processed (online). The convergence of this new algorithm is established, and the rate of convergence is studied, showing the impact of the block size. An averaging procedure is also proposed to improve the rate of convergence. Finally, practical illustrations are presented to highlight the performance of these algorithms in comparison to other online maximum likelihood procedures.
Submitted 16 October, 2012; v1 submitted 19 August, 2011;
originally announced August 2011.
-
Non-asymptotic deviation inequalities for smoothed additive functionals in non-linear state-space models
Authors:
Cyrille Dubarry,
Sylvain Le Corff
Abstract:
The approximation of fixed-interval smoothing distributions is a key issue in inference for general state-space hidden Markov models (HMM). This contribution establishes non-asymptotic bounds for the Forward Filtering Backward Smoothing (FFBS) and the Forward Filtering Backward Simulation (FFBSi) estimators of fixed-interval smoothing functionals. We show that the rate of convergence of the Lq-mean errors of both methods depends on the number of observations T and the number of particles N only through the ratio T/N for additive functionals. In the case of the FFBS, this improves recent results providing bounds depending on T and the square root of N.
Submitted 26 April, 2012; v1 submitted 19 December, 2010;
originally announced December 2010.