-
Antithetic Multilevel Methods for Elliptic and Hypo-Elliptic Diffusions with Applications
Authors:
Yuga Iguchi,
Ajay Jasra,
Mohamed Maama,
Alexandros Beskos
Abstract:
In this paper, we present a new antithetic multilevel Monte Carlo (MLMC) method for the estimation of expectations with respect to laws of diffusion processes that can be elliptic or hypo-elliptic. In particular, we consider the case where one has to resort to time discretization of the diffusion and numerical simulation of such schemes. Motivated by recent developments, we introduce a new MLMC estimator of expectations, which does not require simulation of intractable Lévy areas but has a weak error of order 2 and achieves the optimal computational complexity. We then show how this approach can be used in the context of the filtering problem associated to partially observed diffusions with discrete time observations. We illustrate with numerical simulations that our new approaches provide efficiency gains for several problems relative to some existing methods.
Submitted 20 March, 2024;
originally announced March 2024.
-
Parameter Inference for Hypo-Elliptic Diffusions under a Weak Design Condition
Authors:
Yuga Iguchi,
Alexandros Beskos
Abstract:
We address the problem of parameter estimation for degenerate diffusion processes defined via the solution of Stochastic Differential Equations (SDEs) with diffusion matrix that is not full-rank. For this class of hypo-elliptic diffusions recent works have proposed contrast estimators that are asymptotically normal, provided that the step-size in-between observations $Δ=Δ_n$ and their total number $n$ satisfy $n \to \infty$, $n Δ_n \to \infty$, $Δ_n \to 0$, and additionally $Δ_n = o (n^{-1/2})$. This latter restriction places a requirement for a so-called `rapidly increasing experimental design'. In this paper, we overcome this limitation and develop a general contrast estimator satisfying asymptotic normality under the weaker design condition $Δ_n = o(n^{-1/p})$ for general $p \ge 2$. Such a result has been obtained for elliptic SDEs in the literature, but its derivation in a hypo-elliptic setting is highly non-trivial. We provide numerical results to illustrate the advantages of the developed theory.
Submitted 7 December, 2023;
originally announced December 2023.
-
Graph Sphere: From Nodes to Supernodes in Graphical Models
Authors:
Willem van den Boom,
Maria De Iorio,
Alexandros Beskos,
Ajay Jasra
Abstract:
High-dimensional data analysis typically focuses on low-dimensional structure, often to aid interpretation and computational efficiency. Graphical models provide a powerful methodology for learning the conditional independence structure in multivariate data by representing variables as nodes and dependencies as edges. Inference is often focused on individual edges in the latent graph. Nonetheless, there is increasing interest in determining more complex structures, such as communities of nodes, for multiple reasons, including more effective information retrieval and better interpretability. In this work, we propose a multilayer graphical model where we first cluster nodes and then, at the second layer, investigate the relationships among groups of nodes. Specifically, nodes are partitioned into "supernodes" with a data-coherent size-biased tessellation prior which combines ideas from Bayesian nonparametrics and Voronoi tessellations. This construct allows accounting also for dependence of nodes within supernodes. At the second layer, dependence structure among supernodes is modelled through a Gaussian graphical model, where the focus of inference is on "superedges". We provide theoretical justification for our modelling choices. We design tailored Markov chain Monte Carlo schemes, which also enable parallel computations. We demonstrate the effectiveness of our approach for large-scale structure learning in simulations and a transcriptomics application.
Submitted 18 October, 2023;
originally announced October 2023.
-
Parameter Inference for Degenerate Diffusion Processes
Authors:
Yuga Iguchi,
Alexandros Beskos,
Matthew Graham
Abstract:
We study parametric inference for ergodic diffusion processes with a degenerate diffusion matrix. Existing research focuses on a particular class of hypo-elliptic SDEs, with components split into `rough'/`smooth' and noise from rough components propagating directly onto smooth ones, but some critical model classes arising in applications have yet to be explored. We aim to cover this gap, thus analyse the highly degenerate class of SDEs, where components split into further sub-groups. Such models include e.g. the notable case of generalised Langevin equations. We propose a tailored time-discretisation scheme and provide asymptotic results supporting our scheme in the context of high-frequency, full observations. The proposed discretisation scheme is applicable in much more general data regimes and is shown to overcome biases via simulation studies also in the practical case when only a smooth component is observed. Joint consideration of our study for highly degenerate SDEs and existing research provides a general `recipe' for the development of time-discretisation schemes to be used within statistical methods for general classes of hypo-elliptic SDEs.
Submitted 28 May, 2024; v1 submitted 31 July, 2023;
originally announced July 2023.
-
Sequential Markov Chain Monte Carlo for Lagrangian Data Assimilation with Applications to Unknown Data Locations
Authors:
Hamza Ruzayqat,
Alexandros Beskos,
Dan Crisan,
Ajay Jasra,
Nikolas Kantas
Abstract:
We consider a class of high-dimensional spatial filtering problems, where the spatial locations of observations are unknown and driven by the partially observed hidden signal. This problem is exceptionally challenging as not only is it high-dimensional, but the model for the signal yields longer-range time dependencies through the observation locations. Motivated by this model we revisit a lesser-known and \emph{provably convergent} computational methodology from \cite{berzuini, cent, martin} that uses sequential Markov Chain Monte Carlo (MCMC) chains. We extend this methodology for data filtering problems with unknown observation locations. We benchmark our algorithms on Linear Gaussian state space models against competing ensemble methods and demonstrate a significant improvement in both execution speed and accuracy. Finally, we implement a realistic case study on a high-dimensional rotating shallow water model (of about $10^4-10^5$ dimensions) with real and synthetic data. The data is provided by the National Oceanic and Atmospheric Administration (NOAA) and contains observations from ocean drifters in a domain of the Atlantic Ocean restricted to the longitude and latitude intervals $[-51^{\circ}, -41^{\circ}]$, $[17^{\circ}, 27^{\circ}]$ respectively.
Submitted 5 March, 2024; v1 submitted 30 April, 2023;
originally announced May 2023.
-
Parameter Estimation with Increased Precision for Elliptic and Hypo-elliptic Diffusions
Authors:
Yuga Iguchi,
Alexandros Beskos,
Matthew M. Graham
Abstract:
This work aims at making a comprehensive contribution in the general area of parametric inference for discretely observed diffusion processes. Established approaches for likelihood-based estimation invoke a time-discretisation scheme for the approximation of the intractable transition dynamics of the Stochastic Differential Equation (SDE) model over finite time periods. The scheme is applied for a step-size that is either user-selected or determined by the data. Recent research has highlighted the critical effect of the choice of numerical scheme on the behaviour of derived parameter estimates in the setting of hypo-elliptic SDEs. In brief, in our work, first, we develop two weak second order sampling schemes (to cover both hypo-elliptic and elliptic SDEs) and produce a small time expansion for the density of the schemes to form a proxy for the true intractable SDE transition density. Then, we establish a collection of analytic results for likelihood-based parameter estimates obtained via the formed proxies, thus providing a theoretical framework that showcases advantages from the use of the developed methodology for SDE calibration. We present numerical results from carrying out classical or Bayesian inference, for both elliptic and hypo-elliptic SDEs.
Submitted 29 January, 2024; v1 submitted 29 November, 2022;
originally announced November 2022.
-
A Bayesian framework for genome-wide inference of DNA methylation levels
Authors:
Marcel Hirt,
Axel Finke,
Alexandros Beskos,
Petros Dellaportas,
Stephan Beck,
Ismail Moghul,
Simone Ecker
Abstract:
DNA methylation is an important epigenetic mark that has been studied extensively for its regulatory role in biological processes and diseases. WGBS allows for genome-wide measurements of DNA methylation up to single-base resolutions, yet poses challenges in identifying significantly different methylation patterns across distinct biological conditions. We propose a novel methylome change-point model which describes the joint dynamics of methylation regimes of a case and a control group and benefits from taking into account the information of neighbouring methylation sites among all available samples. We also devise particle filtering and smoothing algorithms to perform efficient inference of the latent methylation patterns. We illustrate that our approach can detect and test for very flexible differential methylation signatures with high power while controlling Type-I error measures.
Submitted 14 November, 2022;
originally announced November 2022.
-
Change point detection in dynamic Gaussian graphical models: the impact of COVID-19 pandemic on the US stock market
Authors:
Beatrice Franzolini,
Alexandros Beskos,
Maria De Iorio,
Warrick Poklewski Koziell,
Karolina Grzeszkiewicz
Abstract:
Reliable estimates of volatility and correlation are fundamental in economics and finance for understanding the impact of macroeconomic events on the market and guiding future investments and policies. Dependence across financial returns is likely to be subject to sudden structural changes, especially in correspondence with major global events, such as the COVID-19 pandemic. In this work, we are interested in capturing abrupt changes over time in the dependence across US industry stock portfolios, over a time horizon that covers the COVID-19 pandemic. The selected stocks give a comprehensive picture of the US stock market. To this end, we develop a Bayesian multivariate stochastic volatility model based on a time-varying sequence of graphs capturing the evolution of the dependence structure. The model builds on the Gaussian graphical models and the random change points literature. In particular, we treat the number, the position of change points, and the graphs as objects of posterior inference, allowing for sparsity in graph recovery and change point detection. The high dimension of the parameter space poses complex computational challenges. However, the model admits a hidden Markov model formulation. This leads to the development of an efficient computational strategy, based on a combination of sequential Monte-Carlo and Markov chain Monte-Carlo techniques. Model and computational development are widely applicable, beyond the scope of the application of interest in this work.
Submitted 23 May, 2023; v1 submitted 1 August, 2022;
originally announced August 2022.
-
Bayesian Learning of Graph Substructures
Authors:
Willem van den Boom,
Maria De Iorio,
Alexandros Beskos
Abstract:
Graphical models provide a powerful methodology for learning the conditional independence structure in multivariate data. Inference is often focused on estimating individual edges in the latent graph. Nonetheless, there is increasing interest in inferring more complex structures, such as communities, for multiple reasons, including more effective information retrieval and better interpretability. Stochastic blockmodels offer a powerful tool to detect such structure in a network. We thus propose to exploit advances in random graph theory and embed them within the graphical models framework. A consequence of this approach is the propagation of the uncertainty in graph estimation to large-scale structure learning. We consider Bayesian nonparametric stochastic blockmodels as priors on the graph. We extend such models to consider clique-based blocks and to multiple graph settings introducing a novel prior process based on a Dependent Dirichlet process. Moreover, we devise a tailored computation strategy of Bayes factors for block structure based on the Savage-Dickey ratio to test for presence of larger structure in a graph. We demonstrate our approach in simulations as well as on real data applications in finance and transcriptomics.
Submitted 11 October, 2022; v1 submitted 22 March, 2022;
originally announced March 2022.
-
Unbiased Estimation using a Class of Diffusion Processes
Authors:
Hamza Ruzayqat,
Alexandros Beskos,
Dan Crisan,
Ajay Jasra,
Nikolas Kantas
Abstract:
We study the problem of unbiased estimation of expectations with respect to (w.r.t.) $π$ a given, general probability measure on $(\mathbb{R}^d,\mathcal{B}(\mathbb{R}^d))$ that is absolutely continuous with respect to a standard Gaussian measure. We focus on simulation associated to a particular class of diffusion processes, sometimes termed the Schrödinger-Föllmer Sampler, which is a simulation technique that approximates the law of a particular diffusion bridge process $\{X_t\}_{t\in [0,1]}$ on $\mathbb{R}^d$, $d\in \mathbb{N}_0$. This latter process is constructed such that, starting at $X_0=0$, one has $X_1\sim π$. Typically, the drift of the diffusion is intractable and, even if it were not, exact sampling of the associated diffusion is not possible. As a result, \cite{sf_orig,jiao} consider a stochastic Euler-Maruyama scheme that allows the development of biased estimators for expectations w.r.t.~$π$. We show that for this methodology to achieve a mean square error of $\mathcal{O}(ε^2)$, for arbitrary $ε>0$, the associated cost is $\mathcal{O}(ε^{-5})$. We then introduce an alternative approach that provides unbiased estimates of expectations w.r.t.~$π$, that is, it does not suffer from the time discretization bias or the bias related with the approximation of the drift function. We prove that to achieve a mean square error of $\mathcal{O}(ε^2)$, the associated cost is, with high probability, $\mathcal{O}(ε^{-2}|\log(ε)|^{2+δ})$, for any $δ>0$. We implement our method on several examples including Bayesian inverse problems.
Submitted 19 September, 2022; v1 submitted 6 March, 2022;
originally announced March 2022.
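For readers unfamiliar with the Euler-Maruyama scheme mentioned in the abstract above, the following is a minimal generic sketch for a scalar SDE $dX_t = b(t, X_t)\,dt + σ(t, X_t)\,dW_t$. The function name `euler_maruyama` and its arguments are illustrative assumptions; the drift in the Schrödinger-Föllmer setting is intractable and has to be approximated separately, which this sketch does not attempt.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, t_end, n_steps, rng):
    """Simulate one Euler-Maruyama path of a scalar SDE on [0, t_end].

    drift(t, x) and diffusion(t, x) are user-supplied callables
    (hypothetical names, not taken from the paper).
    Returns the list of n_steps + 1 states, including x0.
    """
    dt = t_end / n_steps
    x = x0
    path = [x]
    for k in range(n_steps):
        t = k * dt
        # Brownian increment over a step of length dt: N(0, dt).
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(t, x) * dt + diffusion(t, x) * dw
        path.append(x)
    return path
```

Halving the step size doubles the cost per path, which is the trade-off between discretization bias and cost that the complexity statements in the abstract quantify.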
-
A Lagged Particle Filter for Stable Filtering of certain High-Dimensional State-Space Models
Authors:
Hamza Ruzayqat,
Aimad Er-Raiy,
Alexandros Beskos,
Dan Crisan,
Ajay Jasra,
Nikolas Kantas
Abstract:
We consider the problem of high-dimensional filtering of state-space models (SSMs) at discrete times. This problem is particularly challenging as analytical solutions are typically not available and many numerical approximation methods can have a cost that scales exponentially with the dimension of the hidden state. Inspired by lag-approximation methods for the smoothing problem, we introduce a lagged approximation of the smoothing distribution that is necessarily biased. For certain classes of SSMs, particularly those that forget the initial condition exponentially fast in time, the bias of our approximation is shown to be uniformly controlled in the dimension and exponentially small in time. We develop a sequential Monte Carlo (SMC) method to recursively estimate expectations with respect to our biased filtering distributions. Moreover, we prove for a class of SSMs that can contain dependencies amongst coordinates that as the dimension $d\rightarrow\infty$ the cost to achieve a stable mean square error in estimation, for classes of expectations, is of $\mathcal{O}(Nd^2)$ per-unit time, where $N$ is the number of simulated samples in the SMC algorithm. Our methodology is implemented on several challenging high-dimensional examples including the conservative shallow-water model.
Submitted 12 January, 2022; v1 submitted 2 October, 2021;
originally announced October 2021.
-
The G-Wishart Weighted Proposal Algorithm: Efficient Posterior Computation for Gaussian Graphical Models
Authors:
Willem van den Boom,
Alexandros Beskos,
Maria De Iorio
Abstract:
Gaussian graphical models can capture complex dependency structures among variables. For such models, Bayesian inference is attractive as it provides principled ways to incorporate prior information and to quantify uncertainty through the posterior distribution. However, posterior computation under the conjugate G-Wishart prior distribution on the precision matrix is expensive for general non-decomposable graphs. We therefore propose a new Markov chain Monte Carlo (MCMC) method named the G-Wishart weighted proposal algorithm (WWA). WWA's distinctive features include delayed acceptance MCMC, Gibbs updates for the precision matrix and an informed proposal distribution on the graph space that enables embarrassingly parallel computations. Compared to existing approaches, WWA reduces the frequency of the relatively expensive sampling from the G-Wishart distribution. This results in faster MCMC convergence, improved MCMC mixing and reduced computing time. Numerical studies on simulated and real data show that WWA provides a more efficient tool for posterior inference than competing state-of-the-art MCMC algorithms.
Submitted 4 April, 2023; v1 submitted 3 August, 2021;
originally announced August 2021.
-
Unbiased approximation of posteriors via coupled particle Markov chain Monte Carlo
Authors:
Willem van den Boom,
Ajay Jasra,
Maria De Iorio,
Alexandros Beskos,
Johan G. Eriksson
Abstract:
Markov chain Monte Carlo (MCMC) is a powerful methodology for the approximation of posterior distributions. However, the iterative nature of MCMC does not naturally facilitate its use with modern highly parallel computation on HPC and cloud environments. Another concern is the identification of the bias and Monte Carlo error of produced averages. The above have prompted the recent development of fully ('embarrassingly') parallel unbiased Monte Carlo methodology based on coupling of MCMC algorithms. A caveat is that formulation of effective coupling is typically not trivial and requires model-specific technical effort. We propose coupling of MCMC chains deriving from sequential Monte Carlo (SMC) by considering adaptive SMC methods in combination with recent advances in unbiased estimation for state-space models. Coupling is then achieved at the SMC level and is, in principle, not problem-specific. The resulting methodology enjoys desirable theoretical properties. A central motivation is to extend unbiased MCMC to more challenging targets compared to the ones typically considered in the relevant literature. We illustrate the effectiveness of the algorithm via application to two complex statistical models: (i) horseshoe regression; (ii) Gaussian graphical models.
Submitted 27 April, 2023; v1 submitted 8 March, 2021;
originally announced March 2021.
-
Score-Based Parameter Estimation for a Class of Continuous-Time State Space Models
Authors:
Alexandros Beskos,
Dan Crisan,
Ajay Jasra,
Nikolas Kantas,
Hamza Ruzayqat
Abstract:
We consider the problem of parameter estimation for a class of continuous-time state space models. In particular, we explore the case of a partially observed diffusion, with data also arriving according to a diffusion process. Based upon a standard identity of the score function, we consider two particle filter based methodologies to estimate the score function. Both methods rely on an online estimation algorithm for the score function of $\mathcal{O}(N^2)$ cost, with $N\in\mathbb{N}$ the number of particles. The first approach employs a simple Euler discretization and standard particle smoothers and is of cost $\mathcal{O}(N^2 + NΔ_l^{-1})$ per unit time, where $Δ_l=2^{-l}$, $l\in\mathbb{N}_0$, is the time-discretization step. The second approach is new and based upon a novel diffusion bridge construction. It yields a new backward type Feynman-Kac formula in continuous-time for the score function and is presented along with a particle method for its approximation. Considering a time-discretization, the cost is $\mathcal{O}(N^2Δ_l^{-1})$ per unit time. To improve computational costs, we then consider multilevel methodologies for the score function. We illustrate our parameter estimation method via stochastic gradient approaches in several numerical examples.
Submitted 15 March, 2021; v1 submitted 18 August, 2020;
originally announced August 2020.
-
MCMC Algorithms for Posteriors on Matrix Spaces
Authors:
Alexandros Beskos,
Kengo Kamatani
Abstract:
We study Markov chain Monte Carlo (MCMC) algorithms for target distributions defined on matrix spaces. Such an important sampling problem has yet to be analytically explored. We carry out a major step in covering this gap by developing the proper theoretical framework that allows for the identification of ergodicity properties of typical MCMC algorithms, relevant in such a context. Beyond the standard Random-Walk Metropolis (RWM) and preconditioned Crank--Nicolson (pCN), a contribution of this paper is the development of a novel algorithm, termed the `Mixed' pCN (MpCN). RWM and pCN are shown not to be geometrically ergodic for an important class of matrix distributions with heavy tails. In contrast, MpCN is robust across targets with different tail behaviour and has very good empirical performance within the class of heavy-tailed distributions. Geometric ergodicity for MpCN is not fully proven in this work, as some remaining drift conditions are quite challenging to obtain owing to the complexity of the state space. We do, however, make a lot of progress towards a proof, and show in detail the last steps left for future work. We illustrate the computational performance of the various algorithms through numerical applications, including calibration on real data of a challenging model arising in financial statistics.
Submitted 6 November, 2021; v1 submitted 6 August, 2020;
originally announced August 2020.
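As a reminder of what the pCN proposal named in the abstract above looks like, here is a minimal sketch of one standard pCN step on a vector-valued state, assuming a target with density proportional to $\exp(-Φ(x))$ with respect to a standard Gaussian reference measure. The names `pcn_step`, `phi` and `beta` are illustrative assumptions; the sketch covers neither the matrix-space setting of the paper nor its MpCN variant.

```python
import math
import random


def pcn_step(x, phi, beta, rng):
    """One preconditioned Crank-Nicolson step (illustrative sketch).

    x    : current state, a list of floats
    phi  : potential Phi(x); target density is exp(-Phi) w.r.t. N(0, I)
    beta : step parameter in (0, 1]
    Returns (new_state, accepted).
    """
    scale = math.sqrt(1.0 - beta * beta)
    # The proposal leaves the N(0, I) reference measure invariant.
    proposal = [scale * xi + beta * rng.gauss(0.0, 1.0) for xi in x]
    # Acceptance ratio depends only on the potential, not on the reference.
    log_accept = phi(x) - phi(proposal)
    if rng.random() < math.exp(min(0.0, log_accept)):
        return proposal, True
    return x, False
```

Because the Gaussian reference cancels in the acceptance ratio, a constant potential gives acceptance probability one and the chain simply samples the reference measure.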
-
Online Smoothing for Diffusion Processes Observed with Noise
Authors:
Shouto Yonekura,
Alexandros Beskos
Abstract:
We introduce a methodology for online estimation of smoothing expectations for a class of additive functionals, in the context of a rich family of diffusion processes (that may include jumps) -- observed at discrete-time instances. We overcome the unavailability of the transition density of the underlying SDE by working on the augmented pathspace. The new method can be applied, for instance, to carry out online parameter inference for the designated class of models. Algorithms defined on the infinite-dimensional pathspace have been developed in recent years mainly in the context of MCMC techniques. There, the main benefit is the achievement of mesh-free mixing times for the practical time-discretised algorithm used on a PC. Our own methodology sets up the framework for infinite-dimensional online filtering -- an important positive practical consequence is the construction of estimates with a variance that does not increase with decreasing mesh-size. Besides regularity conditions, our method is, in principle, applicable under the weak assumption -- relative to restrictive conditions often required in the MCMC or filtering literature of methods defined on pathspace -- that the SDE covariance matrix is invertible.
Submitted 11 August, 2021; v1 submitted 27 March, 2020;
originally announced March 2020.
-
Manifold Markov chain Monte Carlo methods for Bayesian inference in diffusion models
Authors:
Matthew M. Graham,
Alexandre H. Thiery,
Alexandros Beskos
Abstract:
Bayesian inference for nonlinear diffusions, observed at discrete times, is a challenging task that has prompted the development of a number of algorithms, mainly within the computational statistics community. We propose a new direction, and accompanying methodology, borrowing ideas from statistical physics and computational chemistry, for inferring the posterior distribution of latent diffusion paths and model parameters, given observations of the process. Joint configurations of the underlying process noise and of parameters, mapping onto diffusion paths consistent with observations, form an implicitly defined manifold. Then, by making use of a constrained Hamiltonian Monte Carlo algorithm on the embedded manifold, we are able to perform computationally efficient inference for a class of discretely observed diffusion models. Critically, in contrast with other approaches proposed in the literature, our methodology is highly automated, requiring minimal user intervention and applying alike in a range of settings, including: elliptic or hypo-elliptic systems; observations with or without noise; linear or non-linear observation operators. Exploiting Markovianity, we propose a variant of the method with complexity that scales linearly in the resolution of path discretisation and the number of observation times. Python code reproducing the results is available at https://doi.org/10.5281/zenodo.5796148
Submitted 10 January, 2022; v1 submitted 6 December, 2019;
originally announced December 2019.
-
Monte Carlo Co-Ordinate Ascent Variational Inference
Authors:
Lifeng Ye,
Alexandros Beskos,
Maria De Iorio,
Jie Hao
Abstract:
In Variational Inference (VI), coordinate-ascent and gradient-based approaches are two major types of algorithms for approximating difficult-to-compute probability densities. In real-world implementations of complex models, Monte Carlo methods are widely used to estimate expectations in coordinate-ascent approaches and gradients in derivative-driven ones. We discuss a Monte Carlo Co-ordinate Ascent VI (MC-CAVI) algorithm that makes use of Markov chain Monte Carlo (MCMC) methods in the calculation of expectations required within Co-ordinate Ascent VI (CAVI). We show that, under regularity conditions, an MC-CAVI recursion will get arbitrarily close to a maximiser of the evidence lower bound (ELBO) with any given high probability. In numerical examples, the performance of the MC-CAVI algorithm is compared with that of MCMC and -- as a representative of derivative-based VI methods -- of Black Box VI (BBVI). We discuss and demonstrate MC-CAVI's suitability for models with hard constraints in simulated and real examples. We compare MC-CAVI's performance with that of MCMC in an important complex model used in Nuclear Magnetic Resonance (NMR) spectroscopy data analysis -- BBVI is nearly impossible to employ in this setting due to the hard constraints involved in the model.
Submitted 17 October, 2019; v1 submitted 9 May, 2019;
originally announced May 2019.
-
Bridging trees for posterior inference on Ancestral Recombination Graphs
Authors:
Kari Heine,
Alex Beskos,
Ajay Jasra,
David Balding,
Maria De Iorio
Abstract:
We present a new Markov chain Monte Carlo algorithm, implemented in software Arbores, for inferring the history of a sample of DNA sequences. Our principal innovation is a bridging procedure, previously applied only for simple stochastic processes, in which the local computations within a bridge can proceed independently of the rest of the DNA sequence, facilitating large-scale parallelisation.
Submitted 4 December, 2018;
originally announced December 2018.
-
Asymptotic Analysis of Model Selection Criteria for General Hidden Markov Models
Authors:
Shouto Yonekura,
Alexandros Beskos,
Sumeetpal S. Singh
Abstract:
The paper obtains analytical results for the asymptotic properties of Model Selection Criteria -- widely used in practice -- for a general family of hidden Markov models (HMMs), thereby substantially extending the related theory beyond typical i.i.d.-like model structures and filling in an important gap in the relevant literature. In particular, we look at the Bayesian and Akaike Information Criteria (BIC and AIC) and the model evidence. In the setting of nested classes of models, we prove that BIC and the evidence are strongly consistent for HMMs (under regularity conditions), whereas AIC is not weakly consistent. Numerical experiments support our theoretical results.
Submitted 30 March, 2020; v1 submitted 28 November, 2018;
originally announced November 2018.
-
A 4D-Var Method with Flow-Dependent Background Covariances for the Shallow-Water Equations
Authors:
Daniel Paulin,
Ajay Jasra,
Alexandros Beskos,
Dan Crisan
Abstract:
The 4D-Var method for filtering partially observed nonlinear chaotic dynamical systems consists of finding the maximum a-posteriori (MAP) estimator of the initial condition of the system given observations over a time window, and propagating it forward to the current time via the model dynamics. This method forms the basis of most currently operational weather forecasting systems. In practice the optimization becomes infeasible if the time window is too long due to the non-convexity of the cost function, the effect of model errors, and the limited precision of the ODE solvers. Hence the window has to be kept sufficiently short, and the observations in the previous windows can be taken into account via a Gaussian background (prior) distribution. The choice of the background covariance matrix is an important question that has received much attention in the literature. In this paper, we define the background covariances in a principled manner, based on observations in the previous $b$ assimilation windows, for a parameter $b\ge 1$. The method is at most $b$ times more computationally expensive than using fixed background covariances, requires little tuning, and greatly improves the accuracy of 4D-Var. As a concrete example, we focus on the shallow-water equations. The proposed method is compared against state-of-the-art approaches in data assimilation and is shown to perform favourably on simulated data. We also illustrate our approach on data from the recent tsunami of 2011 in Fukushima, Japan.
Submitted 17 January, 2021; v1 submitted 31 October, 2017;
originally announced October 2017.
-
Particle Filtering for Stochastic Navier-Stokes Signal Observed with Linear Additive Noise
Authors:
Francesc Pons Llopis,
Nikolas Kantas,
Alexandros Beskos,
Ajay Jasra
Abstract:
We consider a non-linear filtering problem, whereby the signal obeys the stochastic Navier-Stokes equations and is observed through a linear mapping with additive noise. The setup is relevant to data assimilation for numerical weather prediction and climate modelling, where similar models are used for unknown ocean or wind velocities. We present a particle filtering methodology that uses likelihood informed importance proposals, adaptive tempering, and a small number of appropriate Markov Chain Monte Carlo steps. We provide a detailed design for each of these steps and show in our numerical examples that they are all crucial in terms of achieving good performance and efficiency.
Submitted 9 April, 2018; v1 submitted 12 October, 2017;
originally announced October 2017.
-
Efficient sequential Monte Carlo algorithms for integrated population models
Authors:
Axel Finke,
Ruth King,
Alexandros Beskos,
Petros Dellaportas
Abstract:
State-space models are commonly used to describe different forms of ecological data. We consider the case of count data with observation errors. For such data the system process is typically multi-dimensional consisting of coupled Markov processes, where each component corresponds to a different characterisation of the population, such as age group, gender or breeding status. The associated system process equations describe the biological mechanisms under which the system evolves over time. However, there is often limited information in the count data alone to sensibly estimate demographic parameters of interest, so these are often combined with additional ecological observations leading to an integrated data analysis. Unfortunately, fitting these models to the data can be challenging, especially if the state-space model for the count data is non-linear or non-Gaussian. We propose an efficient particle Markov chain Monte Carlo algorithm to estimate the demographic parameters without the need for resorting to linear or Gaussian approximations. In particular, we exploit the integrated model structure to enhance the efficiency of the algorithm. We then incorporate the algorithm into a sequential Monte Carlo sampler in order to perform model comparison with regards to the dependence structure of the demographic parameters. Finally, we demonstrate the applicability and computational efficiency of our algorithms on two real datasets.
Submitted 14 August, 2017;
originally announced August 2017.
-
Multilevel Sequential Monte Carlo with Dimension-Independent Likelihood-Informed Proposals
Authors:
Alexandros Beskos,
Ajay Jasra,
Kody Law,
Youssef Marzouk,
Yan Zhou
Abstract:
In this article we develop a new sequential Monte Carlo (SMC) method for multilevel (ML) Monte Carlo estimation. In particular, the method can be used to estimate expectations with respect to a target probability distribution over an infinite-dimensional and non-compact space as given, for example, by a Bayesian inverse problem with Gaussian random field prior. Under suitable assumptions the MLSMC method has the optimal $O(ε^{-2})$ bound on the cost to obtain a mean-square error of $O(ε^2)$. The algorithm is accelerated by dimension-independent likelihood-informed (DILI) proposals designed for Gaussian priors, leveraging a novel variation which uses empirical sample covariance information in lieu of Hessian information, hence eliminating the requirement for gradient evaluations. The efficiency of the algorithm is illustrated on two examples: inversion of noisy pressure measurements in a PDE model of Darcy flow to recover the posterior distribution of the permeability field, and inversion of noisy measurements of the solution of an SDE to recover the posterior path measure.
Submitted 14 March, 2017;
originally announced March 2017.
-
Optimization Based Methods for Partially Observed Chaotic Systems
Authors:
Daniel Paulin,
Ajay Jasra,
Dan Crisan,
Alexandros Beskos
Abstract:
In this paper we consider filtering and smoothing of partially observed chaotic dynamical systems that are discretely observed, with additive Gaussian noise in the observation. These models are found in a wide variety of real applications and include the Lorenz '96 model. In the context of a fixed observation interval $T$, observation time step $h$ and Gaussian observation variance $σ_Z^2$, we show under assumptions that the filter and smoother are well approximated by a Gaussian with high probability when $h$ and $σ^2_Z h$ are sufficiently small. Based on this result we show that the Maximum-a-posteriori (MAP) estimators are asymptotically optimal in mean square error as $σ^2_Z h$ tends to $0$. Given these results, we provide a batch algorithm for the smoother and filter, based on Newton's method, to obtain the MAP. In particular, we show that if the initial point is close enough to the MAP, then Newton's method converges to it at a fast rate. We also provide a method for computing such an initial point. These results contribute to the theoretical understanding of the widely used 4D-Var data assimilation method. Our approach is illustrated numerically on the Lorenz '96 model with state vector up to 1 million dimensions, with code running in the order of minutes. To our knowledge the results in this paper are the first of their type for this class of models.
Submitted 25 February, 2018; v1 submitted 8 February, 2017;
originally announced February 2017.
-
Geometric MCMC for Infinite-Dimensional Inverse Problems
Authors:
Alexandros Beskos,
Mark Girolami,
Shiwei Lan,
Patrick E. Farrell,
Andrew M. Stuart
Abstract:
Bayesian inverse problems often involve sampling posterior distributions on infinite-dimensional function spaces. Traditional Markov chain Monte Carlo (MCMC) algorithms are characterized by deteriorating mixing times upon mesh-refinement, when the finite-dimensional approximations become more accurate. Such methods are typically forced to reduce step-sizes as the discretization gets finer, and thus are expensive as a function of dimension. Recently, a new class of MCMC methods with mesh-independent convergence times has emerged. However, few of them take into account the geometry of the posterior informed by the data. At the same time, recently developed geometric MCMC algorithms have been found to be powerful in exploring complicated distributions that deviate significantly from elliptic Gaussian laws, but are in general computationally intractable for models defined in infinite dimensions. In this work, we combine geometric methods on a finite-dimensional subspace with mesh-independent infinite-dimensional approaches. Our objective is to speed up MCMC mixing times, without significantly increasing the computational cost per step (for instance, in comparison with the vanilla preconditioned Crank-Nicolson (pCN) method). This is achieved by using ideas from geometric MCMC to probe the complex structure of an intrinsic finite-dimensional subspace where most data information concentrates, while retaining robust mixing times as the dimension grows by using pCN-like methods in the complementary subspace. The resulting algorithms are demonstrated in the context of three challenging inverse problems arising in subsurface flow, heat conduction and incompressible flow control. The algorithms exhibit up to two orders of magnitude improvement in sampling efficiency when compared with the pCN method.
Submitted 13 January, 2017; v1 submitted 20 June, 2016;
originally announced June 2016.
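For orientation, the vanilla preconditioned Crank-Nicolson (pCN) method that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration of standard pCN only, not the geometric algorithms of the paper. The function name `pcn_sample`, the argument names, the diagonal Gaussian prior and the step-size `beta` are all assumptions made for the sketch.

```python
import math
import random


def pcn_sample(phi, prior_sd, n_steps, beta=0.2, seed=0):
    """Vanilla pCN for a target proportional to exp(-phi(u)) * N(0, diag(prior_sd^2)).

    phi      : negative log-likelihood (hypothetical user-supplied function)
    prior_sd : per-coordinate prior standard deviations (diagonal prior assumed)
    beta     : pCN step-size in (0, 1]
    """
    rng = random.Random(seed)
    u = [0.0] * len(prior_sd)          # start at the prior mean
    phi_u = phi(u)
    shrink = math.sqrt(1.0 - beta * beta)
    chain, accepted = [], 0
    for _ in range(n_steps):
        # Proposal v = sqrt(1 - beta^2) * u + beta * xi, with xi drawn from the prior.
        v = [shrink * ui + beta * sd * rng.gauss(0.0, 1.0)
             for ui, sd in zip(u, prior_sd)]
        phi_v = phi(v)
        # The proposal is prior-reversible, so the acceptance ratio involves
        # only the likelihood: accept with probability min(1, exp(phi_u - phi_v)).
        if rng.random() < math.exp(min(0.0, phi_u - phi_v)):
            u, phi_u = v, phi_v
            accepted += 1
        chain.append(list(u))
    return chain, accepted / n_steps
```

Because the acceptance probability does not depend on the prior density, the same rule applies unchanged as the discretisation is refined, which is the mesh-independence property the abstract refers to.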
-
Asymptotic Analysis of the Random-Walk Metropolis Algorithm on Ridged Densities
Authors:
Alexandros Beskos,
Gareth Roberts,
Alexandre Thiery,
Natesh Pillai
Abstract:
In this paper we study the asymptotic behavior of the Random-Walk Metropolis algorithm on probability densities with two different `scales', where most of the probability mass is distributed along certain key directions with the `orthogonal' directions containing relatively less mass. Such classes of probability measures arise in various applied contexts including Bayesian inverse problems where the posterior measure concentrates on a sub-manifold when the noise variance goes to zero. When the target measure concentrates on a linear sub-manifold, we derive analytically a diffusion limit for the Random-Walk Metropolis Markov chain as the scale parameter goes to zero. In contrast to the existing works on scaling limits, our limiting Stochastic Differential Equation does not in general have a constant diffusion coefficient. Our results show that in some cases, the usual practice of adapting the step-size to control the acceptance probability might be sub-optimal as the optimal acceptance probability is zero (in the limit).
Submitted 9 October, 2015;
originally announced October 2015.
-
Bayesian Inference for Duplication-Mutation with Complementarity Network Models
Authors:
Ajay Jasra,
Adam Persing,
Alexandros Beskos,
Kari Heine,
Maria De Iorio
Abstract:
We observe an undirected graph $G$ without multiple edges and self-loops, which is to represent a protein-protein interaction (PPI) network. We assume that $G$ evolved under the duplication-mutation with complementarity (DMC) model from a seed graph, $G_0$, and we also observe the binary forest $Γ$ that represents the duplication history of $G$. A posterior density for the DMC model parameters is established, and we outline a sampling strategy by which one can perform Bayesian inference; that sampling strategy employs a particle marginal Metropolis-Hastings (PMMH) algorithm. We test our methodology on numerical examples to demonstrate a high accuracy and precision in the inference of the DMC model's mutation and homodimerization parameters.
Submitted 7 April, 2015;
originally announced April 2015.
-
Multilevel Sequential Monte Carlo Samplers
Authors:
Alexandros Beskos,
Ajay Jasra,
Kody Law,
Raul Tempone,
Yan Zhou
Abstract:
In this article we consider the approximation of expectations w.r.t. probability distributions associated to the solution of partial differential equations (PDEs); this scenario appears routinely in Bayesian inverse problems. In practice, one often has to solve the associated PDE numerically, using, for instance, finite element methods, leading to a discretisation bias, with the step-size level $h_L$. In addition, the expectation cannot be computed analytically and one often resorts to Monte Carlo methods. In the context of this problem, it is known that the introduction of the multilevel Monte Carlo (MLMC) method can reduce the amount of computational effort to estimate expectations, for a given level of error. This is achieved via a telescoping identity associated to a Monte Carlo approximation of a sequence of probability distributions with discretisation levels $\infty>h_0>h_1\cdots>h_L$. In many practical problems of interest, one cannot achieve an i.i.d. sampling of the associated sequence of probability distributions. A sequential Monte Carlo (SMC) version of the MLMC method is introduced to deal with this problem. It is shown that under appropriate assumptions, the attractive property of a reduction of the amount of computational effort to estimate expectations, for a given level of error, can be maintained within the SMC context.
Submitted 24 March, 2015;
originally announced March 2015.
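The telescoping identity mentioned in the abstract can be written out in its usual MLMC form. The notation here is assumed for illustration and is not taken from the paper: $η_l$ denotes the probability distribution at discretisation level $h_l$, and $φ$ a test function whose expectation is sought.

```latex
\mathbb{E}_{\eta_L}[\varphi]
  = \mathbb{E}_{\eta_0}[\varphi]
  + \sum_{l=1}^{L} \Big( \mathbb{E}_{\eta_l}[\varphi] - \mathbb{E}_{\eta_{l-1}}[\varphi] \Big)
```

Each term on the right-hand side is estimated separately. The increments typically shrink as $l$ grows, so fewer samples are needed at the finer, more expensive levels.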
-
Sequential Monte Carlo Methods for Bayesian Elliptic Inverse Problems
Authors:
Alex Beskos,
Ajay Jasra,
Ege Muzaffer,
Andrew Stuart
Abstract:
In this article we consider a Bayesian inverse problem associated to elliptic partial differential equations (PDEs) in two and three dimensions. This class of inverse problems is important in applications such as hydrology, but the complexity of the link function between unknown field and measurements can make it difficult to draw inference from the associated posterior. We prove that for this inverse problem a basic SMC method has a Monte Carlo rate of convergence with constants which are independent of the dimension of the discretization of the problem; indeed convergence of the SMC method is established in a function space setting. We also develop an enhancement of the sequential Monte Carlo (SMC) methods for inverse problems which were introduced in \cite{kantas}; the enhancement is designed to deal with the additional complexity of this elliptic inverse problem. The efficacy of the methodology, and its desirable theoretical properties, are demonstrated on numerical examples in both two and three dimensions.
Submitted 14 December, 2014;
originally announced December 2014.
-
A Stable Particle Filter in High-Dimensions
Authors:
Alex Beskos,
Dan Crisan,
Ajay Jasra,
Kengo Kamatani,
Yan Zhou
Abstract:
We consider the numerical approximation of the filtering problem in high dimensions, that is, when the hidden state lies in $\mathbb{R}^d$ with $d$ large. For low dimensional problems, one of the most popular numerical procedures for consistent inference is the class of approximations termed particle filters or sequential Monte Carlo methods. However, in high dimensions, standard particle filters (e.g. the bootstrap particle filter) can have a cost that is exponential in $d$ for the algorithm to be stable in an appropriate sense. We develop a new particle filter, called the \emph{space-time particle filter}, for a specific family of state-space models in discrete time. This new class of particle filters provides consistent Monte Carlo estimates for any fixed $d$, as do standard particle filters. Moreover, we expect that the state-space particle filter will scale much better with $d$ than the standard filter. We illustrate this analytically for a model of a simple i.i.d. structure and one of a Markovian structure in the $d$-dimensional space-direction, where we show that the algorithm exhibits certain stability properties as $d$ increases at a cost $\mathcal{O}(nNd^2)$, where $n$ is the time parameter and $N$ is the number of Monte Carlo samples, that are fixed and independent of $d$. Similar results are expected to hold under a more general structure than the i.i.d. one, independently of the dimension. Our theoretical results are also supported by numerical simulations on practical models of complex structures. The results suggest that it is indeed possible to tackle some high dimensional filtering problems using the space-time particle filter that standard particle filters cannot handle.
Submitted 10 December, 2014;
originally announced December 2014.
-
A simulation approach for change-points on phylogenetic trees
Authors:
Adam Persing,
Ajay Jasra,
Alexandros Beskos,
David Balding,
Maria De Iorio
Abstract:
We observe $n$ sequences at each of $m$ sites, and assume that they have evolved from an ancestral sequence that forms the root of a binary tree of known topology and branch lengths, but the sequence states at internal nodes are unknown. The topology of the tree and branch lengths are the same for all sites, but the parameters of the evolutionary model can vary over sites. We assume a piecewise constant model for these parameters, with an unknown number of change-points and hence a trans-dimensional parameter space over which we seek to perform Bayesian inference. We propose two novel ideas to deal with the computational challenges of such inference. Firstly, we approximate the model based on the time machine principle: the top nodes of the binary tree (near the root) are replaced by an approximation of the true distribution; as more nodes are removed from the top of the tree, the cost of computing the likelihood is reduced linearly in $n$. The approach introduces a bias, which we investigate empirically. Secondly, we develop a particle marginal Metropolis-Hastings (PMMH) algorithm, that employs a sequential Monte Carlo (SMC) sampler and can use the first idea. Our time-machine PMMH algorithm copes well with one of the bottle-necks of standard computational algorithms: the trans-dimensional nature of the posterior distribution. The algorithm is implemented on simulated and real data examples, and we empirically demonstrate its potential to outperform competing methods based on approximate Bayesian computation (ABC) techniques.
Submitted 27 August, 2014;
originally announced August 2014.
-
A Stable Manifold MCMC Method for High Dimensions
Authors:
Alexandros Beskos
Abstract:
We combine two important recent advancements of MCMC algorithms: first, methods utilizing the intrinsic manifold structure of the parameter space; then, algorithms effective for targets in infinite-dimensions with the critical property that their mixing time is robust to mesh refinement.
Submitted 30 March, 2014;
originally announced March 2014.
-
Sequential Monte Carlo Methods for High-Dimensional Inverse Problems: A case study for the Navier-Stokes equations
Authors:
Nikolas Kantas,
Alexandros Beskos,
Ajay Jasra
Abstract:
We consider the inverse problem of estimating the initial condition of a partial differential equation, which is only observed through noisy measurements at discrete time intervals. In particular, we focus on the case where Eulerian measurements are obtained from the time and space evolving vector field, whose evolution obeys the two-dimensional Navier-Stokes equations defined on a torus. This context is particularly relevant to the area of numerical weather forecasting and data assimilation. We will adopt a Bayesian formulation resulting from a particular regularization that ensures the problem is well posed. In the context of Monte Carlo based inference, it is a challenging task to obtain samples from the resulting high dimensional posterior on the initial condition. In real data assimilation applications it is common for computational methods to invoke the use of heuristics and Gaussian approximations. The resulting inferences are biased and not well-justified in the presence of non-linear dynamics and observations. On the other hand, Monte Carlo methods can be used to assimilate data in a principled manner, but are often perceived as inefficient in this context due to the high-dimensionality of the problem. In this work we will propose a generic Sequential Monte Carlo (SMC) sampling approach for high dimensional inverse problems that overcomes these difficulties. The method builds upon Markov chain Monte Carlo (MCMC) techniques, which are currently considered as benchmarks for evaluating data assimilation algorithms used in practice. In our numerical examples, the proposed SMC approach achieves the same accuracy as MCMC but in a much more efficient manner.
Submitted 23 July, 2013;
originally announced July 2013.
-
Bayesian Inference for partially observed SDEs Driven by Fractional Brownian Motion
Authors:
Alexandros Beskos,
Joseph Dureau,
Konstantinos Kalogeropoulos
Abstract:
We consider continuous-time diffusion models driven by fractional Brownian motion. Observations are assumed to possess a non-trivial likelihood given the latent path. Due to the non-Markovianity and high-dimensionality of the latent paths, estimating posterior expectations is a computationally challenging undertaking. We present a reparameterization framework based on the Davies and Harte method for sampling stationary Gaussian processes and use this framework to construct a Markov chain Monte Carlo algorithm that allows computationally efficient Bayesian inference. The Markov chain Monte Carlo algorithm is based on a version of hybrid Monte Carlo that delivers increased efficiency when applied on the high-dimensional latent variables arising in this context. We specify the methodology on a stochastic volatility model allowing for memory in the volatility increments through a fractional specification. The methodology is illustrated on simulated data and on the S&P500/VIX time series and is shown to be effective. Contrary to a long range dependence attribute of such models often assumed in the literature, with Hurst parameter larger than 1/2, the posterior distribution favours values smaller than 1/2, pointing towards medium range dependence.
Submitted 24 March, 2015; v1 submitted 30 June, 2013;
originally announced July 2013.
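
The abstract above builds on the Davies and Harte method for sampling stationary Gaussian processes. As a rough illustration only, the following Python sketch shows the textbook circulant-embedding construction for fractional Gaussian noise, the increment process of fractional Brownian motion. It is not the authors' reparameterization or code; the function names `fgn_autocovariance`, `circulant_eigenvalues` and `sample_fgn` are hypothetical, and NumPy is assumed.

```python
import numpy as np


def fgn_autocovariance(n, hurst):
    """Autocovariance of unit-variance fractional Gaussian noise at lags 0..n."""
    k = np.arange(n + 1, dtype=float)
    two_h = 2.0 * hurst
    return 0.5 * (np.abs(k + 1) ** two_h - 2.0 * k ** two_h + np.abs(k - 1) ** two_h)


def circulant_eigenvalues(n, hurst):
    """Eigenvalues of the 2n x 2n circulant matrix embedding the covariance."""
    gamma = fgn_autocovariance(n, hurst)
    # First row: lags 0, 1, ..., n, n-1, ..., 1 (length 2n).
    first_row = np.concatenate([gamma, gamma[-2:0:-1]])
    return np.fft.fft(first_row).real


def sample_fgn(n, hurst, rng):
    """Draw n steps of fractional Gaussian noise from 4n standard normals."""
    lam = circulant_eigenvalues(n, hurst)
    if lam.min() < -1e-10:
        raise ValueError("circulant embedding is not non-negative definite")
    lam = np.clip(lam, 0.0, None)
    m = lam.size  # m = 2n
    # Complex white noise: independent N(0, 1) real and imaginary parts.
    z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    w = np.fft.fft(np.sqrt(lam / m) * z)
    # The real part of the first n entries has the target covariance.
    return w.real[:n]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    increments = sample_fgn(1024, hurst=0.3, rng=rng)
    # Unit-spaced fractional Brownian path obtained by summing the increments.
    path = np.cumsum(increments)
    print(path[:5])
```

In this sketch the path is a deterministic linear transform of independent standard normals, which is the kind of structure a reparameterized sampler can exploit; how the paper uses it is not specified in the abstract.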
-
On the Convergence of Adaptive Sequential Monte Carlo Methods
Authors:
Alexandros Beskos,
Ajay Jasra,
Nikolas Kantas,
Alexandre Thiery
Abstract:
In several implementations of Sequential Monte Carlo (SMC) methods it is natural, and important in terms of algorithmic efficiency, to exploit the information of the history of the samples to optimally tune their subsequent propagations. In this article we provide a carefully formulated asymptotic theory for a class of such \emph{adaptive} SMC methods. The theoretical framework developed here will cover, under assumptions, several commonly used SMC algorithms. There are only limited results about the theoretical underpinning of such adaptive methods: we will bridge this gap by providing a weak law of large numbers (WLLN) and a central limit theorem (CLT) for some of these algorithms. The latter seems to be the first result of its kind in the literature and provides a formal justification of algorithms used in many real-data contexts. We establish that for a general class of adaptive SMC algorithms the asymptotic variance of the estimators from the adaptive SMC method is \emph{identical} to that of a so-called `perfect' SMC algorithm which uses ideal proposal kernels. Our results are supported by application on a complex high-dimensional posterior distribution associated with the Navier-Stokes model, where adapting high-dimensional parameters of the proposal kernels is critical for the efficiency of the algorithm.
Submitted 5 February, 2014; v1 submitted 27 June, 2013;
originally announced June 2013.
-
Advanced MCMC Methods for Sampling on Diffusion Pathspace
Authors:
Alexandros Beskos,
Konstantinos Kalogeropoulos,
Erik Pazos
Abstract:
The need to calibrate increasingly complex statistical models requires a persistent effort for further advances on available, computationally intensive Monte Carlo methods. We study here an advanced version of familiar Markov Chain Monte Carlo (MCMC) algorithms that sample from target distributions defined as change of measures from Gaussian laws on general Hilbert spaces. Such a model structure arises in several contexts: we focus here on the important class of statistical models driven by diffusion paths whence the Wiener process constitutes the reference Gaussian law. Particular emphasis is placed on advanced Hybrid Monte-Carlo (HMC), which makes large, derivative-driven steps in the state space (in contrast with local-move Random-walk-type algorithms), with analytical and experimental results. We illustrate its computational advantages in various diffusion processes and observation regimes; examples include stochastic volatility and latent survival models. In contrast with their standard MCMC counterparts, the advanced versions have mesh-free mixing times, as these will not deteriorate upon refinement of the approximation of the inherently infinite-dimensional diffusion paths by finite-dimensional ones used in practice when applying the algorithms on a computer.
Submitted 7 March, 2013; v1 submitted 28 March, 2012;
originally announced March 2012.
-
Error Bounds and Normalizing Constants for Sequential Monte Carlo in High Dimensions
Authors:
Alexandros Beskos,
Dan Crisan,
Ajay Jasra,
Nick Whiteley
Abstract:
In a recent paper, Beskos et al (2011), the Sequential Monte Carlo (SMC) sampler introduced in Del Moral et al (2006), Neal (2001) has been shown to be asymptotically stable in the dimension of the state space $d$ at a cost that is only polynomial in $d$, when $N$, the number of Monte Carlo samples, is fixed. More precisely, it has been established that the effective sample size (ESS) of the ensuing (approximate) sample and the Monte Carlo error of fixed dimensional marginals will converge as $d$ grows, with a computational cost of $\mathcal{O}(Nd^2)$. In the present work, further results on SMC methods in high dimensions are provided as $d\to\infty$ and with $N$ fixed. We deduce an explicit bound on the Monte-Carlo error for estimates derived using the SMC sampler and the exact asymptotic relative $\mathbb{L}_2$-error of the estimate of the normalizing constant. We also establish marginal propagation of chaos properties of the algorithm. The accuracy in high-dimensions of some approximate SMC-based filtering schemes is also discussed.
Submitted 7 December, 2011;
originally announced December 2011.
-
ε-Strong simulation of the Brownian path
Authors:
Alexandros Beskos,
Stefano Peluchetti,
Gareth Roberts
Abstract:
We present an iterative sampling method which delivers upper and lower bounding processes for the Brownian path. We develop such processes with particular emphasis on being able to unbiasedly simulate them on a personal computer. The dominating processes converge almost surely in the supremum and $L_1$ norms. In particular, the rate of convergence in $L_1$ is of the order $\mathcal {O}(\mathcal{K}^{-1/2})$, $\mathcal{K}$ denoting the computing cost. The a.s. enfolding of the Brownian path can be exploited in Monte Carlo applications involving Brownian paths whence our algorithm (termed the $\varepsilon$-strong algorithm) can deliver unbiased Monte Carlo estimators over path expectations, overcoming discretisation errors characterising standard approaches. We will show analytical results from applications of the $\varepsilon$-strong algorithm for estimating expectations arising in option pricing. We will also illustrate that individual steps of the algorithm can be of separate interest, giving new simulation methods for interesting Brownian distributions.
Submitted 26 November, 2012; v1 submitted 1 October, 2011;
originally announced October 2011.
-
On the Stability of Sequential Monte Carlo Methods in High Dimensions
Authors:
Alexandros Beskos,
Dan Crisan,
Ajay Jasra
Abstract:
We investigate the stability of a Sequential Monte Carlo (SMC) method applied to the problem of sampling from a target distribution on $\mathbb{R}^d$ for large $d$. It is well known that using a single importance sampling step one produces an approximation for the target that deteriorates as the dimension $d$ increases, unless the number of Monte Carlo samples $N$ increases at an exponential rate in $d$. We show that this degeneracy can be avoided by introducing a sequence of artificial targets, starting from a `simple' density and moving to the one of interest, using an SMC method to sample from the sequence. Using this class of SMC methods with a fixed number of samples, one can produce an approximation for which the effective sample size (ESS) converges to a random variable $\varepsilon_N$ as $d\rightarrow\infty$ with $1<\varepsilon_{N}<N$. The convergence is achieved with a computational cost proportional to $Nd^2$. If $\varepsilon_N\ll N$, we can raise its value by introducing a number of resampling steps, say $m$ (where $m$ is independent of $d$). In this case, ESS converges to a random variable $\varepsilon_{N,m}$ as $d\rightarrow\infty$ and $\lim_{m\to\infty}\varepsilon_{N,m}=N$. Also, we show that the Monte Carlo error for estimating a fixed dimensional marginal expectation is of order $\frac{1}{\sqrt{N}}$ uniformly in $d$. The results imply that, in high dimensions, SMC algorithms can efficiently control the variability of the importance sampling weights and estimate fixed dimensional marginals at a cost which is less than exponential in $d$ and indicate that, in high dimensions, resampling leads to a reduction in the Monte Carlo error and increase in the ESS.
Submitted 18 April, 2012; v1 submitted 21 March, 2011;
originally announced March 2011.
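
The degeneracy of single-step importance sampling described in the abstract above can be illustrated numerically. The following Python sketch is a toy example, not the paper's experiment: it assumes a standard Gaussian target on $\mathbb{R}^d$ and a slightly over-dispersed Gaussian proposal (the scale 1.2 is an arbitrary choice), and computes the usual effective sample size $(\sum_i w_i)^2 / \sum_i w_i^2$. The function names are hypothetical and NumPy is assumed.

```python
import numpy as np


def effective_sample_size(log_weights):
    """ESS = (sum w)^2 / sum w^2 from unnormalised log-weights."""
    log_w = np.asarray(log_weights, dtype=float)
    # Subtract the maximum before exponentiating to avoid overflow.
    w = np.exp(log_w - log_w.max())
    return w.sum() ** 2 / np.sum(w ** 2)


def single_step_ess(d, n_samples, proposal_scale=1.2, seed=0):
    """ESS of one importance-sampling step targeting N(0, I_d)
    with proposal N(0, proposal_scale^2 I_d) (toy assumption)."""
    rng = np.random.default_rng(seed)
    x = proposal_scale * rng.standard_normal((n_samples, d))
    sq_norm = np.sum(x ** 2, axis=1)
    # log target minus log proposal, dropping terms constant in x.
    log_w = -0.5 * sq_norm + 0.5 * sq_norm / proposal_scale ** 2
    return effective_sample_size(log_w)


if __name__ == "__main__":
    for d in (1, 10, 100):
        print(d, single_step_ess(d, n_samples=1000))
```

With a fixed number of samples, the printed ESS should shrink as $d$ grows, which is the behaviour that the sequence of artificial targets in the abstract is designed to avoid.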
-
Markov chain Monte Carlo for exact inference for diffusions
Authors:
Giorgos Sermaidis,
Omiros Papaspiliopoulos,
Gareth O. Roberts,
Alex Beskos,
Paul Fearnhead
Abstract:
We develop exact Markov chain Monte Carlo methods for discretely-sampled, directly and indirectly observed diffusions. The qualification "exact" refers to the fact that the invariant and limiting distribution of the Markov chains is the posterior distribution of the parameters free of any discretisation error. The class of processes to which our methods directly apply are those which can be simulated using the most general to date exact simulation algorithm. The article introduces various methods to boost the performance of the basic scheme, including reparametrisations and auxiliary Poisson sampling. We contrast both theoretically and empirically how this new approach compares to irreducible high frequency imputation, which is the state-of-the-art alternative for the class of processes we consider, and we uncover intriguing connections. All methods discussed in the article are tested on typical examples.
Submitted 3 May, 2012; v1 submitted 27 February, 2011;
originally announced February 2011.