Search | arXiv e-print repository

A Multiscale Perspective on Maximum Marginal Likelihood Estimation

Authors: O. Deniz Akyildiz, Michela Ottobre, Iain Souttar

Abstract: In this paper, we provide a multiscale perspective on the problem of maximum marginal likelihood estimation. We consider and analyse a diffusion-based maximum marginal likelihood estimation scheme using ideas from multiscale dynamics. Our perspective is based on stochastic averaging; we make an explicit connection between ideas in applied probability and parameter inference in computational statistics. In particular, we consider a general class of coupled Langevin diffusions for joint inference of latent variables and parameters in statistical models, where the latent variables are sampled from a fast Langevin process (which acts as a sampler), and the parameters are updated using a slow Langevin process (which acts as an optimiser). We show that the resulting system of stochastic differential equations (SDEs) can be viewed as a two-time scale system. To demonstrate the utility of such a perspective, we show that the averaged parameter dynamics obtained in the limit of scale separation can be used to estimate the optimal parameter, within the strongly convex setting. We do this by using recent uniform-in-time non-asymptotic averaging bounds. Finally, we conclude by showing that the slow-fast algorithm we consider here, termed Slow-Fast Langevin Algorithm, performs on par with state-of-the-art methods on a variety of examples. We believe that the stochastic averaging approach we provide in this paper enables us to look at these algorithms from a fresh angle, as well as unlocking the path to develop and analyse new methods using well-established averaging principles.
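The coupled slow-fast structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the toy model (latent x ~ N(theta, 1), observation y | x ~ N(x, 1)), the Euler-Maruyama discretisation, the function name `slow_fast_langevin_sketch`, and the parameters `eps`, `dt` and `slow_noise` are all assumptions made for illustration. In this toy model the maximum marginal likelihood estimate is theta = y, which makes the behaviour easy to check.

```python
import numpy as np


def slow_fast_langevin_sketch(y=1.5, eps=0.1, dt=0.01, n_steps=20000,
                              slow_noise=0.01, seed=0):
    """Toy two-time-scale Langevin scheme (illustrative assumptions only).

    Model: x ~ N(theta, 1), y | x ~ N(x, 1), so log p(x, y | theta) is
    -0.5 * (x - theta)**2 - 0.5 * (y - x)**2 up to a constant.
    The latent x runs on the fast time scale dt / eps (sampler); the
    parameter theta runs on the slow time scale dt (optimiser).
    """
    rng = np.random.default_rng(seed)
    theta, x = 0.0, 0.0
    fast_dt = dt / eps  # effective step size of the fast process
    thetas = np.empty(n_steps)
    for n in range(n_steps):
        # Gradients of the joint log-density at the current state.
        grad_x = -(x - theta) + (y - x)
        grad_theta = x - theta
        # Fast Langevin step for the latent variable.
        x_new = x + fast_dt * grad_x + np.sqrt(2.0 * fast_dt) * rng.standard_normal()
        # Slow Langevin step for the parameter; slow_noise is a hypothetical
        # small diffusion coefficient, not a value taken from the paper.
        theta = (theta + dt * grad_theta
                 + np.sqrt(2.0 * slow_noise * dt) * rng.standard_normal())
        x = x_new
        thetas[n] = theta
    return thetas


if __name__ == "__main__":
    path = slow_fast_langevin_sketch()
    # Time-average the second half of the path as a parameter estimate.
    print("estimate of theta:", path[len(path) // 2:].mean())
```

Averaging the second half of the parameter path gives an estimate that should sit near y, in line with the averaged-dynamics picture in the abstract; smaller `eps` increases the scale separation but requires a smaller `dt` to keep the fast step stable.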

Submitted 10 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

Comments: 31 pages, 3 figures

MSC Class: 62-08 (Primary); 60J60; 62C12; 62F15; 65C30; 65D15 (Secondary)

arXiv:1903.06960 [pdf, other]

Reversible and non-reversible Markov Chain Monte Carlo algorithms for reservoir simulation problems

Authors: P. Dobson, I. Fursov, G. Lord, M. Ottobre

Abstract: We compare numerically the performance of reversible and non-reversible Markov Chain Monte Carlo algorithms for high dimensional oil reservoir problems; because of the nature of the problem at hand, the target measures from which we sample are supported on bounded domains. We compare two strategies to deal with bounded domains, namely reflecting proposals off the boundary and rejecting them when they fall outside of the domain. We observe that for complex high dimensional problems reflection mechanisms outperform rejection approaches and that the advantage of introducing non-reversibility in the Markov Chain employed for sampling is more and more visible as the dimension of the parameter space increases.

Submitted 16 March, 2019; originally announced March 2019.

arXiv:1702.01777 [pdf, other]

Optimal Scaling of the MALA algorithm with Irreversible Proposals for Gaussian targets

Authors: Michela Ottobre, Natesh S. Pillai, Konstantinos Spiliopoulos

Abstract: It is well known in many settings that reversible Langevin diffusions in confining potentials converge to equilibrium exponentially fast. Adding irreversible perturbations to the drift of a Langevin diffusion that maintain the same invariant measure accelerates its convergence to stationarity. Many existing works thus advocate the use of such non-reversible dynamics for sampling. When implementing Markov Chain Monte Carlo algorithms (MCMC) using time discretisations of such Stochastic Differential Equations (SDEs), one can append the discretization with the usual Metropolis-Hastings accept-reject step and this is often done in practice because the accept-reject step eliminates bias. On the other hand, such a step makes the resulting chain reversible. It is not known whether adding the accept-reject step preserves the faster mixing properties of the non-reversible dynamics. In this paper, we address this gap between theory and practice by analyzing the optimal scaling of MCMC algorithms constructed from proposal moves that are time-step Euler discretisations of an irreversible SDE, for high dimensional Gaussian target measures. We call the resulting algorithm the ipMALA, in comparison to the classical MALA algorithm (here ip is for irreversible proposal). In order to quantify how the cost of the algorithm scales with the dimension $N$, we prove invariance principles for the appropriately rescaled chain. In contrast to the usual MALA algorithm, we show that there could be two regimes asymptotically: (i) a diffusive regime, as in the MALA algorithm and (ii) a "fluid" regime where the limit is an ordinary differential equation. We provide concrete examples where the limit is a diffusion, as in the standard MALA, but with provably higher limiting acceptance probabilities. Numerical results are also given corroborating the theory.
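As a rough illustration of the construction this abstract describes (an Euler step of an irreversible Langevin SDE used as a proposal, followed by a Metropolis-Hastings accept-reject step), here is a minimal sketch for a standard Gaussian target. It is not the paper's algorithm or scaling: the function name `ip_mala_sketch`, the choice of skew-symmetric matrix `J`, and the parameters `step` and `gamma` are assumptions made for illustration. The drift -(I + J)x leaves N(0, I) invariant because J is skew-symmetric.

```python
import numpy as np


def ip_mala_sketch(n_steps=5000, dim=2, step=0.2, gamma=0.5, seed=0):
    """Metropolis-adjusted Euler proposals from an irreversible Langevin SDE.

    Target: standard Gaussian N(0, I) in `dim` dimensions.
    SDE:    dX = -(I + J) X dt + sqrt(2) dW, with J skew-symmetric.
    All parameter values are hypothetical choices for this sketch.
    """
    rng = np.random.default_rng(seed)
    ones = np.ones((dim, dim))
    # Skew-symmetric perturbation: +gamma above the diagonal, -gamma below.
    J = gamma * (np.triu(ones, 1) - np.tril(ones, -1))
    drift_matrix = np.eye(dim) + J

    def log_target(x):
        return -0.5 * (x @ x)

    def proposal_mean(x):
        # One Euler step of the irreversible drift.
        return x - step * (drift_matrix @ x)

    def log_q(x_from, x_to):
        # Log-density (up to a constant) of the Gaussian proposal kernel.
        diff = x_to - proposal_mean(x_from)
        return -(diff @ diff) / (4.0 * step)

    x = np.zeros(dim)
    samples = np.empty((n_steps, dim))
    accepted = 0
    for n in range(n_steps):
        y = proposal_mean(x) + np.sqrt(2.0 * step) * rng.standard_normal(dim)
        # Metropolis-Hastings log acceptance ratio.
        log_alpha = (log_target(y) + log_q(y, x)
                     - log_target(x) - log_q(x, y))
        if np.log(rng.uniform()) < log_alpha:
            x = y
            accepted += 1
        samples[n] = x
    return samples, accepted / n_steps


if __name__ == "__main__":
    samples, rate = ip_mala_sketch()
    print("acceptance rate:", rate)
    print("per-coordinate variance:", samples.var(axis=0))
```

Setting `gamma=0.0` recovers a standard MALA chain for the same target, which gives a simple baseline for comparing acceptance rates in this toy setting.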

Submitted 1 July, 2019; v1 submitted 6 February, 2017; originally announced February 2017.

Showing 1–3 of 3 results for author: Ottobre, M