A Multiscale Perspective on Maximum Marginal Likelihood Estimation

Authors: O. Deniz Akyildiz, Michela Ottobre, Iain Souttar

Abstract: In this paper, we provide a multiscale perspective on the problem of maximum marginal likelihood estimation. We consider and analyse a diffusion-based maximum marginal likelihood estimation scheme using ideas from multiscale dynamics. Our perspective is based on stochastic averaging; we make an explicit connection between ideas in applied probability and parameter inference in computational statistics. In particular, we consider a general class of coupled Langevin diffusions for joint inference of latent variables and parameters in statistical models, where the latent variables are sampled from a fast Langevin process (which acts as a sampler), and the parameters are updated using a slow Langevin process (which acts as an optimiser). We show that the resulting system of stochastic differential equations (SDEs) can be viewed as a two-time scale system. To demonstrate the utility of such a perspective, we show that the averaged parameter dynamics obtained in the limit of scale separation can be used to estimate the optimal parameter, within the strongly convex setting. We do this by using recent uniform-in-time non-asymptotic averaging bounds. Finally, we conclude by showing that the slow-fast algorithm we consider here, termed the Slow-Fast Langevin Algorithm, performs on par with state-of-the-art methods on a variety of examples. We believe that the stochastic averaging approach we provide in this paper enables us to look at these algorithms from a fresh angle, as well as unlocking the path to develop and analyse new methods using well-established averaging principles.
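
As a rough illustration of the slow-fast structure described in this abstract, the sketch below applies an Euler-Maruyama discretisation to a toy Gaussian model. The model, the function name `slow_fast_langevin`, the step size `h`, the scale-separation parameter `eps` and the optional parameter noise `sigma` are all assumptions chosen for illustration; this is not the authors' implementation and may differ from their exact scheme.

```python
import math
import random


def slow_fast_langevin(y, eps=0.05, h=0.005, n_steps=40000, sigma=0.0, seed=0):
    """Toy slow-fast Langevin sketch (illustrative assumptions throughout).

    Assumed model: latent x ~ N(theta, 1) and observation y | x ~ N(x, 1),
    so the marginal likelihood of y is N(theta, 2), maximised at theta = y.
    Returns the average of theta over the second half of the run.
    """
    rng = random.Random(seed)
    x, theta = 0.0, 0.0
    tail_sum, tail_count = 0.0, 0
    for n in range(n_steps):
        # Gradients of log p_theta(x, y) = -(x - theta)^2 / 2 - (y - x)^2 / 2.
        grad_x = (theta - x) + (y - x)
        grad_theta = x - theta
        # Fast component: Langevin sampler for the latent variable, sped up by 1/eps.
        x_new = x + (h / eps) * grad_x + math.sqrt(2.0 * h / eps) * rng.gauss(0.0, 1.0)
        # Slow component: gradient step on the parameter (noise is optional, off by default).
        theta = theta + h * grad_theta + sigma * math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)
        x = x_new
        if n >= n_steps // 2:
            tail_sum += theta
            tail_count += 1
    return tail_sum / tail_count


if __name__ == "__main__":
    print(slow_fast_langevin(y=1.5))
```

For this toy model the averaged parameter drift is (y - theta)/2, so the time-averaged parameter should settle near y when `eps` is small.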

Submitted 10 June, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

Comments: 31 pages, 3 figures

MSC Class: 62-08 (Primary); 60J60; 62C12; 62F15; 65C30; 65D15 (Secondary)

arXiv:2404.07488 [pdf, ps, other]

Approximation of non-linear SPDEs with additive noise via weighted interacting particles systems: the stochastic McKean-Vlasov equation

Authors: Letizia Angeli, Dan Crisan, Martin Kolodziejczyk, Michela Ottobre

Abstract: This paper is devoted to the problem of approximating non-linear Stochastic Partial Differential Equations (SPDEs) via interacting particle systems. In particular, we consider the Stochastic McKean-Vlasov (SMKV) equation, which is the McKean-Vlasov (MKV) PDE, perturbed by additive trace class noise. As is well-known, the MKV PDE can be obtained as the mean field limit of the empirical measure of a stochastic system of interacting particles, where particles are subject to independent sources of noise. There is now a natural question, which is the one we consider and answer in this paper: can we obtain the SMKV equation, i.e. additive perturbations of the MKV PDE, as the limit of interacting particle systems? It turns out that, in order to obtain the SMKV equation, one needs to study weighted empirical measures of particles, where the particles evolve according to a system of SDEs with independent noise, while the weights are time evolving and subject to common noise. The work of this manuscript therefore complements and contributes to various streams of literature, in particular: i) much attention in the community is currently devoted to obtaining SPDEs as scaling limits of appropriate dynamics; this paper contributes to a complementary stream, which is devoted to obtaining representations of SPDEs through limits of empirical measures of interacting particle systems; ii) since the literature on limits of weighted empirical measures is often constrained to the case of static (random or deterministic) weights, this paper contributes to further expanding this line of research to the case of time-evolving weights.

Submitted 11 April, 2024; originally announced April 2024.

Comments: 62 pages

MSC Class: 35Q83; 35Q70; 60H15; 35R60; 65C35; 82M60

arXiv:2305.04632 [pdf, ps, other]

On the study of slow-fast dynamics, when the fast process has multiple invariant measures

Authors: B. D. Goddard, M. Ottobre, K. J. Painter, I. Souttar

Abstract: Motivated by applications to mathematical biology, we study the averaging problem for slow-fast systems in the case in which the fast dynamics is a stochastic process with multiple invariant measures. We consider both the case in which the fast process is decoupled from the slow process and the case in which the two components are fully coupled. We work in the setting in which the slow process evolves according to an Ordinary Differential Equation (ODE) and the fast process is a continuous time Markov Process with finite state space and show that, in this setting, the limiting (averaged) dynamics can be described as a random ODE (that is, an ODE with random coefficients). Keywords: Multiscale methods, Processes with multiple equilibria, Averaging, Collective Navigation, Interacting Piecewise Deterministic Markov Processes.

Submitted 15 August, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: 24 pages

MSC Class: 34C29; 34D05; 34E10; 34F05; 37H10; 60K35

arXiv:2303.15463 [pdf, other]

Uniform in time convergence of numerical schemes for stochastic differential equations via Strong Exponential stability: Euler methods, Split-Step and Tamed Schemes

Authors: Letizia Angeli, Dan Crisan, Michela Ottobre

Abstract: We prove a general criterion providing sufficient conditions under which a time-discretisation of a given Stochastic Differential Equation (SDE) is a uniform in time approximation of the SDE. The criterion is also necessary, to an extent discussed in the paper. Using such a criterion we then analyse the convergence properties of numerical methods for solutions of SDEs; we consider Explicit and Implicit Euler, split-step and (truncated) tamed Euler methods. In particular, we show that, under mild conditions on the coefficients of the SDE (locally Lipschitz and strictly monotonic), these methods produce approximations of the law of the solution of the SDE that converge uniformly in time. The theoretical results are verified by numerical examples.
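
To give a feel for why taming matters when coefficients are only locally Lipschitz, the following sketch compares the explicit Euler scheme with a standard tamed Euler variant on the SDE dX = -X^3 dt + sigma dW. The test equation, the function name `max_abs_along_path`, the blow-up threshold and all parameter values are assumptions made for illustration; the (truncated) tamed scheme analysed in the paper may differ in detail.

```python
import math
import random


def max_abs_along_path(x0, h, n_steps, tamed, sigma=1.0, seed=0):
    """Largest |X_n| along one simulated path, or inf if the path blows up.

    Assumed test problem: dX = -X^3 dt + sigma dW (locally Lipschitz, monotone drift).
    """
    rng = random.Random(seed)
    x, running_max = x0, abs(x0)
    for _ in range(n_steps):
        b = -x * x * x
        if tamed:
            # Taming: rescale the drift so that a single step moves by at most 1.
            b = b / (1.0 + h * abs(b))
        x = x + h * b + sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
        if abs(x) > 1e6:
            # Treat this as numerical blow-up and stop early.
            return math.inf
        running_max = max(running_max, abs(x))
    return running_max


if __name__ == "__main__":
    print("explicit:", max_abs_along_path(10.0, 0.1, 1000, tamed=False))
    print("tamed:   ", max_abs_along_path(10.0, 0.1, 1000, tamed=True))
```

With a large initial condition and a moderate step size, the explicit scheme overshoots and diverges within a few steps, while the tamed scheme stays bounded and relaxes towards the origin.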

Submitted 23 March, 2023; originally announced March 2023.

Comments: 50 pages, 2 figures

MSC Class: 65C20; 65C30; 60H10; 65G99; 47D07; 60J60

arXiv:2211.08004 [pdf, ps, other]

Well-posedness and stationary solutions of McKean-Vlasov (S)PDEs

Authors: Letizia Angeli, Julien Barré, Martin Kolodziejczyk, Michela Ottobre

Abstract: This paper is composed of two parts. In the first part we consider McKean-Vlasov Partial Differential Equations (PDEs), obtained as thermodynamic limits of interacting particle systems (i.e. in the limit $N\to\infty$, where $N$ is the number of particles). It is well-known that, even when the particle system has a unique invariant measure (stationary solution), the limiting PDE very often displays a phase transition: for certain choices of (coefficients and) parameter values, the PDE has a unique stationary solution, but as the value of the parameter varies multiple stationary states appear. In the first part of this paper, we add to this stream of literature and consider a specific instance of a McKean-Vlasov type equation, namely the Kuramoto model on the torus perturbed by a symmetric double-well potential, and show that this PDE undergoes the type of phase transition just described, as the diffusion coefficient is varied. In the second part of the paper, we consider a rather general class of McKean-Vlasov PDEs on the torus (which includes both the original Kuramoto model and the Kuramoto model in double well potential of part one) perturbed by (strong enough) infinite-dimensional additive noise. To the best of our knowledge, the resulting Stochastic PDE, which we refer to as the Stochastic McKean-Vlasov equation, has not been studied before, so we first study its well-posedness. We then show that the addition of noise to the PDE has the effect of restoring uniqueness of the stationary state in the sense that, irrespective of the choice of coefficients and parameter values in the McKean-Vlasov PDE, the Stochastic McKean-Vlasov PDE always admits at most one invariant measure.

Submitted 15 November, 2022; originally announced November 2022.

Comments: 50 pages

MSC Class: 35Q83; 35Q84; 35Q70; 60H15; 35R60; 37A30

arXiv:2206.06776 [pdf, other]

Non-mean-field Vicsek-type models for collective behaviour

Authors: P. Buttà, B. Goddard, T. M. Hodgson, M. Ottobre, K. J. Painter

Abstract: We consider interacting particle dynamics with Vicsek type interactions, and their macroscopic PDE limit, in the non-mean-field regime; that is, we consider the case in which each particle/agent in the system interacts only with a prescribed subset of the particles in the system (for example, those within a certain distance). In this non-mean-field regime the influence between agents (i.e. the interaction term) can be normalised either by the total number of agents in the system (global scaling) or by the number of agents with which the particle is effectively interacting (local scaling). We compare the behaviour of the globally scaled and the locally scaled systems in many respects, considering for each scaling both the PDE and the corresponding particle model. In particular we observe that both the locally and globally scaled particle systems exhibit pattern formation (i.e. formation of travelling-wave-like solutions) within certain parameter regimes, and generally display similar dynamics. The same is not true of the corresponding PDE models. Indeed, while both PDE models have multiple stationary states, for the globally scaled PDE such (space-homogeneous) equilibria are unstable for certain parameter regimes, with the instability leading to travelling wave solutions, while they are always stable for the locally scaled one, which never produces travelling waves. This observation is based on a careful numerical study of the model, supported by further analysis.

Submitted 17 May, 2022; originally announced June 2022.

Comments: 44 pages

arXiv:2204.02679 [pdf, other]

Poisson Equations with locally-Lipschitz coefficients and Uniform in Time Averaging for Stochastic Differential Equations via Strong Exponential Stability

Authors: Dan Crisan, Paul Dobson, Ben Goddard, Michela Ottobre, Iain Souttar

Abstract: We study averaging for Stochastic Differential Equations (SDEs) and Poisson equations. We succeed in obtaining a uniform in time (UiT) averaging result, with a rate, for fully coupled SDE models with super-linearly growing coefficients. This is the main result of this paper and is, to the best of our knowledge, the first UiT multiscale result with a rate. Very few UiT averaging results exist in the literature, and they almost exclusively apply to multiscale systems of Ordinary Differential Equations. Among these few, none of those we are aware of comes with a rate of convergence. The UiT nature of this result and the rate of convergence given by the main theorem make it important as theoretical underpinning for a range of applications, such as applications to statistical methodology, molecular dynamics etc. Key both to obtaining our UiT averaging result and to dealing with the super-linear growth of the coefficients is establishing exponential decay in time of the space-derivatives of appropriate Markov semigroups. We refer to this property as being Strongly Exponentially Stable (SES). The analytic approach to proving averaging results we take requires studying a family of Poisson problems associated with the generator of the (fast component of the) SDE dynamics. The study of Poisson equations in non-compact state space is notoriously difficult, with current literature mostly covering the case when the coefficients of the Partial Differential Equation (PDE) are either bounded or satisfy linear growth assumptions. In this paper we treat Poisson equations on non-compact state spaces for coefficients that can grow super-linearly. We demonstrate how SES can be employed not only to prove the UiT result for the slow-fast system but also to overcome some of the technical hurdles in the analysis of Poisson problems, which is of independent interest as well.

Submitted 5 April, 2024; v1 submitted 6 April, 2022; originally announced April 2022.

Comments: 70 pages, 3 figures

MSC Class: 60J60; 60H10; 35B30; 34K33; 34D20; 47D07; 65M75

arXiv:2111.00286 [pdf, ps, other]

Non-reversible processes: GENERIC, Hypocoercivity and fluctuations

Authors: Manh Hong Duong, Michela Ottobre

Abstract: We consider two approaches to study non-reversible Markov processes, namely the Hypocoercivity Theory (HT) and GENERIC (General Equations for Non-Equilibrium Reversible-Irreversible Coupling); the basic idea behind both of them is to split the process into a reversible component and a non-reversible one, and then quantify the way in which they interact. We compare such theories and provide explicit formulas to pass from one formulation to the other; as a by-product we give a simple proof of the link between reversibility of the dynamics and gradient flow structure of the associated Fokker-Planck equation. We do this both for linear Markov processes and for a class of nonlinear Markov processes. We then characterize the structure of the Large deviation functional of generalised-reversible processes; this is a class of non-reversible processes of large relevance in applications. Finally, we show how our results apply to two classes of Markov processes, namely non-reversible diffusion processes and a class of Piecewise Deterministic Markov Processes (PDMPs), which have recently attracted the attention of the statistical sampling community. In particular, for the PDMPs we consider we prove entropy decay.

Submitted 24 January, 2023; v1 submitted 30 October, 2021; originally announced November 2021.

Comments: 49 pages, revised version

arXiv:2003.14230 [pdf, ps, other]

Fast non mean-field networks: uniform in time averaging

Authors: Julien Barré, Paul Dobson, Michela Ottobre, Ewelina Zatorska

Abstract: We study a population of $N$ particles, which evolve according to a diffusion process and interact through a dynamical network. In turn, the evolution of the network is coupled to the particles' positions. In contrast with the mean-field regime, in which each particle interacts with every other particle, i.e. with $O(N)$ particles, we consider the a priori more difficult case of a sparse network; that is, each particle interacts, on average, with $O(1)$ particles. We also assume that the network's dynamics is much faster than the particles' dynamics, with the time-scale of the network described by a parameter $ε>0$. We combine the averaging ($ε\rightarrow 0$) and the many particles ($N \rightarrow \infty$) limits and prove that the evolution of the particles' empirical density is described (after taking both limits) by a non-linear Fokker-Planck equation; we moreover give conditions under which such limits can be taken uniformly in time, hence providing a criterion under which the limiting non-linear Fokker-Planck equation is a good approximation of the original system uniformly in time. The heart of our proof consists of controlling precisely the dependence in $N$ of the averaging estimates.

Submitted 12 October, 2020; v1 submitted 31 March, 2020; originally announced March 2020.

Comments: 33 pages

MSC Class: 60K35; 47D07; 60J60; 35Q84; 35Q82; 82C31

arXiv:2002.11663 [pdf, ps, other]

Well-Posedness and Equilibrium Behaviour of Overdamped Dynamic Density Functional Theory

Authors: B. D. Goddard, R. D. Mills-Williams, M. Ottobre, G. Pavliotis

Abstract: We establish the global well-posedness of overdamped dynamic density functional theory (DDFT): a nonlinear, nonlocal integro-partial differential equation used in statistical mechanical models of colloidal fluids, and other applications including nonlinear reaction-diffusion systems and opinion dynamics. With nonlinear no-flux boundary conditions, we determine the existence and uniqueness of the weak density and flux, subject to two-body hydrodynamic interactions (HI). We also show that the density is Lyapunov stable with respect to the usual (Helmholtz) free energy functional. Principally, this is done by rewriting the dynamics for the density in an implicit gradient flow form, resembling the classical Smoluchowski equation but with spatially inhomogeneous diffusion and advection tensors. We also rigorously show that the stationary density is independent of the HI tensors, and prove exponentially fast convergence to equilibrium.

Submitted 14 September, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

Comments: 33 pages

arXiv:1905.03524 [pdf, other]

Uniform in time estimates for the weak error of the Euler method for SDEs and a Pathwise Approach to Derivative Estimates for Diffusion Semigroups

Authors: D. Crisan, P. Dobson, M. Ottobre

Abstract: We present a criterion for uniform in time convergence of the weak error of the Euler scheme for Stochastic Differential Equations (SDEs). The criterion requires i) exponential decay in time of the space-derivatives of the semigroup associated with the SDE and ii) bounds on (some) moments of the Euler approximation. We show by means of examples (and counterexamples) how both i) and ii) are needed to obtain the desired result. If the weak error converges to zero uniformly in time, then convergence of ergodic averages follows as well. We also show that Lyapunov-type conditions are neither sufficient nor necessary in order for the weak error of the Euler approximation to converge uniformly in time and clarify relations between the validity of Lyapunov conditions, i) and ii). Conditions for ii) to hold are studied in the literature. Here we produce sufficient conditions for i) to hold. The study of derivative estimates has attracted a lot of attention, however not many results are known in order to guarantee exponentially fast decay of the derivatives. Exponential decay of derivatives typically follows from coercive-type conditions involving the vector fields appearing in the equation and their commutators; here we focus on the case in which such coercive-type conditions are non-uniform in space. To the best of our knowledge, this situation is unexplored in the literature, at least on a systematic level. To obtain results under such space-inhomogeneous conditions we initiate a pathwise approach to the study of derivative estimates for diffusion semigroups and combine this pathwise method with the use of Large Deviation Principles.
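
A simple worked case may help make "uniform in time weak error" concrete. The sketch below takes the Ornstein-Uhlenbeck process dX = -X dt + sqrt(2) dW and the test function x^2, for which both the Euler second moment and the exact second moment are available in closed form, and computes the largest gap over the time grid. The test problem, the test function and the name `sup_weak_error` are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def sup_weak_error(h, horizon, v0=0.0):
    """Sup over the time grid of |E[X_n^2] - E[X(t_n)^2]| for Euler on an OU process.

    Assumed test problem: dX = -X dt + sqrt(2) dW with deterministic X(0), X(0)^2 = v0.
    Euler gives X_{n+1} = (1 - h) X_n + sqrt(2 h) xi_n, so the second moment obeys
    v_{n+1} = (1 - h)^2 v_n + 2 h, while the exact value is 1 + (v0 - 1) exp(-2 t).
    """
    n_steps = int(round(horizon / h))
    v, worst = v0, 0.0
    for n in range(1, n_steps + 1):
        v = (1.0 - h) ** 2 * v + 2.0 * h          # Euler second moment at step n
        exact = 1.0 + (v0 - 1.0) * math.exp(-2.0 * n * h)  # exact second moment at t = n h
        worst = max(worst, abs(v - exact))
    return worst


if __name__ == "__main__":
    for h in (0.1, 0.05, 0.025):
        print(h, sup_weak_error(h, 50.0), sup_weak_error(h, 500.0))
```

In this example the error does not grow when the time horizon is extended, and it shrinks as the step size `h` is reduced, which is the qualitative behaviour a uniform in time bound describes.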

Submitted 27 July, 2020; v1 submitted 9 May, 2019; originally announced May 2019.

Comments: 47 pages and 9 figures

MSC Class: 65C20; 65C30; 60H10; 65G99; 47D07; 60J60

arXiv:1903.06960 [pdf, other]

Reversible and non-reversible Markov Chain Monte Carlo algorithms for reservoir simulation problems

Authors: P. Dobson, I. Fursov, G. Lord, M. Ottobre

Abstract: We compare numerically the performance of reversible and non-reversible Markov Chain Monte Carlo algorithms for high dimensional oil reservoir problems; because of the nature of the problem at hand, the target measures from which we sample are supported on bounded domains. We compare two strategies to deal with bounded domains, namely reflecting proposals off the boundary and rejecting them when they fall outside of the domain. We observe that for complex high dimensional problems reflection mechanisms outperform rejection approaches and that the advantage of introducing non-reversibility in the Markov Chain employed for sampling is more and more visible as the dimension of the parameter space increases.

Submitted 16 March, 2019; originally announced March 2019.

arXiv:1805.01350 [pdf, other]

Long-time behaviour of degenerate diffusions: UFG-type SDEs and time-inhomogeneous hypoelliptic processes

Authors: T. Cass, D. Crisan, P. Dobson, M. Ottobre

Abstract: We study the long time behaviour of a large class of diffusion processes on $R^N$, generated by second order differential operators of (possibly) degenerate type. The operators that we consider need not satisfy the Hörmander condition. Instead, they satisfy the so-called UFG condition, introduced by Herman, Lobry and Sussman in the context of geometric control theory and later by Kusuoka and Stroock, this time with probabilistic motivations. In this paper we study UFG diffusions and demonstrate the importance of such a class of processes in several respects: roughly speaking i) we show that UFG processes constitute a family of SDEs which exhibit multiple invariant measures and for which one is able to describe a systematic procedure to determine the basin of attraction of each invariant measure (equilibrium state). ii) We use an explicit change of coordinates to prove that every UFG diffusion can be, at least locally, represented as a system consisting of an SDE coupled with an ODE, where the ODE evolves independently of the SDE part of the dynamics. iii) As a result, UFG diffusions are inherently "less smooth" than hypoelliptic SDEs; more precisely, we prove that UFG processes do not admit a density with respect to Lebesgue measure on the entire space, but only on suitable time-evolving submanifolds, which we describe. iv) We show that our results and techniques, which we devised for UFG processes, can be applied to the study of the long-time behaviour of non-autonomous hypoelliptic SDEs and therefore produce several results on this latter class of processes as well. v) Because processes that satisfy the (uniform) parabolic Hörmander condition are UFG processes, our paper contains a wealth of results about the long time behaviour of (uniformly) hypoelliptic processes which are non-ergodic, in the sense that they exhibit multiple invariant measures.

Submitted 2 June, 2019; v1 submitted 3 May, 2018; originally announced May 2018.

Comments: 66 pages

arXiv:1804.01247 [pdf, ps, other]

doi 10.3934/krm.2019031

A non-linear kinetic model of self-propelled particles with multiple equilibria

Authors: Paolo Buttà, Franco Flandoli, Michela Ottobre, Boguslaw Zegarlinski

Abstract: We introduce and analyse a continuum model for an interacting particle system of Vicsek type. The model is given by a non-linear kinetic partial differential equation (PDE) describing the time-evolution of the density $f_t$, in the single particle phase-space, of a collection of interacting particles confined to move on the one-dimensional torus. The corresponding stochastic differential equation for the position and velocity of the particles is a conditional McKean-Vlasov type of evolution (conditional in the sense that the process depends on its own law through its own conditional expectation). In this paper, we study existence and uniqueness of the solution of the PDE in consideration. Challenges arise from the fact that the PDE is neither elliptic (the linear part is only hypoelliptic) nor in gradient form. Moreover, for some specific choices of the interaction function and for the simplified case in which the density profile does not depend on the spatial variable, we show that the model exhibits multiple stationary states (corresponding to the particles forming a coordinated clockwise/anticlockwise rotational motion) and we study convergence to such states as well. Finally, we prove mean-field convergence of an appropriate $N$-particles system to the solution of our PDE: more precisely, we show that the empirical measures of such a particle system converge weakly, as $N \rightarrow \infty$, to the solution of the PDE.

Submitted 16 January, 2019; v1 submitted 4 April, 2018; originally announced April 2018.

Comments: 37 pages

Report number: Roma01.Math.MP

MSC Class: 35A01; 35B40; 82C22; 82C40; 92D25

Journal ref: Kinetic and Relat. Models Vol. 12 (2019), pp. 791-827

arXiv:1702.01777 [pdf, other]

Optimal Scaling of the MALA algorithm with Irreversible Proposals for Gaussian targets

Authors: Michela Ottobre, Natesh S. Pillai, Konstantinos Spiliopoulos

Abstract: It is well known in many settings that reversible Langevin diffusions in confining potentials converge to equilibrium exponentially fast. Adding irreversible perturbations to the drift of a Langevin diffusion that maintain the same invariant measure accelerates its convergence to stationarity. Many existing works thus advocate the use of such non-reversible dynamics for sampling. When implementing Markov Chain Monte Carlo (MCMC) algorithms using time discretisations of such Stochastic Differential Equations (SDEs), one can append the discretisation with the usual Metropolis-Hastings accept-reject step and this is often done in practice because the accept-reject step eliminates bias. On the other hand, such a step makes the resulting chain reversible. It is not known whether adding the accept-reject step preserves the faster mixing properties of the non-reversible dynamics. In this paper, we address this gap between theory and practice by analyzing the optimal scaling of MCMC algorithms constructed from proposal moves that are time-step Euler discretisations of an irreversible SDE, for high dimensional Gaussian target measures. We call the resulting algorithm the ipMALA, in comparison to the classical MALA algorithm (here ip is for irreversible proposal). In order to quantify how the cost of the algorithm scales with the dimension $N$, we prove invariance principles for the appropriately rescaled chain. In contrast to the usual MALA algorithm, we show that there could be two regimes asymptotically: (i) a diffusive regime, as in the MALA algorithm and (ii) a "fluid" regime where the limit is an ordinary differential equation. We provide concrete examples where the limit is a diffusion, as in the standard MALA, but with provably higher limiting acceptance probabilities. Numerical results are also given corroborating the theory.

Submitted 1 July, 2019; v1 submitted 6 February, 2017; originally announced February 2017.

arXiv:1608.08379 [pdf, ps, other]

Non-stationary phase of the MALA algorithm

Authors: J. Kuntz, M. Ottobre, A. M. Stuart

Abstract: The Metropolis-Adjusted Langevin Algorithm (MALA) is a Markov Chain Monte Carlo method which creates a Markov chain reversible with respect to a given target distribution, pi^N, with Lebesgue density on R^N; it can hence be used to approximately sample the target distribution. When the dimension N is large a key question is to determine the computational cost of the algorithm as a function of N. One approach to this question, which we adopt here, is to derive diffusion limits for the algorithm. The family of target measures that we consider in this paper are, in general, in non-product form and are of interest in applied problems as they arise in Bayesian nonparametric statistics and in the study of conditioned diffusions. Furthermore, we study the situation, which arises in practice, where the algorithm is started out of stationarity. We thereby significantly extend previous works which consider either only measures of product form, when the Markov chain is started out of stationarity, or measures defined via a density with respect to a Gaussian, when the Markov chain is started in stationarity. We prove that, in the non-stationary regime, the computational cost of the algorithm is of the order N^(1/2) with dimension, as opposed to what is known to happen in the stationary regime, where the cost is of the order N^(1/3).

Submitted 23 August, 2017; v1 submitted 30 August, 2016; originally announced August 2016.

Comments: 37 pages. arXiv admin note: text overlap with arXiv:1405.4896
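
As a reading aid for the entry above, here is a minimal sketch of a generic MALA step, assuming a toy standard Gaussian target on R^N rather than the non-product measures the paper studies. The function names, the `scaling_exponent` parameter and the constant `ell` are hypothetical and not taken from the paper; the step size is set to `ell**2 * N**(-scaling_exponent)`, so exponents 1/3 and 1/2 mirror the stationary and non-stationary cost orders quoted in the abstract.

```python
import math
import random


def log_target(x):
    """Log-density, up to a constant, of a standard Gaussian on R^N (assumed toy target)."""
    return -0.5 * sum(v * v for v in x)


def grad_log_target(x):
    """Gradient of log_target."""
    return [-v for v in x]


def log_proposal_density(x_to, x_from, step):
    """Log-density, up to a constant, of the Langevin proposal x_from -> x_to."""
    grad = grad_log_target(x_from)
    sq = sum((t - f - 0.5 * step * g) ** 2 for t, f, g in zip(x_to, x_from, grad))
    return -sq / (2.0 * step)


def mala_step(x, step, rng):
    """One MALA step: Euler proposal for the Langevin SDE, then accept-reject."""
    grad = grad_log_target(x)
    noise_scale = math.sqrt(step)
    proposal = [
        v + 0.5 * step * g + noise_scale * rng.gauss(0.0, 1.0)
        for v, g in zip(x, grad)
    ]
    # Metropolis-Hastings log acceptance ratio for the asymmetric proposal.
    log_alpha = (
        log_target(proposal) + log_proposal_density(x, proposal, step)
        - log_target(x) - log_proposal_density(proposal, x, step)
    )
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return proposal, True
    return x, False


def run_mala(n_dim, n_steps, scaling_exponent, ell=1.0, seed=0):
    """Run MALA from the origin (out of stationarity for this target)."""
    rng = random.Random(seed)
    step = ell ** 2 * n_dim ** (-scaling_exponent)  # hypothetical scaling rule
    x = [0.0] * n_dim
    accepted = 0
    for _ in range(n_steps):
        x, ok = mala_step(x, step, rng)
        accepted += ok
    return x, accepted / n_steps
```

Calling `run_mala(n_dim=100, n_steps=1000, scaling_exponent=0.5)` and then repeating with `scaling_exponent=1/3` gives a rough way to compare acceptance rates under the two step-size scalings; it illustrates the mechanics only and does not reproduce the paper's results.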

arXiv:1606.01153 [pdf, other]

doi 10.1137/16M107801X

Bounding stationary averages of polynomial diffusions via semidefinite programming

Authors: Juan Kuntz, Michela Ottobre, Guy-Bart Stan, Mauricio Barahona

Abstract: We introduce an algorithm based on semidefinite programming that yields increasing (resp. decreasing) sequences of lower (resp. upper) bounds on polynomial stationary averages of diffusions with polynomial drift vector and diffusion coefficients. The bounds are obtained by optimising an objective, determined by the stationary average of interest, over the set of real vectors defined by certain linear equalities and semidefinite inequalities which are satisfied by the moments of any stationary measure of the diffusion. We exemplify the use of the approach through several applications: a Bayesian inference problem; the computation of Lyapunov exponents of linear ordinary differential equations perturbed by multiplicative white noise; and a reliability problem from structural mechanics. Additionally, we prove that the bounds converge to the infimum and supremum of the set of stationary averages for certain SDEs associated with the computation of the Lyapunov exponents, and we provide numerical evidence of convergence in more general settings.

Submitted 3 June, 2016; originally announced June 2016.

MSC Class: 60H10; 60H35; 90C22; 37M25

Journal ref: SIAM J. Sci. Comput., 38(6), 3891-3920 (2016)

arXiv:1606.01027 [pdf, ps, other]

doi 10.1098/rspa.2016.0442

Pointwise Gradient Bounds for Degenerate Semigroups (of UFG type)

Authors: Dan Crisan, Michela Ottobre

Abstract: In this paper we consider diffusion semigroups generated by second order differential operators of degenerate type. The operators that we consider do not, in general, satisfy the Hörmander condition and are not hypoelliptic. In particular, instead of working under the Hörmander paradigm, we consider the so-called UFG condition, introduced by Kusuoka and Stroock in the eighties. The UFG condition is weaker than the uniform Hörmander condition, the smoothing effect taking place only in certain directions (rather than in every direction, as is the case when the Hörmander condition is assumed). Under the UFG condition, Kusuoka and Stroock deduced sharp small time asymptotic bounds for the derivatives of the semigroup in the directions where smoothing occurs. In this paper, we study the large time asymptotics for the gradients of the diffusion semigroup in the same set of directions and under the same UFG condition. In particular, we identify conditions under which the derivatives of the diffusion semigroup in the smoothing directions decay exponentially in time. This paper therefore constitutes a stepping stone in the analysis of the long time behaviour of diffusions which do not satisfy the Hörmander condition.

Submitted 19 October, 2016; v1 submitted 3 June, 2016; originally announced June 2016.

Comments: 23 pages

arXiv:1411.6223 [pdf, other]

Some remarks on degenerate hypoelliptic Ornstein-Uhlenbeck operators

Authors: Michela Ottobre, Grigorios Pavliotis, Karel Pravda-Starov

Abstract: We study degenerate hypoelliptic Ornstein-Uhlenbeck operators in $L^2$ spaces with respect to invariant measures. The purpose of this article is to show how recent results on general quadratic operators apply to the study of degenerate hypoelliptic Ornstein-Uhlenbeck operators. We first show that some known results about the spectral and subelliptic properties of Ornstein-Uhlenbeck operators may be directly recovered from the general analysis of quadratic operators with zero singular spaces. We also provide new resolvent estimates for hypoelliptic Ornstein-Uhlenbeck operators. We show in particular that the spectrum of these non-selfadjoint operators may be very unstable under small perturbations and that their resolvents can blow up in norm far away from their spectra. Furthermore, we establish sharp resolvent estimates in specific regions of the resolvent set which enable us to prove exponential return to equilibrium.

Submitted 23 November, 2014; originally announced November 2014.

Comments: 37 pages, 3 figures

MSC Class: 35H10; 35P05

arXiv:1405.4896 [pdf, other]

Diffusion Limit For The Random Walk Metropolis Algorithm Out Of Stationarity

Authors: J. Kuntz, M. Ottobre, A. M. Stuart

Abstract: The Random Walk Metropolis (RWM) algorithm is a Metropolis-Hastings MCMC algorithm designed to sample from a given target distribution π with Lebesgue density on R^N. RWM constructs a Markov chain by randomly proposing a new position (the "proposal move"), which is then accepted or rejected according to a rule which makes the chain reversible with respect to π. When the dimension N is large a key question is to determine the optimal scaling with N of the proposal variance: if the proposal variance is too large, the algorithm will reject the proposed moves too often; if it is too small, the algorithm will explore the state space too slowly. Determining the optimal scaling of the proposal variance gives a measure of the cost of the algorithm as well. One approach to tackle this issue, which we adopt here, is to derive diffusion limits for the algorithm. Such an approach has been proposed in the seminal papers [RGG97, RR98]; in particular in [RGG97] the authors derive a diffusion limit for the RWM algorithm under the two following assumptions: i) the algorithm is started in stationarity; ii) the target measure π is in product form. The present paper considers the situation of practical interest in which both assumptions i) and ii) are removed. That is, a) we study the case (which occurs in practice) in which the algorithm is started out of stationarity and b) we consider target measures which are in non-product form. The target measures that we consider arise in Bayesian nonparametric statistics and in the study of conditioned diffusions. We prove that, out of stationarity, the optimal scaling for the proposal variance is O(N^{-1}), as it is in stationarity. Notice that the optimal scaling in and out of stationarity need not be the same in general, and indeed they differ e.g. in the case of the MALA algorithm [KOS16].

Submitted 30 August, 2016; v1 submitted 19 May, 2014; originally announced May 2014.

Comments: 53 pages, 4 figures

MSC Class: 60J22
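
The trade-off described in the entry above (proposal variance too large gives too many rejections, too small gives slow exploration) can be seen in a minimal sketch of RWM, assuming a toy standard Gaussian product target on R^N started in stationarity, which is the classical setting of [RGG97] and not the non-product, non-stationary setting of the paper. The names `rwm_step` and `acceptance_rate` and the constant `ell` are hypothetical; the proposal variance is taken as `ell**2 / N`.

```python
import math
import random


def log_target(x):
    """Log-density, up to a constant, of a standard Gaussian on R^N (assumed toy target)."""
    return -0.5 * sum(v * v for v in x)


def rwm_step(x, proposal_std, rng):
    """One Random Walk Metropolis step with an isotropic Gaussian proposal."""
    proposal = [v + proposal_std * rng.gauss(0.0, 1.0) for v in x]
    # The proposal is symmetric, so only the target ratio enters the rule.
    log_alpha = log_target(proposal) - log_target(x)
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return proposal, True
    return x, False


def acceptance_rate(n_dim, n_steps, ell=2.38, seed=0):
    """Empirical acceptance rate with proposal variance ell**2 / n_dim."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n_dim)]  # start in stationarity
    proposal_std = ell / math.sqrt(n_dim)
    accepted = 0
    for _ in range(n_steps):
        x, ok = rwm_step(x, proposal_std, rng)
        accepted += ok
    return accepted / n_steps
```

Comparing `acceptance_rate(50, 2000, ell=0.5)` with `acceptance_rate(50, 2000, ell=10.0)` shows the two failure modes: small `ell` accepts almost every move but each move is tiny, while large `ell` rejects most proposals.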

arXiv:1308.0543 [pdf, other]

A Function Space HMC Algorithm With Second Order Langevin Diffusion Limit

Authors: Michela Ottobre, Natesh S. Pillai, Frank J. Pinski, Andrew M. Stuart

Abstract: We describe a new MCMC method optimized for the sampling of probability measures on Hilbert space which have a density with respect to a Gaussian; such measures arise in the Bayesian approach to inverse problems, and in conditioned diffusions. Our algorithm is based on two key design principles: (i) algorithms which are well-defined in infinite dimensions result in methods which do not suffer from the curse of dimensionality when they are applied to approximations of the infinite dimensional target measure on $\mathbb{R}^N$; (ii) non-reversible algorithms can have better mixing properties compared to their reversible counterparts. The method we introduce is based on the hybrid Monte Carlo algorithm, tailored to incorporate these two design principles. The main result of this paper states that the new algorithm, appropriately rescaled, converges weakly to a second order Langevin diffusion on Hilbert space; as a consequence the algorithm explores the approximate target measures on $\mathbb{R}^N$ in a number of steps which is independent of $N$. We also present the underlying theory for the limiting non-reversible diffusion on Hilbert space, including characterization of the invariant measure, and we describe numerical simulations demonstrating that the proposed method has favourable mixing properties as an MCMC algorithm.

Submitted 3 April, 2014; v1 submitted 2 August, 2013; originally announced August 2013.

Comments: 41 pages, 2 figures. This is the final version, with more comments and an extra appendix added

arXiv:1306.6453 [pdf, ps, other]

Markov semigroups with hypocoercive-type generator in Infinite Dimensions II: Applications

Authors: V. Kontis, M. Ottobre, B. Zegarlinski

Abstract: In this paper we show several applications of the general theory developed in \cite{MV_I}, where we studied smoothing and ergodicity for infinite dimensional Markovian systems with hypocoercive type generator.

Submitted 27 June, 2013; originally announced June 2013.

arXiv:1306.6452 [pdf, ps, other]

Markov semigroups with hypocoercive-type generator in Infinite Dimensions: Ergodicity and Smoothing

Authors: V. Kontis, M. Ottobre, B. Zegarlinski

Abstract: We start by considering finite dimensional Markovian dynamics in R^m generated by operators of hypocoercive type and for such models we obtain short and long time pointwise estimates for all the derivatives, of any order and in any direction, along the semigroup. We then look at infinite dimensional models (in (R^m)^{Z^d}) produced by the interaction of infinitely many finite dimensional dissipative dynamics of the type indicated above. For these infinite dimensional models we study finite speed of propagation of information, well-posedness of the semigroup, time behaviour of the derivatives and the strong ergodicity problem.

Submitted 12 March, 2016; v1 submitted 27 June, 2013; originally announced June 2013.

arXiv:1106.2326 [pdf, other]

Exponential return to equilibrium for hypoelliptic quadratic systems

Authors: M. Ottobre, G. A. Pavliotis, K. Pravda-Starov

Abstract: We study the problem of convergence to equilibrium for evolution equations associated to general quadratic operators. Quadratic operators are non-selfadjoint differential operators with complex-valued quadratic symbols. Under appropriate assumptions, a complete description of the spectrum of such operators is given and the exponential return to equilibrium with sharp estimates on the rate of convergence is proven. Some applications to the study of chains of oscillators and the generalized Langevin equation are given.

Submitted 18 October, 2012; v1 submitted 12 June, 2011; originally announced June 2011.

Comments: 28 pages, 4 figures

MSC Class: Primary: 35H10; 35P99; Secondary: 35Q82

Journal ref: published in J. Func. Anal. (2012)

arXiv:1105.1042 [pdf, ps, other]

Long time asymptotics of a Brownian particle coupled with a random environment with non-diffusive feedback force

Authors: Michela Ottobre

Abstract: We study the long time behavior of a Brownian particle moving in an anomalously diffusing field, the evolution of which depends on the particle position. We prove that the process describing the asymptotic behaviour of the Brownian particle has bounded (in time) variance when the particle interacts with a subdiffusive field; when the interaction is with a superdiffusive field the variance of the limiting process grows in time as t^{2γ-1}, 1/2 < γ < 1. Two different kinds of superdiffusing (random) environments are considered: one is described through the use of the fractional Laplacian; the other via the Riemann-Liouville fractional integral. The subdiffusive field is modeled through the Riemann-Liouville fractional derivative.

Submitted 5 May, 2011; originally announced May 2011.

Comments: 45 pages

arXiv:1003.4203 [pdf, ps, other]

doi 10.1088/0951-7715/24/5/013

Asymptotic analysis for the generalized Langevin equation

Authors: M. Ottobre, G. A. Pavliotis

Abstract: Various qualitative properties of solutions to the generalized Langevin equation (GLE) in a periodic or a confining potential are studied in this paper. We consider a class of quasi-Markovian GLEs, similar to the model that was introduced in \cite{EPR99}. Geometric ergodicity, a homogenization theorem (invariance principle), short time asymptotics and the white noise limit are studied. Our proofs are based on a careful analysis of a hypoelliptic operator which is the generator of an auxiliary Markov process. Systematic use of the recently developed theory of hypocoercivity \cite{Vil04HPI} is made.

Submitted 22 March, 2010; originally announced March 2010.

Comments: 27 pages, no figures. Submitted to Nonlinearity.

Showing 1–26 of 26 results for author: Ottobre, M