Search | arXiv e-print repository

Particle Semi-Implicit Variational Inference

Abstract: Semi-implicit variational inference (SIVI) enriches the expressiveness of variational families by utilizing a kernel and a mixing distribution to hierarchically define the variational distribution. Existing SIVI methods parameterize the mixing distribution using implicit distributions, leading to intractable variational densities. As a result, directly maximizing the evidence lower bound (ELBO) is not possible, so they resort to one of the following: optimizing bounds on the ELBO, employing costly inner-loop Markov chain Monte Carlo runs, or solving minimax objectives. In this paper, we propose a novel method for SIVI called Particle Variational Inference (PVI) which employs empirical measures to approximate the optimal mixing distributions characterized as the minimizer of a natural free energy functional via a particle approximation of a Euclidean--Wasserstein gradient flow. This approach means that, unlike prior works, PVI can directly optimize the ELBO; furthermore, it makes no parametric assumption about the mixing distribution. Our empirical results demonstrate that PVI performs favourably against other SIVI methods across various tasks. Moreover, we provide a theoretical analysis of the behaviour of the gradient flow of a related free energy functional: establishing the existence and uniqueness of solutions as well as propagation of chaos results.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2406.16465 [pdf, ps, other]

Genealogical processes of non-neutral population models under rapid mutation

Authors: Jere Koskela, Paul A. Jenkins, Adam M. Johansen, Dario Spano

Abstract: We show that genealogical trees arising from a broad class of non-neutral models of population evolution converge to the Kingman coalescent under a suitable rescaling of time. As well as non-neutral biological evolution, our results apply to genetic algorithms encompassing the prominent class of sequential Monte Carlo (SMC) methods. The time rescaling we need differs slightly from that used in classical results for convergence to the Kingman coalescent, which has implications for the performance of different resampling schemes in SMC algorithms. In addition, our work substantially simplifies earlier proofs of convergence to the Kingman coalescent, and corrects an error common to several earlier results.

Submitted 24 June, 2024; originally announced June 2024.

MSC Class: 60J90; 65C35; 92D15

arXiv:2403.02004 [pdf, ps, other]

Error bounds for particle gradient descent, and extensions of the log-Sobolev and Talagrand inequalities

Authors: Rocco Caprio, Juan Kuntz, Samuel Power, Adam M. Johansen

Abstract: We prove non-asymptotic error bounds for particle gradient descent (PGD) (Kuntz et al., 2023), a recently introduced algorithm for maximum likelihood estimation of large latent variable models obtained by discretizing a gradient flow of the free energy. We begin by showing that, for models satisfying a condition generalizing both the log-Sobolev and the Polyak--Łojasiewicz inequalities (LSI and PŁI, respectively), the flow converges exponentially fast to the set of minimizers of the free energy. We achieve this by extending a result well-known in the optimal transport literature (that the LSI implies the Talagrand inequality) and its counterpart in the optimization literature (that the PŁI implies the so-called quadratic growth condition), and applying it to our new setting. We also generalize the Bakry--Émery Theorem and show that the LSI/PŁI generalization holds for models with strongly concave log-likelihoods. For such models, we further control PGD's discretization error, obtaining non-asymptotic error bounds. While we are motivated by the study of PGD, we believe that the inequalities and results we extend may be of independent interest.

Submitted 11 April, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

arXiv:2312.07335 [pdf, other]

Momentum Particle Maximum Likelihood

Authors: Jen Ning Lim, Juan Kuntz, Samuel Power, Adam M. Johansen

Abstract: Maximum likelihood estimation (MLE) of latent variable models is often recast as the minimization of a free energy functional over an extended space of parameters and probability distributions. This perspective was recently combined with insights from optimal transport to obtain novel particle-based algorithms for fitting latent variable models to data. Drawing inspiration from prior works which interpret 'momentum-enriched' optimization algorithms as discretizations of ordinary differential equations, we propose an analogous dynamical-systems-inspired approach to minimizing the free energy functional. The result is a dynamical system that blends elements of Nesterov's Accelerated Gradient method, the underdamped Langevin diffusion, and particle methods. Under suitable assumptions, we prove that the continuous-time system minimizes the functional. By discretizing the system, we obtain a practical algorithm for MLE in latent variable models. The algorithm outperforms existing particle methods in numerical experiments and compares favourably with other MLE algorithms.

Submitted 4 June, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

Comments: ICML 2024 camera ready

arXiv:2310.03853 [pdf, ps, other]

A calculus for Markov chain Monte Carlo: studying approximations in algorithms

Authors: Rocco Caprio, Adam M. Johansen

Abstract: Markov chain Monte Carlo (MCMC) algorithms are based on the construction of a Markov chain with transition probabilities $P_μ(x,\cdot)$, where $μ$ indicates an invariant distribution of interest. In this work, we look at these transition probabilities as functions of their invariant distributions, and we develop a notion of derivative in the invariant distribution of an MCMC kernel. We build around this concept a set of tools that we refer to as Markov Chain Monte Carlo Calculus. This allows us to compare Markov chains with different invariant distributions within a suitable class via what we refer to as mean value inequalities. We explain how MCMC Calculus provides a natural framework to study algorithms using an approximation of an invariant distribution, also illustrating how it suggests practical guidelines for the efficiency of MCMC algorithms. We conclude this work by showing how the tools developed can be applied to prove convergence of interacting and sequential MCMC algorithms, which arise in the context of particle filtering.

Submitted 5 October, 2023; originally announced October 2023.

arXiv:2303.03498 [pdf, ps, other]

Properties of Marginal Sequential Monte Carlo Methods

Authors: Francesca R. Crucinio, Adam M. Johansen

Abstract: We provide a framework which admits a number of 'marginal' sequential Monte Carlo (SMC) algorithms as particular cases -- including the marginal particle filter [Klaas et al., 2005, in: Proceedings of Uncertainty in Artificial Intelligence, pp. 308--315], the independent particle filter [Lin et al., 2005, Journal of the American Statistical Association 100, pp. 1412--1421] and linear-cost Approximate Bayesian Computation SMC [Sisson et al., 2007, Proceedings of the National Academy of Sciences (USA) 104, pp. 1760--1765]. We provide conditions under which such algorithms obey laws of large numbers and central limit theorems and provide some further asymptotic characterizations. Finally, it is shown that the asymptotic variance of a class of estimators associated with certain marginal SMC algorithms is never greater than that of the estimators provided by a standard SMC algorithm using the same proposal distributions.

Submitted 6 March, 2023; originally announced March 2023.

arXiv:2212.12205 [pdf, other]

Cost free hyper-parameter selection/averaging for Bayesian inverse problems with vanilla and Rao-Blackwellized SMC Samplers

Authors: Alessandro Viani, Adam M Johansen, Alberto Sorrentino

Abstract: In Bayesian inverse problems, one aims at characterizing the posterior distribution of a set of unknowns, given indirect measurements. For non-linear/non-Gaussian problems, analytic solutions are seldom available: Sequential Monte Carlo samplers offer a powerful tool for approximating complex posteriors, by constructing an auxiliary sequence of densities that smoothly reaches the posterior. Often the posterior depends on a scalar hyper-parameter. In this work, we show that properly designed Sequential Monte Carlo (SMC) samplers naturally provide an approximation of the marginal likelihood associated with this hyper-parameter for free, i.e. at a negligible additional computational cost. The proposed method proceeds by constructing the auxiliary sequence of distributions in such a way that each of them can be interpreted as a posterior distribution corresponding to a different value of the hyper-parameter. This can be exploited to perform selection of the hyper-parameter in Empirical Bayes approaches, as well as averaging across values of the hyper-parameter according to some hyper-prior distribution in Fully Bayesian (FB) approaches. For FB approaches, the proposed method has the further benefit of allowing prior sensitivity analysis at a negligible computational cost. In addition, the proposed method exploits particles at all the (relevant) iterations, thus alleviating one of the known limitations of SMC samplers, i.e. the fact that all samples at intermediate iterations are typically discarded.
We show numerical results for two distinct cases where the hyper-parameter affects only the likelihood: a toy example, where an SMC sampler is used to approximate the full posterior distribution; and a brain imaging example, where a Rao-Blackwellized SMC sampler is used to approximate the posterior distribution of a subset of parameters in a conditionally linear Gaussian model.

Submitted 23 December, 2022; originally announced December 2022.

Comments: 17 pages, 9 figures

MSC Class: 62F15; 62L20; 62P10; 62M05

ACM Class: G.1; G.3

arXiv:2211.14201 [pdf, other]

A divide and conquer sequential Monte Carlo approach to high dimensional filtering

Authors: Francesca R. Crucinio, Adam M. Johansen

Abstract: We propose a divide-and-conquer approach to filtering which decomposes the state variable into low-dimensional components to which standard particle filtering tools can be successfully applied and recursively merges them to recover the full filtering distribution. It is less dependent upon factorization of transition densities and observation likelihoods than competing approaches and can be applied to a broader class of models. Performance is compared with state-of-the-art methods on a benchmark problem and it is demonstrated that the proposed method is broadly comparable in settings in which those methods are applicable, and that it can be applied in settings in which they cannot.

Submitted 25 November, 2022; originally announced November 2022.

arXiv:2209.09936 [pdf, other]

Solving Fredholm Integral Equations of the First Kind via Wasserstein Gradient Flows

Authors: Francesca R. Crucinio, Valentin De Bortoli, Arnaud Doucet, Adam M. Johansen

Abstract: Solving Fredholm equations of the first kind is crucial in many areas of the applied sciences. In this work we adopt a probabilistic and variational point of view by considering a minimization problem in the space of probability measures with an entropic regularization. Contrary to classical approaches which discretize the domain of the solutions, we introduce an algorithm to asymptotically sample from the unique solution of the regularized minimization problem. As a result our estimators do not depend on any underlying grid and have better scalability properties than most existing methods. Our algorithm is based on a particle approximation of the solution of a McKean--Vlasov stochastic differential equation associated with the Wasserstein gradient flow of our variational formulation. We prove the convergence towards a minimizer and provide practical guidelines for its numerical implementation. Finally, our method is compared with other approaches on several examples including density deconvolution and epidemiology.

Submitted 15 May, 2024; v1 submitted 16 September, 2022; originally announced September 2022.

Comments: Accepted for publication in Stochastic Processes and their Applications. In the journal version we erroneously state that convergence to the unregularized functional requires stronger assumptions on the kernel $k$ than those considered here; in fact, this is not the case and one can apply [27, Theorem 4.1] or [82, Theorem 1] to obtain this result under A1 and A3

MSC Class: 65Rxx; 65C35; 65K10; 45B05

arXiv:2204.12965 [pdf, other]

Particle algorithms for maximum likelihood training of latent variable models

Authors: Juan Kuntz, Jen Ning Lim, Adam M. Johansen

Abstract: Neal and Hinton (1998) recast maximum likelihood estimation of any given latent variable model as the minimization of a free energy functional $F$, and the EM algorithm as coordinate descent applied to $F$. Here, we explore alternative ways to optimize the functional. In particular, we identify various gradient flows associated with $F$ and show that their limits coincide with $F$'s stationary points. By discretizing the flows, we obtain practical particle-based algorithms for maximum likelihood estimation in broad classes of latent variable models. The novel algorithms scale to high-dimensional settings and perform well in numerical experiments.

Submitted 18 February, 2023; v1 submitted 27 April, 2022; originally announced April 2022.

arXiv:2110.15782 [pdf, ps, other]

The divide-and-conquer sequential Monte Carlo algorithm: theoretical properties and limit theorems

Authors: Juan Kuntz, Francesca R. Crucinio, Adam M. Johansen

Abstract: We provide a comprehensive characterisation of the theoretical properties of the divide-and-conquer sequential Monte Carlo (DaC-SMC) algorithm. We firmly establish it as a well-founded method by showing that it possesses the same basic properties as conventional sequential Monte Carlo (SMC) algorithms do. In particular, we derive pertinent laws of large numbers, $L^p$ inequalities, and central limit theorems; and we characterize the bias in the normalized estimates produced by the algorithm and argue the absence thereof in the unnormalized ones. We further consider its practical implementation and several interesting variants; obtain expressions for its globally and locally optimal intermediate targets, auxiliary measures, and proposal kernels; and show that, in comparable conditions, DaC-SMC proves more statistically efficient than its direct SMC analogue. We close the paper with a discussion of our results, open questions, and future research directions.

Submitted 30 June, 2023; v1 submitted 29 October, 2021; originally announced October 2021.

MSC Class: 65C05 (Primary) 60F05; 60F15; 62F15; 68W15 (Secondary)

arXiv:2110.07265 [pdf, ps, other]

Divide-and-Conquer Fusion

Authors: Ryan S. Y. Chan, Murray Pollock, Adam M. Johansen, Gareth O. Roberts

Abstract: Combining several (sample approximations of) distributions, which we term sub-posteriors, into a single distribution proportional to their product, is a common challenge. It occurs, for instance, in distributed 'big data' problems, or when working under multi-party privacy constraints. Many existing approaches resort to approximating the individual sub-posteriors for practical necessity, then find either an analytical approximation or sample approximation of the resulting (product-pooled) posterior. The quality of the posterior approximation for these approaches is poor when the sub-posteriors fall outside a narrow range of distributional form, such as being approximately Gaussian. Recently, a Fusion approach has been proposed which finds an exact Monte Carlo approximation of the posterior, circumventing the drawbacks of approximate approaches. Unfortunately, existing Fusion approaches have a number of computational limitations, particularly when unifying a large number of sub-posteriors. In this paper, we generalise the theory underpinning existing Fusion approaches, and embed the resulting methodology within a recursive divide-and-conquer sequential Monte Carlo paradigm. This ultimately leads to a competitive Fusion approach, which is robust to increasing numbers of sub-posteriors.

Submitted 12 July, 2023; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: 73 pages, 14 figures

arXiv:2110.05356 [pdf, ps, other]

Weak Convergence of Non-neutral Genealogies to Kingman's Coalescent

Authors: Suzie Brown, Paul A. Jenkins, Adam M. Johansen, Jere Koskela

Abstract: Interacting particle systems undergoing repeated mutation and selection steps model genetic evolution, and also describe a broad class of sequential Monte Carlo methods. The genealogical tree embedded into the system is important in both applications. Under neutrality, when fitnesses of particles are independent from those of their parents, rescaled genealogies are known to converge to Kingman's coalescent. Recent work has established convergence under non-neutrality, but only for finite-dimensional distributions. We prove weak convergence of non-neutral genealogies on the space of càdlàg paths under standard assumptions, enabling analysis of the whole genealogical tree.

Submitted 19 April, 2023; v1 submitted 11 October, 2021; originally announced October 2021.

Comments: 37 pages, 1 figure

MSC Class: 60J90; 65C35; 92D15

arXiv:2109.08573 [pdf, other]

The Node-wise Pseudo-marginal Method

Authors: Denishrouf Thesingarajah, Adam M. Johansen

Abstract: Motivated by problems from neuroimaging in which existing approaches make use of "mass univariate" analysis which neglects spatial structure entirely, but the full joint modelling of all quantities of interest is computationally infeasible, a novel method for incorporating spatial dependence within a (potentially large) family of model-selection problems is presented. Spatial dependence is encoded via a Markov random field model for which a variant of the pseudo-marginal Markov chain Monte Carlo algorithm is developed and extended by a further augmentation of the underlying state space. This approach allows the exploitation of existing unbiased marginal likelihood estimators used in settings in which spatial independence is normally assumed, thereby facilitating the incorporation of spatial dependence using non-spatial estimates with minimal additional development effort. The proposed algorithm can be realistically used for analysis of moderately sized data sets such as 2D slices of whole 3D dynamic PET brain images or other regions of interest. Principled approximations of the proposed method, together with simple extensions based on the augmented spaces, are investigated and shown to provide similar results to the full pseudo-marginal method. Such approximations and extensions allow the improved performance obtained by incorporating spatial dependence to be obtained at negligible additional cost.
An application to measured PET image data shows notable improvements in revealing underlying spatial structure when compared to current methods that assume spatial independence.

Submitted 17 April, 2022; v1 submitted 17 September, 2021; originally announced September 2021.

Comments: 37 pages, 17 figures, 1 table

arXiv:2102.11575 [pdf, other]

Product-form estimators: exploiting independence to scale up Monte Carlo

Authors: Juan Kuntz, Francesca R. Crucinio, Adam M. Johansen

Abstract: We introduce a class of Monte Carlo estimators that aim to overcome the rapid growth of variance with dimension often observed for standard estimators by exploiting the target's independence structure. We identify the most basic incarnations of these estimators with a class of generalized U-statistics, and thus establish their unbiasedness, consistency, and asymptotic normality. Moreover, we show that they obtain the minimum possible variance amongst a broad class of estimators; and we investigate their computational cost and delineate the settings in which they are most efficient. We exemplify the merger of these estimators with other well-known Monte Carlo estimators so as to better adapt the latter to the target's independence structure and improve their performance. We do this via three simple mergers: one with importance sampling, another with importance sampling squared, and a final one with pseudo-marginal Metropolis-Hastings. In all cases, we show that the resulting estimators are well-founded and achieve lower variances than their standard counterparts. Lastly, we illustrate the various variance reductions through several examples.

Submitted 1 November, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

arXiv:2102.08057 [pdf]

doi 10.1007/s11009-021-09886-2

Unbiased simulation of rare events in continuous time

Authors: James Hodgson, Adam M. Johansen, Murray Pollock

Abstract: For rare events described in terms of Markov processes, truly unbiased estimation of the rare event probability generally requires the avoidance of numerical approximations of the Markov process. Recent work in the exact and $\varepsilon$-strong simulation of diffusions, which can be used to almost surely constrain sample paths to a given tolerance, suggests one way to do this. We specify how such algorithms can be combined with the classical multilevel splitting method for rare event simulation. This provides unbiased estimations of the probability in question. We discuss the practical feasibility of the algorithm with reference to existing $\varepsilon$-strong methods and provide proof-of-concept numerical examples.

Submitted 5 November, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

Comments: 25 pages, 6 figures

arXiv:2009.09974 [pdf, other]

A Particle Method for Solving Fredholm Equations of the First Kind

Authors: Francesca R Crucinio, Arnaud Doucet, Adam M Johansen

Abstract: Fredholm integral equations of the first kind are the prototypical example of ill-posed linear inverse problems. They model, among other things, reconstruction of distorted noisy observations and indirect density estimation and also appear in instrumental variable regression. However, their numerical solution remains a challenging problem. Many techniques currently available require a preliminary discretization of the domain of the solution and make strong assumptions about its regularity. For example, the popular expectation maximization smoothing (EMS) scheme requires the assumption of piecewise constant solutions which is inappropriate for most applications. We propose here a novel particle method that circumvents these two issues. This algorithm can be thought of as a Monte Carlo approximation of the EMS scheme which not only performs an adaptive stochastic discretization of the domain but also results in smooth approximate solutions. We analyze the theoretical properties of the EMS iteration and of the corresponding particle algorithm. Compared to standard EMS, we show experimentally that our novel particle method provides state-of-the-art performance for realistic systems, including motion deblurring and reconstruction of cross-section images of the brain from positron emission tomography.

Submitted 23 April, 2021; v1 submitted 21 September, 2020; originally announced September 2020.

arXiv:2007.00096 [pdf, ps, other]

Simple conditions for convergence of sequential Monte Carlo genealogies with applications

Authors: Suzie Brown, Paul A. Jenkins, Adam M. Johansen, Jere Koskela

Abstract: We present simple conditions under which the limiting genealogical process associated with a class of interacting particle systems with non-neutral selection mechanisms, as the number of particles grows, is a time-rescaled Kingman coalescent. Sequential Monte Carlo algorithms are popular methods for approximating integrals in problems such as non-linear filtering and smoothing which employ this type of particle system. Their performance depends strongly on the properties of the induced genealogical process. We verify the conditions of our main result for standard sequential Monte Carlo algorithms with a broad class of low-variance resampling schemes, as well as for conditional sequential Monte Carlo with multinomial resampling.

Submitted 7 December, 2020; v1 submitted 30 June, 2020; originally announced July 2020.

Comments: 22 pages, 1 figure

MSC Class: 60J90; 60J95; 65C05; 65C35

arXiv:2002.09998 [pdf, other]

Generalized Bayesian Filtering via Sequential Monte Carlo

Authors: Ayman Boustati, Ömer Deniz Akyildiz, Theodoros Damoulas, Adam M. Johansen

Abstract: We introduce a framework for inference in general state-space hidden Markov models (HMMs) under likelihood misspecification. In particular, we leverage the loss-theoretic perspective of Generalized Bayesian Inference (GBI) to define generalised filtering recursions in HMMs, that can tackle the problem of inference under model misspecification. In doing so, we arrive at principled procedures for robust inference against observation contamination by utilising the $β$-divergence. Operationalising the proposed framework is made possible via sequential Monte Carlo methods (SMC), where most standard particle methods, and their associated convergence results, are readily adapted to the new setting. We apply our approach to object tracking and Gaussian process regression problems, and observe improved performance over both standard filtering algorithms and other robust filters.

Submitted 21 October, 2020; v1 submitted 23 February, 2020; originally announced February 2020.

arXiv:1902.00509 [pdf, other]

doi 10.1016/j.spa.2021.04.007

Limit theorems for cloning algorithms

Authors: Letizia Angeli, Stefan Grosskinsky, Adam M. Johansen

Abstract: Large deviations for additive path functionals of stochastic processes have attracted significant research interest, in particular in the context of stochastic particle systems and statistical physics. Efficient numerical `cloning' algorithms have been developed to estimate the scaled cumulant generating function, based on importance sampling via cloning of rare event trajectories. So far, attempts to study the convergence properties of these algorithms in continuous time have only led to partial results for particular cases. Adapting previous results from the literature of particle filters and sequential Monte Carlo methods, we establish a first comprehensive and fully rigorous approach to bound systematic and random errors of cloning algorithms in continuous time. To this end we develop a method to compare different algorithms for particular classes of observables, based on the martingale characterization of stochastic processes. Our results apply to a large class of jump processes on compact state space, and do not involve any time discretization in contrast to previous approaches. This provides a robust and rigorous framework that can also be used to evaluate and improve the efficiency of algorithms.

Submitted 14 April, 2021; v1 submitted 1 February, 2019; originally announced February 2019.

Comments: 42 pages

MSC Class: 65C35; 60F25; 62L20 (Primary) 60F10; 60J75; 60K3 (Secondary)

Journal ref: Stoch. Proc. Appl. 138, 117-152 (2021)

arXiv:1810.00693 [pdf, other]

doi 10.1007/s10955-019-02340-1

Rare event simulation for stochastic dynamics in continuous time

Authors: Letizia Angeli, Stefan Grosskinsky, Adam M. Johansen, Andrea Pizzoferrato

Abstract: Large deviations for additive path functionals of stochastic dynamics and related numerical approaches have attracted significant recent research interest. We focus on the question of convergence properties for cloning algorithms in continuous time, and establish connections to the literature of particle filters and sequential Monte Carlo methods. This enables us to derive rigorous convergence bounds for cloning algorithms which we report in this paper, with details of proofs given in a further publication. The tilted generator characterizing the large deviation rate function can be associated to non-linear processes which give rise to several representations of the dynamics and additional freedom for associated numerical approximations. We discuss these choices in detail, and combine insights from the filtering literature and cloning algorithms to compare different approaches and improve efficiency.

Submitted 10 June, 2019; v1 submitted 1 October, 2018; originally announced October 2018.

Comments: 33 pages, 3 figures

Journal ref: J. Stat. Phys. 176(5), 1185-1210 (2019)

arXiv:1807.09288 [pdf, other]

Global consensus Monte Carlo

Authors: Lewis J. Rendell, Adam M. Johansen, Anthony Lee, Nick Whiteley

Abstract: To conduct Bayesian inference with large data sets, it is often convenient or necessary to distribute the data across multiple machines. We consider a likelihood function expressed as a product of terms, each associated with a subset of the data. Inspired by global variable consensus optimisation, we introduce an instrumental hierarchical model associating auxiliary statistical parameters with each term, which are conditionally independent given the top-level parameters. One of these top-level parameters controls the unconditional strength of association between the auxiliary parameters. This model leads to a distributed MCMC algorithm on an extended state space yielding approximations of posterior expectations. A trade-off between computational tractability and fidelity to the original model can be controlled by changing the association strength in the instrumental model. We further propose the use of a SMC sampler with a sequence of association strengths, allowing both the automatic determination of appropriate strengths and for a bias correction technique to be applied. In contrast to similar distributed Monte Carlo algorithms, this approach requires few distributional assumptions. The performance of the algorithms is illustrated with a number of simulated examples.

Submitted 7 April, 2020; v1 submitted 24 July, 2018; originally announced July 2018.

arXiv:1807.01057 [pdf, other]

doi 10.1017/apr.2020.9

Limit theorems for sequential MCMC methods

Authors: Axel Finke, Arnaud Doucet, Adam M. Johansen

Abstract: Sequential Monte Carlo (SMC) methods, also known as particle filters, constitute a class of algorithms used to approximate expectations with respect to a sequence of probability distributions as well as the normalising constants of those distributions. Sequential MCMC methods are an alternative class of techniques addressing similar problems in which particles are sampled according to an MCMC kernel rather than conditionally independently at each time step. These methods were introduced over twenty years ago by Berzuini et al. (1997). Recently, there has been a renewed interest in such algorithms as they demonstrate an empirical performance superior to that of SMC methods in some applications. We establish a strong law of large numbers and a central limit theorem for sequential MCMC methods and provide conditions under which errors can be controlled uniformly in time. In the context of state-space models, we provide conditions under which sequential MCMC methods can indeed outperform standard SMC methods in terms of asymptotic variance of the corresponding Monte Carlo estimators.

Submitted 25 July, 2018; v1 submitted 3 July, 2018; originally announced July 2018.

Journal ref: Adv. Appl. Probab. 52 (2020) 377-403

arXiv:1805.03924 [pdf, other]

Unbiased and Consistent Nested Sampling via Sequential Monte Carlo

Authors: Robert Salomone, Leah F. South, Adam M. Johansen, Christopher Drovandi, Dirk P. Kroese

Abstract: We introduce a new class of sequential Monte Carlo methods called nested sampling via sequential Monte Carlo (NS-SMC), which reformulates the essence of the nested sampling method of Skilling (2006) in terms of sequential Monte Carlo techniques. This new framework allows convergence results to be obtained in the setting when Markov chain Monte Carlo (MCMC) is used to produce new samples. An additional benefit is that marginal likelihood (normalizing constant) estimates are unbiased. In contrast to NS, the analysis of NS-SMC does not require the (unrealistic) assumption that the simulated samples be independent. We show that a minor adjustment to our adaptive NS-SMC algorithm recovers the original NS algorithm, which provides insights as to why NS seems to produce accurate estimates despite a typical violation of its assumptions. A numerical study is conducted where the performance of NS-SMC and temperature-annealed SMC is compared on challenging problems. Code for the experiments is made available online at https://github.com/LeahPrice/SMC-NS .

Submitted 20 December, 2023; v1 submitted 10 May, 2018; originally announced May 2018.

Comments: 21 pages main text, 6 pages supplementary material. Includes proof of consistency for an adaptive nested sampling sequential Monte Carlo algorithm

arXiv:1804.01811 [pdf, other]

doi 10.1214/19-AOS1823

Asymptotic genealogies of interacting particle systems with an application to sequential Monte Carlo

Authors: Jere Koskela, Paul A. Jenkins, Adam M. Johansen, Dario Spano

Abstract: We study weighted particle systems in which new generations are resampled from current particles with probabilities proportional to their weights. This covers a broad class of sequential Monte Carlo (SMC) methods, widely-used in applied statistics and cognate disciplines. We consider the genealogical tree embedded into such particle systems, and identify conditions, as well as an appropriate time-scaling, under which they converge to the Kingman n-coalescent in the infinite system size limit in the sense of finite-dimensional distributions. Thus, the tractable n-coalescent can be used to predict the shape and size of SMC genealogies, as we illustrate by characterising the limiting mean and variance of the tree height. SMC genealogies are known to be connected to algorithm performance, so that our results are likely to have applications in the design of new methods as well. Our conditions for convergence are strong, but we show by simulation that they do not appear to be necessary.

Submitted 16 July, 2021; v1 submitted 5 April, 2018; originally announced April 2018.

Comments: 28 pages, 1 figure. An earlier version of this manuscript contained an error, which we have been able to correct and in so doing give a stronger result under cleaner conditions. v7: Added several technical lemmas which make the overall argument more explicit

MSC Class: Primary 60E15; secondary 60G99; 62E20

Journal ref: Annals of Statistics 48(1):560-583, 2020

arXiv:1610.08962 [pdf, other]

On embedded hidden Markov models and particle Markov chain Monte Carlo methods

Authors: Axel Finke, Arnaud Doucet, Adam M. Johansen

Abstract: The embedded hidden Markov model (EHMM) sampling method is a Markov chain Monte Carlo (MCMC) technique for state inference in non-linear non-Gaussian state-space models which was proposed in Neal (2003); Neal et al. (2004) and extended in Shestopaloff and Neal (2016). An extension to Bayesian parameter inference was presented in Shestopaloff and Neal (2013). An alternative class of MCMC schemes addressing similar inference problems is provided by particle MCMC (PMCMC) methods (Andrieu et al. 2009; 2010). All these methods rely on the introduction of artificial extended target distributions for multiple state sequences which, by construction, are such that one randomly indexed sequence is distributed according to the posterior of interest. By adapting the Metropolis-Hastings algorithms developed in the framework of PMCMC methods to the EHMM framework, we obtain novel particle filter (PF)-type algorithms for state inference and novel MCMC schemes for parameter and state inference. In addition, we show that most of these algorithms can be viewed as particular cases of a general PF and PMCMC framework. We compare the empirical performance of the various algorithms on low- to high-dimensional state-space models. We demonstrate that a properly tuned conditional PF with "local" MCMC moves proposed in Shestopaloff and Neal (2016) can outperform the standard conditional PF significantly when applied to high-dimensional state-space models while the novel PF-type algorithm could prove to be an interesting alternative to standard PFs for likelihood estimation in some lower-dimensional scenarios.

Submitted 27 October, 2016; originally announced October 2016.

Comments: 23 pages, 7 figures

arXiv:1609.03436 [pdf, other]

Quasi-stationary Monte Carlo and the ScaLE Algorithm

Authors: Murray Pollock, Paul Fearnhead, Adam M. Johansen, Gareth O. Roberts

Abstract: This paper introduces a class of Monte Carlo algorithms which are based upon the simulation of a Markov process whose quasi-stationary distribution coincides with a distribution of interest. This differs fundamentally from, say, current Markov chain Monte Carlo methods which simulate a Markov chain whose stationary distribution is the target. We show how to approximate distributions of interest by carefully combining sequential Monte Carlo methods with methodology for the exact simulation of diffusions. The methodology introduced here is particularly promising in that it is applicable to the same class of problems as gradient based Markov chain Monte Carlo algorithms but entirely circumvents the need to conduct Metropolis-Hastings type accept/reject steps whilst retaining exactness: the paper gives theoretical guarantees ensuring the algorithm has the correct limiting target distribution. Furthermore, this methodology is highly amenable to big data problems. By employing a modification to existing naïve sub-sampling and control variate techniques it is possible to obtain an algorithm which is still exact but has sub-linear iterative cost as a function of data size.

Submitted 13 April, 2020; v1 submitted 12 September, 2016; originally announced September 2016.

Comments: Substantially revised with clearer presentation and more extensive simulation study. 59 pages, 6 figures

arXiv:1511.06286 [pdf, other]

The iterated auxiliary particle filter

Authors: Pieralberto Guarniero, Adam M. Johansen, Anthony Lee

Abstract: We present an offline, iterated particle filter to facilitate statistical inference in general state space hidden Markov models. Given a model and a sequence of observations, the associated marginal likelihood L is central to likelihood-based inference for unknown statistical parameters. We define a class of "twisted" models: each member is specified by a sequence of positive functions psi and has an associated psi-auxiliary particle filter that provides unbiased estimates of L. We identify a sequence psi* that is optimal in the sense that the psi*-auxiliary particle filter's estimate of L has zero variance. In practical applications, psi* is unknown so the psi*-auxiliary particle filter cannot straightforwardly be implemented. We use an iterative scheme to approximate psi*, and demonstrate empirically that the resulting iterated auxiliary particle filter significantly outperforms the bootstrap particle filter in challenging settings. Applications include parameter estimation using a particle Markov chain Monte Carlo algorithm.

Submitted 15 June, 2016; v1 submitted 19 November, 2015; originally announced November 2015.

arXiv:1506.08450 [pdf, ps, other]

Pointwise Convergence in Probability of General Smoothing Splines

Authors: Matthew Thorpe, Adam M. Johansen

Abstract: Establishing the convergence of splines can be cast as a variational problem which is amenable to a $Γ$-convergence approach. We consider the case in which the regularization coefficient scales with the number of observations, $n$, as $λ_n=n^{-p}$. Using standard theorems from the $Γ$-convergence literature, we prove that the general spline model is consistent in that estimators converge in a sense slightly weaker than weak convergence in probability for $p\leq \frac{1}{2}$. Without further assumptions we show this rate is sharp. This differs from rates for strong convergence using Hilbert scales where one can often choose $p>\frac{1}{2}$.

Submitted 11 March, 2017; v1 submitted 28 June, 2015; originally announced June 2015.

arXiv:1506.02676 [pdf, ps, other]

Convergence and Rates for Fixed-Interval Multiple-Track Smoothing Using $k$-Means Type Optimization

Authors: Matthew Thorpe, Adam M. Johansen

Abstract: We address the task of estimating multiple trajectories from unlabeled data. This problem arises in many settings, one could think of the construction of maps of transport networks from passive observation of travellers, or the reconstruction of the behaviour of uncooperative vehicles from external observations, for example. There are two coupled problems. The first is a data association problem: how to map data points onto individual trajectories. The second is, given a solution to the data association problem, to estimate those trajectories. We construct estimators as a solution to a regularized variational problem (to which approximate solutions can be obtained via the simple, efficient and widespread $k$-means method) and show that, as the number of data points, $n$, increases, these estimators exhibit stable behaviour. More precisely, we show that they converge in an appropriate Sobolev space in probability and with rate $n^{-1/2}$.

Submitted 3 November, 2016; v1 submitted 8 June, 2015; originally announced June 2015.

arXiv:1504.00298 [pdf, other]

doi 10.1007/s11222-016-9629-2

Bayesian model comparison with un-normalised likelihoods

Authors: Richard G. Everitt, Adam M. Johansen, Ellen Rowing, Melina Evdemon-Hogan

Abstract: Models for which the likelihood function can be evaluated only up to a parameter-dependent unknown normalising constant, such as Markov random field models, are used widely in computer science, statistical physics, spatial statistics, and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to the intractability of their likelihood functions. Several methods that permit exact, or close to exact, simulation from the posterior distribution have recently been developed. However, estimating the evidence and Bayes' factors (BFs) for these models remains challenging in general. This paper describes new random weight importance sampling and sequential Monte Carlo methods for estimating BFs that use simulation to circumvent the evaluation of the intractable likelihood, and compares them to existing methods. In some cases we observe an advantage in the use of biased weight estimates. An initial investigation into the theoretical and empirical properties of this class of methods is presented. Some support for the use of biased estimates is presented, but we advocate caution in the use of such estimates.

Submitted 20 January, 2016; v1 submitted 1 April, 2015; originally announced April 2015.

arXiv:1502.01194 [pdf, ps, other]

Discussion of "Sequential Quasi-Monte-Carlo Sampling" by M. Gerber and N. Chopin

Authors: M. Pollock, A. M. Johansen, K. Łatuszyński, G. O. Roberts

Abstract: In this comment we consider whether QMC methods can be further embedded within SMC schemes in settings in which the transition density of the latent process is intractable and pseudo-marginal methods are deployed.

Submitted 4 February, 2015; originally announced February 2015.

Comments: Journal of the Royal Statistical Society B, (In Press)

arXiv:1501.01320 [pdf, ps, other]

Convergence of the $k$-Means Minimization Problem using $Γ$-Convergence

Authors: Matthew Thorpe, Florian Theil, Adam M. Johansen, Neil Cade

Abstract: The $k$-means method is an iterative clustering algorithm which associates each observation with one of $k$ clusters. It traditionally employs cluster centers in the same space as the observed data. By relaxing this requirement, it is possible to apply the $k$-means method to infinite dimensional problems, for example multiple target tracking and smoothing problems in the presence of unknown data association. Via a $Γ$-convergence argument, the associated optimization problem is shown to converge in the sense that both the $k$-means minimum and minimizers converge in the large data limit to quantities which depend upon the observed data only through its distribution. The theory is supplemented with two examples to demonstrate the range of problems now accessible by the $k$-means method. The first example combines a non-parametric smoothing problem with unknown data association. The second addresses tracking using sparse data from a network of passive sensors.

Submitted 3 April, 2015; v1 submitted 6 January, 2015; originally announced January 2015.

arXiv:1406.4993 [pdf, other]

doi 10.1080/10618600.2016.1237363

Divide-and-Conquer with Sequential Monte Carlo

Authors: Fredrik Lindsten, Adam M. Johansen, Christian A. Naesseth, Bonnie Kirkpatrick, Thomas B. Schön, John Aston, Alexandre Bouchard-Côté

Abstract: We propose a novel class of Sequential Monte Carlo (SMC) algorithms, appropriate for inference in probabilistic graphical models. This class of algorithms adopts a divide-and-conquer approach based upon an auxiliary tree-structured decomposition of the model of interest, turning the overall inferential task into a collection of recursively solved sub-problems. The proposed method is applicable to a broad class of probabilistic graphical models, including models with loops. Unlike a standard SMC sampler, the proposed Divide-and-Conquer SMC employs multiple independent populations of weighted particles, which are resampled, merged, and propagated as the method progresses. We illustrate empirically that this approach can outperform standard methods in terms of the accuracy of the posterior expectation and marginal likelihood approximations. Divide-and-Conquer SMC also opens up novel parallel implementation options and the possibility of concentrating the computational effort on the most challenging sub-problems. We demonstrate its performance on a Markov random field and on a hierarchical logistic regression problem.

Submitted 30 June, 2015; v1 submitted 19 June, 2014; originally announced June 2014.

Journal ref: Journal of Computational and Graphical Statistics, 26(2):445-458, 2017

arXiv:1303.3123 [pdf, other]

Towards Automatic Model Comparison: An Adaptive Sequential Monte Carlo Approach

Authors: Yan Zhou, Adam M Johansen, John A D Aston

Abstract: Model comparison for the purposes of selection, averaging and validation is a problem found throughout statistics. Within the Bayesian paradigm, these problems all require the calculation of the posterior probabilities of models within a particular class. Substantial progress has been made in recent years, but difficulties remain in the implementation of existing schemes. This paper presents adaptive sequential Monte Carlo (SMC) sampling strategies to characterise the posterior distribution of a collection of models, as well as the parameters of those models. Both a simple product estimator and a combination of SMC and a path sampling estimator are considered and existing theoretical results are extended to include the path sampling variant. A novel approach to the automatic specification of distributions within SMC algorithms is presented and shown to outperform the state of the art in this area. The performance of the proposed strategies is demonstrated via an extensive empirical study. Comparisons with state of the art algorithms show that the proposed algorithms are always competitive, and often substantially superior to alternative techniques, at equal computational cost and considerably less application-specific implementation effort.

Submitted 5 June, 2015; v1 submitted 13 March, 2013; originally announced March 2013.

Comments: 31 pages; 2 figures

arXiv:1302.6964 [pdf, ps, other]

doi 10.3150/14-BEJ676

On the exact and $\varepsilon$-strong simulation of (jump) diffusions

Authors: Murray Pollock, Adam M. Johansen, Gareth O. Roberts

Abstract: This paper introduces a framework for simulating finite dimensional representations of (jump) diffusion sample paths over finite intervals, without discretisation error (exactly), in such a way that the sample path can be restored at any finite collection of time points. Within this framework we extend existing exact algorithms and introduce novel adaptive approaches. We consider an application of the methodology developed within this paper which allows the simulation of upper and lower bounding processes which almost surely constrain (jump) diffusion sample paths to any specified tolerance. We demonstrate the efficacy of our approach by showing that with finite computation it is possible to determine whether or not sample paths cross various irregular barriers, simulate to any specified tolerance the first hitting time of the irregular barrier and simulate killed diffusion sample paths.

Submitted 9 February, 2016; v1 submitted 27 February, 2013; originally announced February 2013.

Comments: Published at http://dx.doi.org/10.3150/14-BEJ676 in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ676

Journal ref: Bernoulli 2016, Vol. 22, No. 2, 794-856

arXiv:1301.0463 [pdf, ps, other]

A Simple Approach to Maximum Intractable Likelihood Estimation

Authors: F. J. Rubio, Adam M. Johansen

Abstract: Approximate Bayesian Computation (ABC) can be viewed as an analytic approximation of an intractable likelihood coupled with an elementary simulation step. Such a view, combined with a suitable instrumental prior distribution, permits maximum-likelihood (or maximum-a-posteriori) inference to be conducted, approximately, using essentially the same techniques. An elementary approach to this problem which simply obtains a nonparametric approximation of the likelihood surface which is then used as a smooth proxy for the likelihood in a subsequent maximisation step is developed here and the convergence of this class of algorithms is characterised theoretically. The use of non-sufficient summary statistics in this context is considered. Applying the proposed method to four problems demonstrates good performance. The proposed approach provides an alternative for approximating the maximum likelihood estimator (MLE) in complex scenarios.

Submitted 3 January, 2013; originally announced January 2013.

arXiv:1205.6310 [pdf, ps, other]

doi 10.1214/12-AOAS611

Dynamic filtering of static dipoles in magnetoencephalography

Authors: Alberto Sorrentino, Adam M. Johansen, John A. D. Aston, Thomas E. Nichols, Wilfrid S. Kendall

Abstract: We consider the problem of estimating neural activity from measurements of the magnetic fields recorded by magnetoencephalography. We exploit the temporal structure of the problem and model the neural current as a collection of evolving current dipoles, which appear and disappear, but whose locations are constant throughout their lifetime. This fully reflects the physiological interpretation of the model. In order to conduct inference under this proposed model, it was necessary to develop an algorithm based around state-of-the-art sequential Monte Carlo methods employing carefully designed importance distributions. Previous work employed a bootstrap filter and an artificial dynamic structure where dipoles performed a random walk in space, yielding nonphysical artefacts in the reconstructions; such artefacts are not observed when using the proposed model. The algorithm is validated with simulated data, in which it provided an average localisation error which is approximately half that of the bootstrap filter. An application to complex real data derived from a somatosensory experiment is presented. Assessment of model fit via marginal likelihood showed a clear preference for the proposed model and the associated reconstructions show better localisation.

Submitted 6 December, 2013; v1 submitted 29 May, 2012; originally announced May 2012.

Comments: Published at http://dx.doi.org/10.1214/12-AOAS611 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS611

Journal ref: Annals of Applied Statistics 2013, Vol. 7, No. 2, 955-988

arXiv:1011.0834 [pdf, other]

Discussions on "Riemann manifold Langevin and Hamiltonian Monte Carlo methods"

Authors: Simon Barthelme, Magali Beffy, Nicolas Chopin, Arnaud Doucet, Pierre Jacob, Adam M. Johansen, Jean-Michel Marin, Christian P. Robert

Abstract: This is a collection of discussions of "Riemann manifold Langevin and Hamiltonian Monte Carlo methods" by Girolami and Calderhead, to appear in the Journal of the Royal Statistical Society, Series B.

Submitted 3 November, 2010; originally announced November 2010.

Comments: 6 pages, one figure

Showing 1–39 of 39 results for author: Johansen, A M