Search | arXiv e-print repository

Flexible Tails for Normalizing Flows

Authors: Tennessee Hickling, Dennis Prangle

Abstract: Normalizing flows are a flexible class of probability distributions, expressed as transformations of a simple base distribution. A limitation of standard normalizing flows is representing distributions with heavy tails, which arise in applications to both density estimation and variational inference. A popular current solution to this problem is to use a heavy tailed base distribution. Examples include the tail adaptive flow (TAF) methods of Laszkiewicz et al (2022). We argue this can lead to poor performance due to the difficulty of optimising neural networks, such as normalizing flows, under heavy tailed input. This problem is demonstrated in our paper. We propose an alternative: use a Gaussian base distribution and a final transformation layer which can produce heavy tails. We call this approach tail transform flow (TTF). Experimental results show this approach outperforms current methods, especially when the target distribution has large dimension or tail weight.

Submitted 22 June, 2024; originally announced June 2024.

arXiv:2311.00580 [pdf, ps, other]

Flexible Tails for Normalising Flows, with Application to the Modelling of Financial Return Data

Authors: Tennessee Hickling, Dennis Prangle

Abstract: We propose a transformation capable of altering the tail properties of a distribution, motivated by extreme value theory, which can be used as a layer in a normalizing flow to approximate multivariate heavy tailed distributions. We apply this approach to model financial returns, capturing potentially extreme shocks that arise in such data. The trained models can be used directly to generate new synthetic sets of potentially extreme returns.

Submitted 1 November, 2023; originally announced November 2023.

Comments: 7 pages, 1 figure

arXiv:2010.11779 [pdf, other]

Measure Transport with Kernel Stein Discrepancy

Authors: Matthew A. Fisher, Tui Nolan, Matthew M. Graham, Dennis Prangle, Chris J. Oates

Abstract: Measure transport underpins several recent algorithms for posterior approximation in the Bayesian context, wherein a transport map is sought to minimise the Kullback--Leibler divergence (KLD) from the posterior to the approximation. The KLD is a strong mode of convergence, requiring absolute continuity of measures and placing restrictions on which transport maps can be permitted. Here we propose to minimise a kernel Stein discrepancy (KSD) instead, requiring only that the set of transport maps is dense in an $L^2$ sense and demonstrating how this condition can be validated. The consistency of the associated posterior approximation is established and empirical results suggest that KSD is a competitive and more flexible alternative to KLD for measure transport.

Submitted 26 October, 2020; v1 submitted 22 October, 2020; originally announced October 2020.

arXiv:2006.09264 [pdf, other]

Bonsai-Net: One-Shot Neural Architecture Search via Differentiable Pruners

Authors: Rob Geada, Dennis Prangle, Andrew Stephen McGough

Abstract: One-shot Neural Architecture Search (NAS) aims to minimize the computational expense of discovering state-of-the-art models. However, in the past year attention has been drawn to the comparable performance of naive random search across the same search spaces used by leading NAS algorithms. To address this, we explore the effects of drastically relaxing the NAS search space, and we present Bonsai-Net, an efficient one-shot NAS method to explore our relaxed search space. Bonsai-Net is built around a modified differential pruner and can consistently discover state-of-the-art architectures that are significantly better than random search with fewer parameters than other state-of-the-art methods. Additionally, Bonsai-Net performs simultaneous model search and training, dramatically reducing the total time it takes to generate fully-trained models from scratch.

Submitted 4 June, 2021; v1 submitted 12 June, 2020; originally announced June 2020.

Comments: Accepted to CVPR-NAS 2020. https://github.com/RobGeada/bonsai-net-lite

arXiv:1910.03632 [pdf, other]

Distilling Importance Sampling for Likelihood Free Inference

Authors: Dennis Prangle, Cecilia Viscardi

Abstract: Likelihood-free inference involves inferring parameter values given observed data and a simulator model. The simulator is computer code which takes parameters, performs stochastic calculations, and outputs simulated data. In this work, we view the simulator as a function whose inputs are (1) the parameters and (2) a vector of pseudo-random draws. We attempt to infer all these inputs conditional on the observations. This is challenging as the resulting posterior can be high dimensional and involve strong dependence. We approximate the posterior using normalizing flows, a flexible parametric family of densities. Training data is generated by likelihood-free importance sampling with a large bandwidth value epsilon, which makes the target similar to the prior. The training data is "distilled" by using it to train an updated normalizing flow. The process is iterated, using the updated flow as the importance sampling proposal, and slowly reducing epsilon so the target becomes closer to the posterior. Unlike most other likelihood-free methods, we avoid the need to reduce data to low dimensional summary statistics, and hence can achieve more accurate results. We illustrate our method in two challenging examples, on queuing and epidemiology.

Submitted 27 January, 2023; v1 submitted 8 October, 2019; originally announced October 2019.

Comments: This version makes minor edits, in particular adding more details on the final importance sampling step

arXiv:1910.00879 [pdf, other]

The Neural Moving Average Model for Scalable Variational Inference of State Space Models

Authors: Tom Ryder, Dennis Prangle, Andrew Golightly, Isaac Matthews

Abstract: Variational inference has had great success in scaling approximate Bayesian inference to big data by exploiting mini-batch training. To date, however, this strategy has been most applicable to models of independent data. We propose an extension to state space models of time series data based on a novel generative model for latent temporal states: the neural moving average model. This permits a subsequence to be sampled without drawing from the entire distribution, enabling training iterations to use mini-batches of the time series at low computational cost. We illustrate our method on autoregressive, Lotka-Volterra, FitzHugh-Nagumo and stochastic volatility models, achieving accurate parameter estimation in a short time.

Submitted 18 May, 2021; v1 submitted 2 October, 2019; originally announced October 2019.

arXiv:1906.09199 [pdf, other]

Black-Box Inference for Non-Linear Latent Force Models

Authors: Wil O. C. Ward, Tom Ryder, Dennis Prangle, Mauricio A. Álvarez

Abstract: Latent force models are systems whereby there is a mechanistic model describing the dynamics of the system state, with some unknown forcing term that is approximated with a Gaussian process. If such dynamics are non-linear, it can be difficult to estimate the posterior state and forcing term jointly, particularly when there are system parameters that also need estimating. This paper uses black-box variational inference to jointly estimate the posterior, designing a multivariate extension to local inverse autoregressive flows as a flexible approximator of the system. We compare estimates on systems where the posterior is known, demonstrating the effectiveness of the approximation, and apply to problems with non-linear dynamics, multi-output systems and models with non-Gaussian likelihoods.

Submitted 4 November, 2019; v1 submitted 21 June, 2019; originally announced June 2019.

Comments: 13 pages plus references and supplementary

arXiv:1906.02014 [pdf, other]

Ensemble MCMC: Accelerating Pseudo-Marginal MCMC for State Space Models using the Ensemble Kalman Filter

Authors: Christopher Drovandi, Richard G Everitt, Andrew Golightly, Dennis Prangle

Abstract: Particle Markov chain Monte Carlo (pMCMC) is now a popular method for performing Bayesian statistical inference on challenging state space models (SSMs) with unknown static parameters. It uses a particle filter (PF) at each iteration of an MCMC algorithm to unbiasedly estimate the likelihood for a given static parameter value. However, pMCMC can be computationally intensive when a large number of particles in the PF is required, such as when the data is highly informative, the model is misspecified and/or the time series is long. In this paper we exploit the ensemble Kalman filter (EnKF) developed in the data assimilation literature to speed up pMCMC. We replace the unbiased PF likelihood with the biased EnKF likelihood estimate within MCMC to sample over the space of the static parameter. On a wide class of different non-linear SSM models, we demonstrate that our new ensemble MCMC (eMCMC) method can significantly reduce the computational cost whilst maintaining reasonable accuracy. We also propose several extensions of the vanilla eMCMC algorithm to further improve computational efficiency. Computer code to implement our methods on all the examples can be downloaded from https://github.com/cdrovandi/Ensemble-MCMC.

Submitted 16 August, 2019; v1 submitted 5 June, 2019; originally announced June 2019.

Comments: minor edits, more extensive results, added web link to supporting computer code

arXiv:1904.05703 [pdf, other]

Bayesian experimental design without posterior calculations: an adversarial approach

Authors: Dennis Prangle, Sophie Harbisher, Colin S Gillespie

Abstract: Most computational approaches to Bayesian experimental design require making posterior calculations repeatedly for a large number of potential designs and/or simulated datasets. This can be expensive and prohibit scaling up these methods to models with many parameters, or designs with many unknowns to select. We introduce an efficient alternative approach without posterior calculations, based on optimising the expected trace of the Fisher information, as discussed by Walker (2016). We illustrate drawbacks of this approach, including lack of invariance to reparameterisation and encouraging designs in which one parameter combination is inferred accurately but not any others. We show these can be avoided by using an adversarial approach: the experimenter must select their design while a critic attempts to select the least favourable parameterisation. We present theoretical properties of this approach and show it can be used with gradient based optimisation methods to find designs efficiently in practice.

Submitted 17 November, 2021; v1 submitted 11 April, 2019; originally announced April 2019.

Comments: V5 has minor typo corrections and presentational changes

arXiv:1901.04326 [pdf, other]

doi 10.1515/9783110635461-005

Optimality Criteria for Probabilistic Numerical Methods

Authors: Chris. J. Oates, Jon Cockayne, Dennis Prangle, T. J. Sullivan, Mark Girolami

Abstract: It is well understood that Bayesian decision theory and average case analysis are essentially identical. However, if one is interested in performing uncertainty quantification for a numerical task, it can be argued that standard approaches from the decision-theoretic framework are neither appropriate nor sufficient. Instead, we consider a particular optimality criterion from Bayesian experimental design and study its implied optimal information in the numerical context. This information is demonstrated to differ, in general, from the information that would be used in an average-case-optimal numerical method. The explicit connection to Bayesian experimental design suggests several distinct regimes in which optimal probabilistic numerical methods can be developed.

Submitted 10 May, 2019; v1 submitted 14 January, 2019; originally announced January 2019.

Comments: Prepared for the proceedings of the RICAM workshop on Multivariate Algorithms and Information-Based Complexity, November 2018

Journal ref: Multivariate Algorithms and Information-Based Complexity, Radon Series on Computational and Applied Mathematics 27:65--88, 2020

arXiv:1811.08337 [pdf, other]

Black-Box Autoregressive Density Estimation for State-Space Models

Authors: Tom Ryder, Andrew Golightly, A. Stephen McGough, Dennis Prangle

Abstract: State-space models (SSMs) provide a flexible framework for modelling time-series data. Consequently, SSMs are ubiquitously applied in areas such as engineering, econometrics and epidemiology. In this paper we provide a fast approach for approximate Bayesian inference in SSMs using the tools of deep learning and variational inference.

Submitted 21 November, 2018; v1 submitted 20 November, 2018; originally announced November 2018.

Comments: V2

arXiv:1802.03335 [pdf, other]

Black-box Variational Inference for Stochastic Differential Equations

Authors: Thomas Ryder, Andrew Golightly, A. Stephen McGough, Dennis Prangle

Abstract: Parameter inference for stochastic differential equations is challenging due to the presence of a latent diffusion process. Working with an Euler-Maruyama discretisation for the diffusion, we use variational inference to jointly learn the parameters and the diffusion paths. We use a standard mean-field variational approximation of the parameter posterior, and introduce a recurrent neural network to approximate the posterior for the diffusion paths conditional on the parameters. This neural network learns how to provide Gaussian state transitions which bridge between observations in a very similar way to the conditioned diffusion process. The resulting black-box inference method can be applied to any SDE system with light tuning requirements. We illustrate the method on a Lotka-Volterra system and an epidemic model, producing accurate parameter estimates in a few hours.

Submitted 14 May, 2018; v1 submitted 9 February, 2018; originally announced February 2018.

Comments: V3 - revised based on ICML reviewer comments; V2 - added acknowledgements and link to code

arXiv:1710.04382 [pdf, other]

Marginal sequential Monte Carlo for doubly intractable models

Authors: Richard G. Everitt, Dennis Prangle, Philip Maybank, Mark Bell

Abstract: Bayesian inference for models that have an intractable partition function is known as a doubly intractable problem, where standard Monte Carlo methods are not applicable. The past decade has seen the development of auxiliary variable Monte Carlo techniques (Møller et al., 2006; Murray et al., 2006) for tackling this problem; these approaches being members of the more general class of pseudo-marginal, or exact-approximate, Monte Carlo algorithms (Andrieu and Roberts, 2009), which make use of unbiased estimates of intractable posteriors. Everitt et al. (2017) investigated the use of exact-approximate importance sampling (IS) and sequential Monte Carlo (SMC) in doubly intractable problems, but focussed only on SMC algorithms that used data-point tempering. This paper describes SMC samplers that may use alternative sequences of distributions, and describes ways in which likelihood estimates may be improved adaptively as the algorithm progresses, building on ideas from Moores et al. (2015). This approach is compared with a number of alternative algorithms for doubly intractable problems, including approximate Bayesian computation (ABC), which we show is closely related to the method of Møller et al. (2006).

Submitted 12 October, 2017; originally announced October 2017.

arXiv:1706.06889 [pdf, other]

gk: An R Package for the g-and-k and generalised g-and-h Distributions

Authors: Dennis Prangle

Abstract: The g-and-k and (generalised) g-and-h distributions are flexible univariate distributions which can model highly skewed or heavy tailed data through only four parameters: location and scale, and two shape parameters influencing the skewness and kurtosis. These distributions have the unusual property that they are defined through their quantile function (inverse cumulative distribution function) and their density is unavailable in closed form, which makes parameter inference complicated. This paper presents the gk R package to work with these distributions. It provides the usual distribution functions and several algorithms for inference of independent identically distributed data, including the finite difference stochastic approximation method, which has not been used before for this problem.

Submitted 21 June, 2017; originally announced June 2017.
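
Illustrative note on the entry above: the abstract says the g-and-k distribution is defined through its quantile function but does not write it out. The sketch below uses the commonly quoted parameterisation, Q(p) = A + B (1 + c tanh(g z / 2)) (1 + z^2)^k z with z the standard normal quantile of p and c = 0.8 by convention. This is an assumption about the intended form, written in Python for illustration; it is not the API of the gk R package, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def gk_transform(z, A, B, g, k, c=0.8):
    """Map a standard normal value z to a g-and-k value.

    A: location, B: scale, g: skewness, k: kurtosis (tail weight).
    c = 0.8 is the conventional constant (assumed here).
    """
    skew_term = 1.0 + c * math.tanh(g * z / 2.0)
    tail_term = (1.0 + z * z) ** k
    return A + B * skew_term * tail_term * z


def gk_quantile(p, A, B, g, k, c=0.8):
    """Quantile function: push the normal quantile of p through the transform."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(p)
    return gk_transform(z, A, B, g, k, c)


def gk_sample(n, A, B, g, k, seed=0):
    """Draw n values by transforming standard normal draws."""
    rng = random.Random(seed)
    return [gk_transform(rng.gauss(0.0, 1.0), A, B, g, k) for _ in range(n)]
```

Because sampling only needs the transform, simulation is cheap even though the density has no closed form, which is why simulation-based inference methods are a natural fit for this family.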

arXiv:1704.06374 [pdf, ps, other]

Recalibration: A post-processing method for approximate Bayesian computation

Authors: G. S. Rodrigues, D. Prangle, S. A. Sisson

Abstract: A new recalibration post-processing method is presented to improve the quality of the posterior approximation when using Approximate Bayesian Computation (ABC) algorithms. Recalibration may be used in conjunction with existing post-processing methods, such as regression-adjustments. In addition, this work extends and strengthens the links between ABC and indirect inference algorithms, allowing more extensive use of misspecified auxiliary models in the ABC context. The method is illustrated using simulated examples to demonstrate the effects of recalibration under various conditions, and through an application to an analysis of stereological extremes both with and without the use of auxiliary models. Code to implement recalibration post-processing is available in the R package, abctools.

Submitted 20 April, 2017; originally announced April 2017.

arXiv:1611.02492 [pdf, other]

A rare event approach to high dimensional Approximate Bayesian computation

Authors: Dennis Prangle, Richard G. Everitt, Theodore Kypraios

Abstract: Approximate Bayesian computation (ABC) methods permit approximate inference for intractable likelihoods when it is possible to simulate from the model. However they perform poorly for high dimensional data, and in practice must usually be used in conjunction with dimension reduction methods, resulting in a loss of accuracy which is hard to quantify or control. We propose a new ABC method for high dimensional data based on rare event methods which we refer to as RE-ABC. This uses a latent variable representation of the model. For a given parameter value, we estimate the probability of the rare event that the latent variables correspond to data roughly consistent with the observations. This is performed using sequential Monte Carlo and slice sampling to systematically search the space of latent variables. In contrast standard ABC can be viewed as using a more naive Monte Carlo estimate. We use our rare event probability estimator as a likelihood estimate within the pseudo-marginal Metropolis-Hastings algorithm for parameter inference. We provide asymptotics showing that RE-ABC has a lower computational cost for high dimensional data than standard ABC methods. We also illustrate our approach empirically, on a Gaussian distribution and an application in infectious disease modelling.

Submitted 4 April, 2017; v1 submitted 8 November, 2016; originally announced November 2016.

Comments: Supplementary material at end of pdf

arXiv:1604.08102 [pdf, ps, other]

An ABC interpretation of the multiple auxiliary variable method

Authors: Dennis Prangle, Richard G. Everitt

Abstract: We show that the auxiliary variable method (Møller et al., 2006; Murray et al., 2006) for inference of Markov random fields can be viewed as an approximate Bayesian computation method for likelihood estimation.

Submitted 27 April, 2016; originally announced April 2016.

arXiv:1512.05633 [pdf, ps, other]

Summary Statistics in Approximate Bayesian Computation

Authors: Dennis Prangle

Abstract: This document is due to appear as a chapter of the forthcoming Handbook of Approximate Bayesian Computation (ABC) edited by S. Sisson, Y. Fan, and M. Beaumont. Since the earliest work on ABC, it has been recognised that using summary statistics is essential to produce useful inference results. This is because ABC suffers from a curse of dimensionality effect, whereby using high dimensional inputs causes large approximation errors in the output. It is therefore crucial to find low dimensional summaries which are informative about the parameter inference or model choice task at hand. This chapter reviews the methods which have been proposed to select such summaries, extending the previous review paper of Blum et al. (2013) with recent developments. Related theoretical results on the ABC curse of dimensionality and sufficiency are also discussed.

Submitted 17 December, 2015; originally announced December 2015.

arXiv:1507.00874 [pdf, other]

Adapting the ABC distance function

Authors: Dennis Prangle

Abstract: Approximate Bayesian computation performs approximate inference for models where likelihood computations are expensive or impossible. Instead simulations from the model are performed for various parameter values and accepted if they are close enough to the observations. There has been much progress on deciding which summary statistics of the data should be used to judge closeness, but less work on how to weight them. Typically weights are chosen at the start of the algorithm which normalise the summary statistics to vary on similar scales. However these may not be appropriate in iterative ABC algorithms, where the distribution from which the parameters are proposed is updated. This can substantially alter the resulting distribution of summary statistics, so that different weights are needed for normalisation. This paper presents two iterative ABC algorithms which adaptively update their weights and demonstrates improved results on test applications.

Submitted 15 December, 2015; v1 submitted 3 July, 2015; originally announced July 2015.

Comments: Revised based on referee reports, including addition of a new method (Algorithm 4)
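
Illustrative note on the entry above: the accept-if-close step and the role of summary-statistic weights can be sketched as a single (non-iterative) ABC rejection pass. This is a minimal toy under stated assumptions, not the paper's adaptive algorithms: the Gaussian model, the uniform prior on [-5, 5], the choice of mean and standard deviation as summaries, and all function names are hypothetical. Each summary is divided by its spread across the simulations so that the summaries vary on similar scales.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (assumed): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summaries(data):
    """Two summary statistics: sample mean and standard deviation."""
    return (statistics.fmean(data), statistics.pstdev(data))


def weighted_distance(s, s_obs, scales):
    """Euclidean distance after dividing each summary by its scale."""
    return sum(((a - b) / w) ** 2 for a, b, w in zip(s, s_obs, scales)) ** 0.5


def abc_rejection(observed, n_sims, n_keep, seed=0):
    """Simulate n_sims parameter draws and keep the n_keep closest ones."""
    rng = random.Random(seed)
    s_obs = summaries(observed)
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # draw from the assumed prior
        draws.append((theta, summaries(simulate(theta, len(observed), rng))))
    # Scale for each summary: its standard deviation over all simulations.
    scales = []
    for i in range(len(s_obs)):
        spread = statistics.pstdev([s[i] for _, s in draws])
        scales.append(spread if spread > 0 else 1.0)
    ranked = sorted(draws, key=lambda d: weighted_distance(d[1], s_obs, scales))
    return [theta for theta, _ in ranked[:n_keep]]
```

In an iterative scheme the proposal for theta changes between rounds, so the spread of each summary changes too; recomputing `scales` in every round is the kind of adjustment the abstract motivates.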

arXiv:1501.05144 [pdf, ps, other]

Lazier ABC

Authors: Dennis Prangle

Abstract: ABC algorithms involve a large number of simulations from the model of interest, which can be very computationally costly. This paper summarises the lazy ABC algorithm of Prangle (2015), which reduces the computational demand by abandoning many unpromising simulations before completion. By using a random stopping decision and reweighting the output sample appropriately, the target distribution is the same as for standard ABC. Lazy ABC is also extended here to the case of non-uniform ABC kernels, which is shown to simplify the process of tuning the algorithm effectively.

Submitted 21 January, 2015; originally announced January 2015.

Comments: Presented as contributed paper at "ABC in Montreal" NIPS workshop in December 2014

arXiv:1405.7867 [pdf, other]

Lazy ABC

Authors: Dennis Prangle

Abstract: Approximate Bayesian computation (ABC) performs statistical inference for otherwise intractable probability models by accepting parameter proposals when corresponding simulated datasets are sufficiently close to the observations. Producing the large quantity of simulations needed requires considerable computing time. However, it is often clear before a simulation ends that it is unpromising: it is likely to produce a poor match or require excessive time. This paper proposes lazy ABC, an ABC importance sampling algorithm which saves time by sometimes abandoning such simulations. This makes ABC more scalable to applications where simulation is expensive. By using a random stopping rule and appropriate reweighting step, the target distribution is unchanged from that of standard ABC. Theory and practical methods to tune lazy ABC are presented and illustrated on a simple epidemic model example. They are also demonstrated on the computationally demanding spatial extremes application of Erhardt and Smith (2012), producing efficiency gains, in terms of effective sample size per unit CPU time, of roughly 3 times for a 20 location dataset, and 8 times for 35 locations.

Submitted 4 December, 2014; v1 submitted 30 May, 2014; originally announced May 2014.

Comments: Pre-publication version. Revised to fix typos and update bibliography
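
The stop-and-reweight idea described in the Lazy ABC abstract can be illustrated with a minimal sketch. This is not the author's code: the toy Gaussian model, the uniform prior, the two-stage split of the simulation, and the names `continuation_prob`, `lazy_abc` and `weighted_mean` are all assumptions made for illustration. A simulation that is abandoned early gets weight 0; one that is continued with probability alpha gets its usual ABC weight divided by alpha, so the expected weight matches that of standard ABC.

```python
import random


def continuation_prob(partial_mean, y_obs, floor=0.1):
    """Hypothetical continuation rule: always continue if the partial
    simulation looks promising, otherwise continue with a small
    probability. It must stay strictly positive for the reweighting
    to be valid."""
    return 1.0 if abs(partial_mean - y_obs) < 1.5 else floor


def lazy_abc(n_samples, y_obs, eps, seed=0):
    """Toy lazy ABC importance sampler (proposal = prior).

    Returns a list of (theta, weight) pairs.
    """
    rng = random.Random(seed)
    out = []
    for _ in range(n_samples):
        theta = rng.uniform(-5.0, 5.0)  # assumed prior
        # Stage 1: cheap partial simulation (first 5 of 50 observations).
        first = [rng.gauss(theta, 1.0) for _ in range(5)]
        alpha = continuation_prob(sum(first) / 5, y_obs)
        if rng.random() >= alpha:
            out.append((theta, 0.0))  # abandoned early: weight 0
            continue
        # Stage 2: finish the (notionally expensive) simulation.
        rest = [rng.gauss(theta, 1.0) for _ in range(45)]
        y = (sum(first) + sum(rest)) / 50
        accept = 1.0 if abs(y - y_obs) <= eps else 0.0
        # Dividing by alpha compensates for the chance of stopping.
        out.append((theta, accept / alpha))
    return out


def weighted_mean(samples):
    """Importance-weighted estimate of the posterior mean of theta."""
    total = sum(w for _, w in samples)
    return sum(t * w for t, w in samples) / total
```

For example, `weighted_mean(lazy_abc(20000, y_obs=1.0, eps=0.2))` estimates the ABC posterior mean while skipping the second stage for most proposals that are far from the observation.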

arXiv:1302.5624 [pdf, ps, other]

Semi-automatic selection of summary statistics for ABC model choice

Authors: Dennis Prangle, Paul Fearnhead, Murray P. Cox, Patrick J. Biggs, Nigel P. French

Abstract: A central statistical goal is to choose between alternative explanatory models of data. In many modern applications, such as population genetics, it is not possible to apply standard methods based on evaluating the likelihood functions of the models, as these are numerically intractable. Approximate Bayesian computation (ABC) is a commonly used alternative for such situations. ABC simulates data x for many parameter values under each model, which is compared to the observed data xobs. More weight is placed on models under which S(x) is close to S(xobs), where S maps data to a vector of summary statistics. Previous work has shown the choice of S is crucial to the efficiency and accuracy of ABC. This paper provides a method to select good summary statistics for model choice. It uses a preliminary step, simulating many x values from all models and fitting regressions to this with the model as response. The resulting model weight estimators are used as S in an ABC analysis. Theoretical results are given to justify this as approximating low dimensional sufficient statistics. A substantive application is presented: choosing between competing coalescent models of demographic growth for Campylobacter jejuni in New Zealand using multi-locus sequence typing data.

Submitted 22 February, 2013; originally announced February 2013.

arXiv:1301.3166 [pdf, other]

Diagnostic tools of approximate Bayesian computation using the coverage property

Authors: D. Prangle, M. G. B. Blum, G. Popovic, S. A. Sisson

Abstract: Approximate Bayesian computation (ABC) is an approach for sampling from an approximate posterior distribution in the presence of a computationally intractable likelihood function. A common implementation is based on simulating model, parameter and dataset triples, (m,θ,y), from the prior, and then accepting as samples from the approximate posterior, those pairs (m,θ) for which y, or a summary of y, is "close" to the observed data. Closeness is typically determined through a distance measure and a kernel scale parameter, ε. Appropriate choice of ε is important to producing a good quality approximation. This paper proposes diagnostic tools for the choice of ε based on assessing the coverage property, which asserts that credible intervals have the correct coverage levels. We provide theoretical results on coverage for both model and parameter inference, and adapt these into diagnostics for the ABC context. We re-analyse a study on human demographic history to determine whether the adopted posterior approximation was appropriate. R code implementing the proposed methodology is freely available in the package "abc."

Submitted 14 January, 2013; originally announced January 2013.

Comments: Figures 8-13 are Supplementary Information Figures S1-S6

arXiv:1202.3819 [pdf, ps, other]

doi 10.1214/12-STS406

A Comparative Review of Dimension Reduction Methods in Approximate Bayesian Computation

Authors: M. G. B. Blum, M. A. Nunes, D. Prangle, S. A. Sisson

Abstract: Approximate Bayesian computation (ABC) methods make use of comparisons between simulated and observed summary statistics to overcome the problem of computationally intractable likelihood functions. As the practical implementation of ABC requires computations based on vectors of summary statistics, rather than full data sets, a central question is how to derive low-dimensional summary statistics from the observed data with minimal loss of information. In this article we provide a comprehensive review and comparison of the performance of the principal methods of dimension reduction proposed in the ABC literature. The methods are split into three nonmutually exclusive classes consisting of best subset selection methods, projection techniques and regularization. In addition, we introduce two new methods of dimension reduction. The first is a best subset selection method based on Akaike and Bayesian information criteria, and the second uses ridge regression as a regularization procedure. We illustrate the performance of these dimension reduction techniques through the analysis of three challenging models and data sets.

Submitted 11 June, 2013; v1 submitted 16 February, 2012; originally announced February 2012.

Comments: Published at http://dx.doi.org/10.1214/12-STS406 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-STS-STS406

Journal ref: Statistical Science 2013, Vol. 28, No. 2, 189-208

arXiv:1004.1112 [pdf, ps, other]

Constructing Summary Statistics for Approximate Bayesian Computation: Semi-automatic ABC

Authors: Paul Fearnhead, Dennis Prangle

Abstract: Many modern statistical applications involve inference for complex stochastic models, where it is easy to simulate from the models, but impossible to calculate likelihoods. Approximate Bayesian computation (ABC) is a method of inference for such models. It replaces calculation of the likelihood by a step which involves simulating artificial data for different parameter values, and comparing summary statistics of the simulated data to summary statistics of the observed data. Here we show how to construct appropriate summary statistics for ABC in a semi-automatic manner. We aim for summary statistics which will enable inference about certain parameters of interest to be as accurate as possible. Theoretical results show that optimal summary statistics are the posterior means of the parameters. While these cannot be calculated analytically, we use an extra stage of simulation to estimate how the posterior means vary as a function of the data; and then use these estimates of our summary statistics within ABC. Empirical results show that our approach is a robust method for choosing summary statistics, that can result in substantially more accurate ABC analyses than the ad-hoc choices of summary statistics proposed in the literature. We also demonstrate advantages over two alternative methods of simulation-based inference.

Submitted 13 April, 2011; v1 submitted 7 April, 2010; originally announced April 2010.

Comments: v2: Revised in response to reviewer comments, adding more examples and a method for inference from multiple data sources
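
The two-stage procedure outlined in the semi-automatic ABC abstract (a pilot simulation to estimate how the posterior mean varies with the data, then ABC using that estimate as the summary statistic) can be sketched as follows. This is a simplified illustration, not the authors' implementation: the Gaussian toy model, the uniform prior, the choice of features (intercept, sample mean, sample median), the tolerance, and every function name are assumptions. The regression is ordinary least squares solved by hand so that the block needs only the standard library.

```python
import random
import statistics


def simulate(theta, rng, n=20):
    """Assumed toy model: n Gaussian observations with mean theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def features(y):
    """Hypothetical feature vector f(y) used as regression covariates."""
    return [1.0, statistics.fmean(y), statistics.median(y)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_summary(rng, n_pilot=2000):
    """Pilot stage: regress theta on f(y) by least squares and return
    the fitted function, an estimate of the posterior mean E[theta | y]."""
    xs, thetas = [], []
    for _ in range(n_pilot):
        theta = rng.uniform(-5.0, 5.0)  # assumed prior
        xs.append(features(simulate(theta, rng)))
        thetas.append(theta)
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(xs, thetas)) for i in range(k)]
    beta = solve(xtx, xty)
    return lambda y: sum(b * f for b, f in zip(beta, features(y)))


def abc_rejection(y_obs, summary, rng, n_sims=5000, eps=0.2):
    """ABC stage: keep prior draws whose simulated summary is within
    eps of the observed summary."""
    s_obs = summary(y_obs)
    accepted = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)
        if abs(summary(simulate(theta, rng)) - s_obs) <= eps:
            accepted.append(theta)
    return accepted
```

A typical use is to call `fit_summary` once, then pass the returned function to `abc_rejection` together with the observed data. A linear regression on a handful of features is the simplest choice here; richer feature sets would follow the same pattern.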

Showing 1–25 of 25 results for author: Prangle, D