-
A synthetic likelihood-based Laplace approximation for efficient design of biological processes
Authors:
Mahasen Dehideniya,
Antony M. Overstall,
Chris C. Drovandi,
James M. McGree
Abstract:
Complex models used to describe biological processes in epidemiology and ecology often have computationally intractable or expensive likelihoods. This poses significant challenges in terms of Bayesian inference but more significantly in the design of experiments. Bayesian designs are found by maximising the expectation of a utility function over a design space, and typically this requires sampling from or approximating a large number of posterior distributions. This renders approaches adopted in inference computationally infeasible to implement in design. Consequently, optimal design in such fields has been limited to a small number of dimensions or a restricted range of utility functions. To overcome such limitations, we propose a synthetic likelihood-based Laplace approximation for approximating utility functions for models with intractable likelihoods. As will be seen, the proposed approximation is flexible in that a wide range of utility functions can be considered, and remains computationally efficient in high dimensions. To explore the validity of this approximation, an illustrative example from epidemiology is considered. Then, our approach is used to design experiments with a relatively large number of observations in two motivating applications from epidemiology and ecology.
Submitted 11 March, 2019;
originally announced March 2019.
-
Approximating the Likelihood in Approximate Bayesian Computation
Authors:
Christopher C Drovandi,
Clara Grazian,
Kerrie Mengersen,
Christian Robert
Abstract:
This chapter will appear in the forthcoming Handbook of Approximate Bayesian Computation (2018).
The conceptual and methodological framework that underpins approximate Bayesian computation (ABC) is targeted primarily towards problems in which the likelihood is either challenging or missing. ABC uses a simulation-based non-parametric estimate of the likelihood of a summary statistic and assumes that the generation of data from the model is computationally cheap. This chapter reviews two alternative approaches for estimating the intractable likelihood, with the goal of reducing the necessary model simulations to produce an approximate posterior. The first of these is a Bayesian version of the synthetic likelihood (SL), initially developed by Wood (2010), which uses a multivariate normal approximation to the summary statistic likelihood. Using the parametric approximation as opposed to the non-parametric approximation of ABC, it is possible to reduce the number of model simulations required. The second likelihood approximation method we consider in this chapter is based on the empirical likelihood (EL), which is a non-parametric technique and involves maximising a likelihood constructed empirically under a set of moment constraints. Mengersen et al. (2013) adapt the EL framework so that it can be used to form an approximate posterior for problems where ABC can be applied, that is, for models with intractable likelihoods. However, unlike ABC and the Bayesian SL (BSL), the Bayesian EL (BCel) approach can be used to completely avoid model simulations in some cases. The BSL and BCel methods are illustrated on models of varying complexity.
Submitted 18 March, 2018;
originally announced March 2018.
-
ABC and Indirect Inference
Authors:
Christopher C Drovandi
Abstract:
This chapter will appear in the forthcoming Handbook of Approximate Bayesian Computation (2018).
Indirect inference (II) is a classical likelihood-free approach that pre-dates the main developments of ABC and relies on simulation from a parametric model of interest to determine point estimates of the parameters. It is not surprising then that some likelihood-free Bayesian approaches have harnessed the II literature. This chapter provides an introduction to II and details the connections between ABC and II. A particular focus is placed on the use of an auxiliary model with a tractable likelihood function, an approach commonly adopted in the II literature, to facilitate likelihood-free Bayesian inferences.
Submitted 5 March, 2018;
originally announced March 2018.
-
A Semi-Automatic Method for History Matching using Sequential Monte Carlo
Authors:
Christopher C Drovandi,
David J Nott,
Daniel E Pagendam
Abstract:
The aim of the history matching method is to locate non-implausible regions of the parameter space of complex deterministic or stochastic models by matching model outputs with data. It does this via a series of waves where at each wave an emulator is fitted to a small number of training samples. An implausibility measure is defined which takes into account the closeness of simulated and observed outputs as well as emulator uncertainty. As the waves progress, the emulator becomes more accurate so that training samples are more concentrated on promising regions of the space and poorer parts of the space are rejected with more confidence. Whilst history matching has proved to be useful, existing implementations are not fully automated and some ad hoc choices are made during the process, which involves user intervention and is time-consuming. This occurs especially when the non-implausible region becomes small and it is difficult to sample this space uniformly to generate new training points. In this article we develop a sequential Monte Carlo (SMC) algorithm for implementing history matching that is semi-automated. Our novel SMC approach reveals that the history matching method yields a non-implausible region that can be multi-modal, highly irregular and very difficult to sample uniformly. Our SMC approach offers a much more reliable sampling of the non-implausible space, although it requires additional computation compared with other approaches used in the literature.
Submitted 8 May, 2021; v1 submitted 9 October, 2017;
originally announced October 2017.
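The implausibility measure mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's SMC algorithm; it only shows the commonly used standardised-distance form of implausibility for a scalar output. The function names, the variance terms, and the cutoff of 3 are assumptions made for illustration.

```python
import numpy as np


def implausibility(observed, emulator_mean, emulator_var, obs_var=0.0):
    """Standardised distance between an observation and emulator predictions.

    All argument names are hypothetical. emulator_mean and emulator_var are
    the emulator's predictive mean and variance at each candidate parameter
    value; obs_var is an assumed observation-error variance.
    """
    emulator_mean = np.asarray(emulator_mean, dtype=float)
    emulator_var = np.asarray(emulator_var, dtype=float)
    return np.abs(observed - emulator_mean) / np.sqrt(emulator_var + obs_var)


def non_implausible(observed, emulator_mean, emulator_var, obs_var=0.0, cutoff=3.0):
    """Boolean mask of candidates not yet ruled out (cutoff is an assumption)."""
    return implausibility(observed, emulator_mean, emulator_var, obs_var) < cutoff
```

Candidates whose mask entry is `True` would be kept for the next wave. A larger emulator variance lowers the implausibility, so poorly emulated regions are not rejected prematurely.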
-
ABC model selection for spatial extremes models applied to South Australian maximum temperature data
Authors:
Xing Ju Lee,
Markus Hainy,
James P. McKeone,
Christopher C. Drovandi,
Anthony N. Pettitt
Abstract:
Max-stable processes are a common choice for modelling spatial extreme data as they arise naturally as the infinite-dimensional generalisation of multivariate extreme value theory. Statistical inference for such models is complicated by the intractability of the multivariate density function. Nonparametric, composite likelihood-based, and Bayesian approaches have been proposed to address this difficulty. More recently, a simulation-based approach using approximate Bayesian computation (ABC) has been employed for estimating parameters of max-stable models. ABC algorithms rely on the evaluation of discrepancies between model simulations and the observed data rather than explicit evaluations of computationally expensive or intractable likelihood functions. The use of an ABC method to perform model selection for max-stable models is explored. Three max-stable models are regarded: the extremal-t model with either a Whittle-Matérn or a powered exponential covariance function, and the Brown-Resnick model with power variogram. In addition, the non-extremal Student-t copula model with a Whittle-Matérn or a powered exponential covariance function is also considered. The method is applied to annual maximum temperature data from 25 weather stations dispersed around South Australia.
Submitted 9 August, 2018; v1 submitted 9 October, 2017;
originally announced October 2017.
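The discrepancy-based idea behind ABC described in the abstract above can be sketched with a basic rejection sampler. This is a generic illustration for parameter estimation on a toy model, not the model-selection procedure or the max-stable models of the paper. The function names, the uniform prior, the toy simulator, and the quantile-based tolerance are all assumptions.

```python
import numpy as np


def abc_rejection(observed_summary, prior_sample, simulate_summary,
                  n_draws=5000, keep_quantile=0.01, rng=None):
    """Keep the prior draws whose simulated summaries are closest to the data.

    prior_sample(rng) and simulate_summary(theta, rng) are user-supplied
    callables (hypothetical names). The tolerance is set to the
    keep_quantile-th quantile of the simulated discrepancies.
    """
    rng = np.random.default_rng() if rng is None else rng
    obs = np.atleast_1d(observed_summary)
    thetas = np.array([prior_sample(rng) for _ in range(n_draws)])
    # Euclidean discrepancy between simulated and observed summaries.
    dists = np.array([
        np.linalg.norm(np.atleast_1d(simulate_summary(t, rng)) - obs)
        for t in thetas
    ])
    tolerance = np.quantile(dists, keep_quantile)
    return thetas[dists <= tolerance], tolerance


def toy_prior(rng):
    # Assumed prior: theta ~ Uniform(-5, 5).
    return rng.uniform(-5.0, 5.0)


def toy_simulator(theta, rng):
    # Assumed model: 30 draws from Normal(theta, 1), summarised by their mean.
    return rng.normal(theta, 1.0, size=30).mean()
```

Calling `abc_rejection(2.0, toy_prior, toy_simulator)` returns the accepted parameter draws and the tolerance used. No likelihood is evaluated at any point, only simulations and distances.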
-
Unlocking datasets by calibrating populations of models to data density: a study in atrial electrophysiology
Authors:
Brodie A. J. Lawson,
Christopher C. Drovandi,
Nicole Cusimano,
Pamela Burrage,
Blanca Rodriguez,
Kevin Burrage
Abstract:
The understanding of complex physical or biological systems nearly always requires a characterisation of the variability that underpins these processes. In addition, the data used to calibrate such models may also often exhibit considerable variability. A recent approach to deal with these issues has been to calibrate populations of models (POMs), that is, multiple copies of a single mathematical model but with different parameter values. To date this calibration has been limited to selecting models that produce outputs that fall within the ranges of the dataset, ignoring any trends that might be present in the data. We present here a novel and general methodology for calibrating POMs to the distributions of a set of measured values in a dataset. We demonstrate the benefits of our technique using a dataset from a cardiac atrial electrophysiology study based on the differences in atrial action potential readings between patients exhibiting sinus rhythm (SR) or chronic atrial fibrillation (cAF) and the Courtemanche–Ramirez–Nattel model for human atrial action potentials. Our approach accurately captures the variability inherent in the experimental population, and allows us to identify the differences underlying stratified data as well as the effects of drug block.
Submitted 21 June, 2017;
originally announced June 2017.
-
An approach for finding fully Bayesian optimal designs using normal-based approximations to loss functions
Authors:
Antony M. Overstall,
James M. McGree,
Christopher C. Drovandi
Abstract:
The generation of decision-theoretic Bayesian optimal designs is complicated by the significant computational challenge of minimising an analytically intractable expected loss function over a potentially high-dimensional design space. A new general approach for approximately finding Bayesian optimal designs is proposed which uses computationally efficient normal-based approximations to posterior summaries to aid in approximating the expected loss. This new approach is demonstrated on illustrative, yet challenging, examples including hierarchical models for blocked experiments, and experimental aims of parameter estimation and model discrimination. Where possible, the results of the proposed methodology are compared, both in terms of performance and computing time, to results from using computationally more expensive, but potentially more accurate, Monte Carlo approximations. Moreover, the methodology is also applied to problems where the use of Monte Carlo approximations is computationally infeasible.
Submitted 6 February, 2017; v1 submitted 20 August, 2016;
originally announced August 2016.
-
Variational Bayes with Synthetic Likelihood
Authors:
Victor M-H. Ong,
David J. Nott,
Minh-Ngoc Tran,
Scott A. Sisson,
Christopher C. Drovandi
Abstract:
Synthetic likelihood is an attractive approach to likelihood-free inference when an approximately Gaussian summary statistic for the data, informative for inference about the parameters, is available. The synthetic likelihood method derives an approximate likelihood function from a plug-in normal density estimate for the summary statistic, with plug-in mean and covariance matrix obtained by Monte Carlo simulation from the model. In this article, we develop alternatives to Markov chain Monte Carlo implementations of Bayesian synthetic likelihoods with reduced computational overheads. Our approach uses stochastic gradient variational inference methods for posterior approximation in the synthetic likelihood context, employing unbiased estimates of the log likelihood. We compare the new method with a related likelihood free variational inference technique in the literature, while at the same time improving the implementation of that approach in a number of ways. These new algorithms are feasible to implement in situations which are challenging for conventional approximate Bayesian computation (ABC) methods, in terms of the dimensionality of the parameter and summary statistic.
Submitted 10 August, 2016;
originally announced August 2016.
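The plug-in normal density estimate described in the abstract above can be sketched as follows. This is a minimal illustration of the standard synthetic log-likelihood only; it does not implement the variational method or the unbiased log-likelihood estimates that the paper develops. The function names, the toy simulator, and the default number of simulations are assumptions.

```python
import numpy as np


def synthetic_loglik(theta, observed_summary, simulate_summary, n_sims=200, rng=None):
    """Gaussian log-density of the observed summary under plug-in moments.

    simulate_summary(theta, rng) is a user-supplied callable (hypothetical
    name) returning one simulated summary-statistic vector.
    """
    rng = np.random.default_rng() if rng is None else rng
    sims = np.array([np.atleast_1d(simulate_summary(theta, rng)) for _ in range(n_sims)])
    mu = sims.mean(axis=0)                              # plug-in mean
    sigma = np.atleast_2d(np.cov(sims, rowvar=False))   # plug-in covariance
    diff = np.atleast_1d(observed_summary) - mu
    _, logdet = np.linalg.slogdet(sigma)
    quad = diff @ np.linalg.solve(sigma, diff)
    d = diff.size
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)


def toy_summary(theta, rng):
    # Assumed model: 50 draws from Normal(theta, 1), summarised by the
    # sample mean and sample variance.
    y = rng.normal(theta, 1.0, size=50)
    return np.array([y.mean(), y.var(ddof=1)])
```

Evaluating `synthetic_loglik` at several values of `theta` for a fixed observed summary gives a noisy likelihood surface. Each evaluation costs `n_sims` model simulations, which is the overhead that motivates cheaper alternatives to MCMC.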
-
Using history matching for prior choice
Authors:
Xueou Wang,
David J. Nott,
C. C. Drovandi,
Kerrie Mengersen,
Michael Evans
Abstract:
It can be important in Bayesian analyses of complex models to construct informative prior distributions which reflect knowledge external to the data at hand. Nevertheless, how much prior information an analyst can elicit from an expert will be limited due to constraints of time, cost and other factors. This paper develops effective numerical methods for exploring reasonable choices of a prior distribution from a parametric class, when prior information is specified in the form of some limited constraints on prior predictive distributions, and where these prior predictive distributions are analytically intractable. The methods developed may be thought of as a novel application of the ideas of history matching, a technique developed in the literature on assessment of computer models. We illustrate the approach in the context of logistic regression and sparse signal shrinkage prior distributions for high-dimensional linear models.
Submitted 9 November, 2017; v1 submitted 28 May, 2016;
originally announced May 2016.
-
Bayesian Indirect Inference Using a Parametric Auxiliary Model
Authors:
Christopher C. Drovandi,
Anthony N. Pettitt,
Anthony Lee
Abstract:
Indirect inference (II) is a methodology for estimating the parameters of an intractable (generative) model on the basis of an alternative parametric (auxiliary) model that is both analytically and computationally easier to deal with. Such an approach has been well explored in the classical literature but has received substantially less attention in the Bayesian paradigm. The purpose of this paper is to compare and contrast a collection of what we call parametric Bayesian indirect inference (pBII) methods. One class of pBII methods uses approximate Bayesian computation (referred to here as ABC II) where the summary statistic is formed on the basis of the auxiliary model, using ideas from II. Another approach proposed in the literature, referred to here as parametric Bayesian indirect likelihood (pBIL), uses the auxiliary likelihood as a replacement to the intractable likelihood. We show that pBIL is a fundamentally different approach to ABC II. We devise new theoretical results for pBIL to give extra insights into its behaviour and also its differences with ABC II. Furthermore, we examine in more detail the assumptions required to use each pBII method. The results, insights and comparisons developed in this paper are illustrated on simple examples and two other substantive applications. The first of the substantive examples involves performing inference for complex quantile distributions based on simulated data while the second is for estimating the parameters of a trivariate stochastic process describing the evolution of macroparasites within a host based on real data. We create a novel framework called Bayesian indirect likelihood (BIL) that encompasses pBII as well as general ABC methods so that the connections between the methods can be established.
Submitted 13 May, 2015;
originally announced May 2015.
-
Pre-processing for approximate Bayesian computation in image analysis
Authors:
Matthew T. Moores,
Christopher C. Drovandi,
Kerrie Mengersen,
Christian P. Robert
Abstract:
Most of the existing algorithms for approximate Bayesian computation (ABC) assume that it is feasible to simulate pseudo-data from the model at each iteration. However, the computational cost of these simulations can be prohibitive for high-dimensional data. An important example is the Potts model, which is commonly used in image analysis. Images encountered in real-world applications can have millions of pixels, so scalability is a major concern. We apply ABC with a synthetic likelihood to the hidden Potts model with additive Gaussian noise. Using a pre-processing step, we fit a binding function to model the relationship between the model parameters and the synthetic likelihood parameters. Our numerical experiments demonstrate that the precomputed binding function dramatically improves the scalability of ABC, reducing the average runtime required for model fitting from 71 hours to only 7 minutes. We also illustrate the method by estimating the smoothing parameter for remotely sensed satellite imagery. Without precomputation, Bayesian inference is impractical for datasets of that scale.
Submitted 5 September, 2014; v1 submitted 18 March, 2014;
originally announced March 2014.