-
An Introduction to the Calibration of Computer Models
Authors:
Richard D. Wilkinson,
Christopher W. Lanyon
Abstract:
In the context of computer models, calibration is the process of estimating unknown simulator parameters from observational data. Calibration is variously referred to as model fitting, parameter estimation/inference, an inverse problem, and model tuning. The need for calibration occurs in most areas of science and engineering, and it has been used to estimate hard-to-measure parameters in climate, cardiology, drug therapy response, hydrology, and many other disciplines. Although the statistical method used for calibration can vary substantially, the underlying approach is essentially the same and can be considered abstractly. In this survey, we review the decisions that need to be taken when calibrating a model, and discuss a range of computational methods that can be used to compute Bayesian posterior distributions.
Submitted 13 October, 2023;
originally announced October 2023.
-
Calibrating cardiac electrophysiology models using latent Gaussian processes on atrial manifolds
Authors:
Sam Coveney,
Caroline H Roney,
Cesare Corrado,
Richard D Wilkinson,
Jeremy E Oakley,
Steven A Niederer,
Richard H Clayton
Abstract:
Models of electrical excitation and recovery in the heart have become increasingly detailed, but have yet to be used routinely in the clinical setting to guide personalized intervention in patients. One of the main challenges is calibrating models from the limited measurements that can be made in a patient during a standard clinical procedure. In this work, we propose a novel framework for the probabilistic calibration of electrophysiology parameters on the left atrium of the heart using local measurements of cardiac excitability. Parameter fields are represented as Gaussian processes on manifolds and are linked to measurements via surrogate functions that map from local parameter values to measurements. The posterior distribution of parameter fields is then obtained. We show that our method can recover parameter fields used to generate localised synthetic measurements of effective refractory period. Our methodology is applicable to other measurement types collected with clinical protocols, and more generally for calibration where model parameters vary over a manifold.
Submitted 15 September, 2022; v1 submitted 8 June, 2022;
originally announced June 2022.
-
Adjoint-aided inference of Gaussian process driven differential equations
Authors:
Paterne Gahungu,
Christopher W Lanyon,
Mauricio A Alvarez,
Engineer Bainomugisha,
Michael Smith,
Richard D. Wilkinson
Abstract:
Linear systems occur throughout engineering and the sciences, most notably as differential equations. In many cases the forcing function for the system is unknown, and interest lies in using noisy observations of the system to infer the forcing, as well as other unknown parameters. In differential equations, the forcing function is an unknown function of the independent variables (typically time and space), and can be modelled as a Gaussian process (GP). In this paper we show how the adjoint of a linear system can be used to efficiently infer forcing functions modelled as GPs, using a truncated basis expansion of the GP kernel. We show how exact conjugate Bayesian inference for the truncated GP can be achieved, in many cases with substantially lower computation than would be required using MCMC methods. We demonstrate the approach on systems of both ordinary and partial differential equations, and show that the basis expansion approach approximates well the true forcing with a modest number of basis vectors. Finally, we show how to infer point estimates for the non-linear model parameters, such as the kernel length-scales, using Bayesian optimisation.
Submitted 5 December, 2022; v1 submitted 9 February, 2022;
originally announced February 2022.
-
Gaussian Process Manifold Interpolation for Probabilistic Atrial Activation Maps and Uncertain Conduction Velocity
Authors:
Sam Coveney,
Cesare Corrado,
Caroline H Roney,
Daniel O'Hare,
Steven E Williams,
Mark D O'Neill,
Steven A Niederer,
Richard H Clayton,
Jeremy E Oakley,
Richard D Wilkinson
Abstract:
In patients with atrial fibrillation, local activation time (LAT) maps are routinely used for characterising patient pathophysiology. The gradient of LAT maps can be used to calculate conduction velocity (CV), which directly relates to material conductivity and may provide an important measure of atrial substrate properties. Including uncertainty in CV calculations would help with interpreting the reliability of these measurements. Here, we build upon a recent insight into reduced-rank Gaussian processes (GP) to perform probabilistic interpolation of uncertain LAT directly on human atrial manifolds. Our Gaussian Process Manifold Interpolation (GPMI) method accounts for the topology of the atria, and allows for calculation of statistics for predicted CV. We demonstrate our method on two clinical cases, and perform validation against a simulated ground truth. CV uncertainty depends on data density, wave propagation direction, and CV magnitude. GPMI is suitable for probabilistic interpolation of other uncertain quantities on non-Euclidean manifolds.
Submitted 23 April, 2020; v1 submitted 22 April, 2020;
originally announced April 2020.
-
Considering discrepancy when calibrating a mechanistic electrophysiology model
Authors:
Chon Lok Lei,
Sanmitra Ghosh,
Dominic G. Whittaker,
Yasser Aboelkassem,
Kylie A. Beattie,
Chris D. Cantwell,
Tammo Delhaas,
Charles Houston,
Gustavo Montes Novaes,
Alexander V. Panfilov,
Pras Pathmanathan,
Marina Riabiz,
Rodrigo Weber dos Santos,
John Walmsley,
Keith Worden,
Gary R. Mirams,
Richard D. Wilkinson
Abstract:
Uncertainty quantification (UQ) is a vital step in using mathematical models and simulations to take decisions. The field of cardiac simulation has begun to explore and adopt UQ methods to characterise uncertainty in model inputs and how that propagates through to outputs or predictions. In this perspective piece we draw attention to an important and under-addressed source of uncertainty in our predictions -- that of uncertainty in the model structure or the equations themselves. The difference between imperfect models and reality is termed model discrepancy, and we are often uncertain as to the size and consequences of this discrepancy. Here we provide two examples of the consequences of discrepancy when calibrating models at the ion channel and action potential scales. Furthermore, we attempt to account for this discrepancy when calibrating and validating an ion channel model using different methods, based on modelling the discrepancy using Gaussian processes (GPs) and autoregressive-moving-average (ARMA) models, then highlight the advantages and shortcomings of each approach. Finally, suggestions and lines of enquiry for future work are provided.
Submitted 23 April, 2020; v1 submitted 13 January, 2020;
originally announced January 2020.
-
Quantifying Age and Model Uncertainties in Paleoclimate Data and Dynamical Climate Models with a Joint Inferential Analysis
Authors:
Jake Carson,
Michel Crucifix,
Simon P. Preston,
Richard D. Wilkinson
Abstract:
A major goal in paleoclimate science is to reconstruct historical climates using proxies for climate variables such as those observed in sediment cores, and in the process learn about climate dynamics. This is hampered by uncertainties in how sediment core depths relate to ages, how proxy quantities relate to climate variables, how climate models are specified, and the values of parameters in climate models. Quantifying these uncertainties is key in drawing well-founded conclusions. Analyses are often performed in separate stages with, for example, a sediment core's depth-age relation being estimated as stage one, then fed as an input to calibrate climate models as stage two. Here, we show that such "multi-stage" approaches can lead to misleading conclusions. We develop a joint inferential approach for climate reconstruction, model calibration, and age model estimation. We focus on the glacial-interglacial cycle over the past 780 kyr, analysing two sediment cores that span this range. Our age estimates are largely in agreement with previous studies, but our approach provides the full joint specification of all uncertainties, estimation of model parameters, and the model evidence. By sampling plausible chronologies from the posterior distribution, we demonstrate that downstream scientific conclusions can differ greatly both between different sampled chronologies, and in comparison with conclusions obtained in the complete joint inferential analysis. We conclude that multi-stage analyses are insufficient when dealing with uncertainty, and that to draw sound conclusions the full joint inferential analysis must be performed.
Submitted 17 April, 2019; v1 submitted 22 March, 2018;
originally announced March 2018.
-
Detecting non-binomial sex allocation when developmental mortality operates
Authors:
Richard D. Wilkinson,
Apostolos Kapranas,
Ian C. W. Hardy
Abstract:
Optimal sex allocation theory is one of the most intricately developed areas of evolutionary ecology. Under a range of conditions, particularly under population sub-division, selection favours sex being allocated to offspring non-randomly, generating non-binomial variances of offspring group sex ratios. Detecting non-binomial sex allocation is complicated by stochastic developmental mortality, as offspring sex can often only be identified on maturity with the sex of non-maturing offspring remaining unknown. We show that current approaches for detecting non-binomiality have limited ability to detect non-binomial sex allocation when developmental mortality has occurred. We present a new procedure using an explicit model of sex allocation and mortality and develop a Bayesian model selection approach (available as an R package). We use the double and multiplicative binomial distributions to model over- and under-dispersed sex allocation and show how to calculate Bayes factors for comparing these alternative models to the null hypothesis of binomial sex allocation. The ability to detect non-binomial sex allocation is greatly increased, particularly in cases where mortality is common. The use of Bayesian methods allows for the quantification of the evidence in favour of each hypothesis, and our modelling approach provides an improved descriptive capability over existing approaches. We use a simulation study to demonstrate substantial improvements in power for detecting non-binomial sex allocation in situations where current methods fail, and we illustrate the approach in real scenarios using empirically obtained datasets on the sexual composition of groups of gregarious parasitoid wasps.
Submitted 10 August, 2016;
originally announced August 2016.
-
ABC for climate: dealing with expensive simulators
Authors:
Philip B. Holden,
Neil R. Edwards,
James Hensman,
Richard D. Wilkinson
Abstract:
This paper is due to appear as a chapter of the forthcoming Handbook of Approximate Bayesian Computation (ABC) by S. Sisson, L. Fan, and M. Beaumont. We describe the challenge of calibrating climate simulators, and discuss the differences in emphasis in climate science compared to many of the more traditional ABC application areas. The primary difficulty is how to do inference with a computationally expensive simulator which we can only afford to run a small number of times, and we describe how Gaussian process emulators are used as surrogate models in this case. We introduce the idea of history matching, a non-probabilistic calibration method that divides the parameter space into (not im)plausible and implausible regions. History matching can be shown to be a special case of ABC, but with a greater emphasis on defining realistic simulator discrepancy bounds, and using these to define tolerances and metrics. We describe a design approach for choosing parameter values at which to run the simulator, and illustrate the approach on a toy climate model, showing that with careful design we can find the plausible region with a very small number of model evaluations. Finally, we describe how calibrated GENIE-1 (an earth system model of intermediate complexity) predictions have been used, and why it is important to accurately characterise parametric uncertainty.
Submitted 11 November, 2015;
originally announced November 2015.
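Illustration (not from the chapter): the history-matching idea in the abstract above can be sketched with the commonly used implausibility measure, which scales the gap between an observation and the emulator mean by the combined emulator, discrepancy, and observation variances. The cut-off of 3, the toy emulator, and every name below are assumptions made for this sketch only.

```python
import math


def implausibility(observation, emulator_mean, emulator_var,
                   discrepancy_var, observation_var):
    """Standardised distance between the observation and the emulator mean."""
    total_var = emulator_var + discrepancy_var + observation_var
    return abs(observation - emulator_mean) / math.sqrt(total_var)


def not_implausible(candidates, observation, emulator,
                    discrepancy_var, observation_var, threshold=3.0):
    """Return the candidates that the implausibility cut-off cannot rule out."""
    kept = []
    for x in candidates:
        mean, var = emulator(x)  # emulator returns (predictive mean, variance)
        score = implausibility(observation, mean, var,
                               discrepancy_var, observation_var)
        if score <= threshold:
            kept.append(x)
    return kept


def toy_emulator(x):
    """Hypothetical stand-in for a fitted Gaussian process emulator."""
    return x ** 2, 0.01


# Grid of candidate parameter values from -3 to 3 in steps of 0.25.
grid = [-3.0 + 0.25 * i for i in range(25)]
plausible = not_implausible(grid, observation=4.0, emulator=toy_emulator,
                            discrepancy_var=0.04, observation_var=0.05)
```

In practice the retained region would be used to design the next wave of simulator runs, with the emulator refitted on the smaller space.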
-
Bayesian model selection for the glacial-interglacial cycle
Authors:
Jake Carson,
Michel Crucifix,
Simon Preston,
Richard D. Wilkinson
Abstract:
A prevailing viewpoint in palaeoclimate science is that a single palaeoclimate record contains insufficient information to discriminate between most competing explanatory models. Results we present here suggest the contrary. Using SMC^2 combined with novel Brownian bridge type proposals for the state trajectories, we show that even with relatively short time series it is possible to estimate Bayes factors to sufficient accuracy to be able to select between competing models. The results show that Monte Carlo methodology and computer power have now advanced to the point where a full Bayesian analysis for a wide class of conceptual climate models is now possible. The results also highlight a problem with estimating the chronology of the climate record prior to further statistical analysis, a practice which is common in palaeoclimate science. Using two datasets based on the same record but with different estimated chronologies results in conflicting conclusions about the importance of the orbital forcing on the glacial cycle, and about the internal dynamics generating the glacial cycle, even though the difference between the two estimated chronologies is consistent with dating uncertainty. This highlights a need for chronology estimation and other inferential questions to be addressed in a joint statistical procedure.
Submitted 11 November, 2015;
originally announced November 2015.
-
Accelerating ABC methods using Gaussian processes
Authors:
Richard D Wilkinson
Abstract:
Approximate Bayesian computation (ABC) methods are used to approximate posterior distributions using simulation rather than likelihood calculations. We introduce Gaussian process (GP) accelerated ABC, which we show can significantly reduce the number of simulations required. As computational resource is usually the main determinant of accuracy in ABC, GP-accelerated methods can thus enable more accurate inference in some models. GP models of the unknown log-likelihood function are used to exploit continuity and smoothness, reducing the required computation. We use a sequence of models that increase in accuracy, using intermediate models to rule out regions of the parameter space as implausible. The methods will not be suitable for all problems, but when they can be used, can result in significant computational savings. For the Ricker model, we are able to achieve accurate approximations to the posterior distribution using a factor of 100 fewer simulator evaluations than comparable Monte Carlo approaches, and for a population genetics model we are able to approximate the exact posterior for the first time.
Submitted 23 February, 2014; v1 submitted 7 January, 2014;
originally announced January 2014.
-
Approximate Bayesian computation (ABC) gives exact results under the assumption of model error
Authors:
Richard D. Wilkinson
Abstract:
Approximate Bayesian computation (ABC) or likelihood-free inference algorithms are used to find approximations to posterior distributions without making explicit use of the likelihood function, depending instead on simulation of sample data sets from the model. In this paper we show that under the assumption of the existence of a uniform additive model error term, ABC algorithms give exact results when sufficient summaries are used. This interpretation allows the approximation made in many previous application papers to be understood, and should guide the choice of metric and tolerance in future work. ABC algorithms can be generalized by replacing the 0-1 cut-off with an acceptance probability that varies with the distance of the simulated data from the observed data. The acceptance density gives the distribution of the error term, enabling the uniform error usually used to be replaced by a general distribution. This generalization can also be applied to approximate Markov chain Monte Carlo algorithms. In light of this work, ABC algorithms can be seen as calibration techniques for implicit stochastic models, inferring parameter values in light of the computer model, data, prior beliefs about the parameter values, and any measurement or model errors.
Submitted 23 April, 2013; v1 submitted 20 November, 2008;
originally announced November 2008.
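Illustration (not from the paper): the generalisation described in the abstract above, where the 0-1 cut-off is replaced by an acceptance probability that varies with distance, can be sketched as a rejection sampler with a pluggable acceptance kernel. The toy simulator, the prior, the kernel scales, and all function names are assumptions chosen for this sketch.

```python
import math
import random


def abc_rejection(observed, prior_sample, simulate, accept_prob, n_draws, rng):
    """Rejection ABC: keep a prior draw with probability accept_prob(distance)."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)
        distance = abs(simulate(theta, rng) - observed)
        if rng.random() < accept_prob(distance):
            accepted.append(theta)
    return accepted


def uniform_kernel(tolerance):
    """The usual 0-1 cut-off: accept exactly when distance <= tolerance."""
    return lambda d: 1.0 if d <= tolerance else 0.0


def gaussian_kernel(scale):
    """Smooth acceptance probability, equal to 1 at distance 0."""
    return lambda d: math.exp(-0.5 * (d / scale) ** 2)


def run_demo(seed=0, n_draws=20000):
    """Toy problem (hypothetical): theta ~ U(-5, 5), simulator adds N(0, 1) noise."""
    observed = 2.0
    prior_sample = lambda rng: rng.uniform(-5.0, 5.0)
    simulate = lambda theta, rng: rng.gauss(theta, 1.0)
    kernels = [("uniform", uniform_kernel(0.5)), ("gaussian", gaussian_kernel(0.5))]
    results = {}
    for name, kernel in kernels:
        rng = random.Random(seed)  # same seed so both kernels see the same draws
        results[name] = abc_rejection(observed, prior_sample, simulate,
                                      kernel, n_draws, rng)
    return results
```

Under the abstract's interpretation, the uniform kernel corresponds to assuming a uniform additive error on the simulator output, and the Gaussian kernel to assuming a Gaussian one; calling `run_demo()` returns the accepted parameter draws for each choice so the two samples can be compared.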