-
Bayesian calibration of stochastic agent based model via random forest
Authors:
Connor Robertson,
Cosmin Safta,
Nicholson Collier,
Jonathan Ozik,
Jaideep Ray
Abstract:
Agent-based models (ABMs) provide an excellent framework for modeling outbreaks and interventions in epidemiology by explicitly accounting for diverse individual interactions and environments. However, these models are usually stochastic and highly parametrized, requiring precise calibration for predictive performance. When considering realistic numbers of agents and properly accounting for stochasticity, this high-dimensional calibration can be computationally prohibitive. This paper presents a random-forest-based surrogate modeling technique to accelerate the evaluation of ABMs and demonstrates its use to calibrate an epidemiological ABM named CityCOVID via Markov chain Monte Carlo (MCMC). The technique is first outlined in the context of CityCOVID's quantities of interest, namely hospitalizations and deaths, by exploring dimensionality reduction via temporal decomposition with principal component analysis (PCA) and via sensitivity analysis. The calibration problem is then presented and samples are generated to best match COVID-19 hospitalization and death numbers in Chicago from March to June 2020. These results are compared with previous approximate Bayesian calibration (IMABC) results, and their predictive performance is analyzed, showing improved performance with a reduction in computation.
Submitted 27 June, 2024;
originally announced June 2024.
-
Towards Improved Uncertainty Quantification of Stochastic Epidemic Models Using Sequential Monte Carlo
Authors:
Arindam Fadikar,
Abby Stevens,
Nicholson Collier,
Kok Ben Toh,
Olga Morozova,
Anna Hotton,
Jared Clark,
David Higdon,
Jonathan Ozik
Abstract:
Sequential Monte Carlo (SMC) algorithms represent a suite of robust computational methodologies utilized for state estimation and parameter inference within dynamical systems, particularly in real-time or online environments where data arrives sequentially over time. In this research endeavor, we propose an integrated framework that combines a stochastic epidemic simulator with a sequential importance sampling (SIS) scheme to dynamically infer model parameters, which evolve due to social as well as biological processes throughout the progression of an epidemic outbreak and are also influenced by evolving data measurement bias. Through iterative updates of a set of weighted simulated trajectories based on observed data, this framework enables the estimation of posterior distributions for these parameters, thereby capturing their temporal variability and associated uncertainties. Through simulation studies, we showcase the efficacy of SMC in accurately tracking the evolving dynamics of epidemics while appropriately accounting for uncertainties. Moreover, we delve into practical considerations and challenges inherent in implementing SMC for parameter estimation within dynamic epidemiological settings, areas where the substantial computational capabilities of high-performance computing resources can be usefully brought to bear.
Submitted 6 March, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
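The weight update at the core of a sequential importance sampling scheme, as described in the abstract above, can be sketched as follows. This is a minimal illustration and not the authors' framework: the Poisson observation model, the uniform prior, the made-up case counts, and the deterministic `expected_count` function (a stand-in for a stochastic epidemic simulator) are all assumptions introduced here.

```python
import math
import random

def poisson_log_pmf(count, mean):
    """Log-probability of `count` under a Poisson distribution with the given mean."""
    return count * math.log(mean) - mean - math.lgamma(count + 1)

def reweight(log_weights, particles, observed, expected_count):
    """One SIS step: add each particle's log-likelihood of the new observation."""
    updated = [lw + poisson_log_pmf(observed, expected_count(p))
               for lw, p in zip(log_weights, particles)]
    peak = max(updated)  # subtract the maximum before exponentiating for stability
    unnormalized = [math.exp(lw - peak) for lw in updated]
    total = sum(unnormalized)
    return updated, [w / total for w in unnormalized]

rng = random.Random(1)
particles = [rng.uniform(0.5, 3.0) for _ in range(1000)]  # draws from an assumed uniform prior
log_weights = [0.0] * len(particles)                       # equal initial weights
expected_count = lambda rate: 10.0 * rate                  # hypothetical stand-in for the simulator

for observed in [19, 22, 20]:  # made-up case counts arriving one at a time
    log_weights, weights = reweight(log_weights, particles, observed, expected_count)

posterior_mean = sum(w * p for w, p in zip(weights, particles))
effective_sample_size = 1.0 / sum(w * w for w in weights)
```

The effective sample size is a common diagnostic for weight degeneracy; when it drops too low, SMC variants typically add a resampling step, which this sketch omits.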
-
Trajectory-oriented optimization of stochastic epidemiological models
Authors:
Arindam Fadikar,
Mickael Binois,
Nicholson Collier,
Abby Stevens,
Kok Ben Toh,
Jonathan Ozik
Abstract:
Epidemiological models must be calibrated to ground truth for downstream tasks such as producing forward projections or running what-if scenarios. The meaning of calibration changes in the case of a stochastic model, since output from such a model is generally described via an ensemble or a distribution. Each member of the ensemble is usually mapped to a random number seed (explicitly or implicitly). With the goal of finding not only the input parameter settings but also the random seeds that are consistent with the ground truth, we propose a class of Gaussian process (GP) surrogates along with an optimization strategy based on Thompson sampling. This Trajectory Oriented Optimization (TOO) approach produces actual trajectories close to the empirical observations instead of a set of parameter settings where only the mean simulation behavior matches the ground truth.
Submitted 13 September, 2023; v1 submitted 6 May, 2023;
originally announced May 2023.
-
A portfolio approach to massively parallel Bayesian optimization
Authors:
Mickael Binois,
Nicholson Collier,
Jonathan Ozik
Abstract:
One way to reduce the time of conducting optimization studies is to evaluate designs in parallel rather than just one-at-a-time. For expensive-to-evaluate black-boxes, batch versions of Bayesian optimization have been proposed. They work by building a surrogate model of the black-box to simultaneously select multiple designs via an infill criterion. Still, despite the increased availability of computing resources that enable large-scale parallelism, the strategies that work for selecting a few tens of parallel designs for evaluations become limiting due to the complexity of selecting more designs. It is even more crucial when the black-box is noisy, necessitating more evaluations as well as repeating experiments. Here we propose a scalable strategy that can keep up with massive batching natively, focused on the exploration/exploitation trade-off and a portfolio allocation. We compare the approach with related methods on noisy functions, for mono and multi-objective optimization tasks. These experiments show orders of magnitude speed improvements over existing methods with similar or better performance.
Submitted 3 April, 2023; v1 submitted 18 October, 2021;
originally announced October 2021.
-
Characterization and valuation of uncertainty of calibrated parameters in stochastic decision models
Authors:
Fernando Alarid-Escudero,
Amy B. Knudsen,
Jonathan Ozik,
Nicholson Collier,
Karen M. Kuntz
Abstract:
We evaluated the implications of different approaches to characterize uncertainty of calibrated parameters of stochastic decision models (DMs) in the quantified value of such uncertainty in decision making. We used a microsimulation DM of colorectal cancer (CRC) screening to conduct a cost-effectiveness analysis (CEA) of a 10-year colonoscopy screening. We calibrated the natural history model of CRC to epidemiological data with different degrees of uncertainty and obtained the joint posterior distribution of the parameters using a Bayesian approach. We conducted a probabilistic sensitivity analysis (PSA) on all the model parameters with different characterizations of uncertainty of the calibrated parameters and estimated the value of uncertainty of the different characterizations with a value of information analysis. All analyses were conducted using high performance computing resources running the Extreme-scale Model Exploration with Swift (EMEWS) framework. The posterior distribution had high correlation among some parameters. The parameters of the Weibull hazard function for the age of onset of adenomas had the highest posterior correlation of -0.958. Considering full posterior distributions and the maximum-a-posteriori estimate of the calibrated parameters, there is little difference in the spread of the distribution of the CEA outcomes, with a similar expected value of perfect information (EVPI) of $653 and $685, respectively, at a WTP of $66,000/QALY. Ignoring correlation in the posterior distribution of the calibrated parameters produced the widest distribution of CEA outcomes and the highest EVPI of $809 at the same WTP. Different characterizations of uncertainty of calibrated parameters have implications for the expected value of reducing uncertainty on the CEA. Ignoring inherent correlation among calibrated parameters in a PSA overestimates the value of uncertainty.
Submitted 11 June, 2019;
originally announced June 2019.
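For readers unfamiliar with the EVPI reported in the abstract above, the standard estimator from PSA samples is the mean over draws of the best strategy's net benefit minus the best mean net benefit across strategies. The sketch below illustrates that textbook formula only; the function name `evpi` and the toy net-benefit numbers are hypothetical and unrelated to the paper's CRC model.

```python
def evpi(net_benefit_samples):
    """Expected value of perfect information from PSA output.

    net_benefit_samples: one row per PSA draw, one net benefit per strategy.
    """
    n_draws = len(net_benefit_samples)
    n_strategies = len(net_benefit_samples[0])
    # With perfect information, the best strategy is chosen separately for each draw.
    mean_of_max = sum(max(row) for row in net_benefit_samples) / n_draws
    # With current information, one strategy is chosen for all draws.
    max_of_mean = max(
        sum(row[d] for row in net_benefit_samples) / n_draws
        for d in range(n_strategies)
    )
    return mean_of_max - max_of_mean

# Toy example: two strategies, two equally likely parameter draws.
toy_samples = [[1.0, 0.0], [0.0, 1.0]]
toy_evpi = evpi(toy_samples)
```

When one strategy is best in every draw, the two terms coincide and the EVPI is zero, which is why stronger posterior uncertainty about which strategy wins tends to raise it.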
-
Microsimulation Model Calibration using Incremental Mixture Approximate Bayesian Computation
Authors:
Carolyn Rutter,
Jonathan Ozik,
Maria DeYoreo,
Nicholson Collier
Abstract:
Microsimulation models (MSMs) are used to predict population-level effects of health care policies by simulating individual-level outcomes. Simulated outcomes are governed by unknown parameters that are chosen so that the model accurately predicts specific targets, a process referred to as model calibration. Calibration targets can come from randomized controlled trials, observational studies, and expert opinion, and are typically summary statistics. A well-calibrated model can reproduce a wide range of targets. MSM calibration generally involves searching a high-dimensional parameter space and predicting many targets through model simulation. This requires efficient methods for exploring the parameter space and sufficient computational resources. We develop Incremental Mixture Approximate Bayesian Computation (IMABC) as a method for MSM calibration and implement it via a high-performance computing workflow, which provides the necessary computational scale. IMABC begins with a rejection-based approximate Bayesian computation (ABC) step, drawing a sample of parameters from the prior distribution and simulating calibration targets. Next, the sample is iteratively updated by drawing additional points from a mixture of multivariate normal distributions, centered at the points that yield simulated targets that are near observed targets. Posterior estimates are obtained by weighting sampled parameter vectors to account for the adaptive sampling scheme. We demonstrate IMABC by calibrating an MSM for the natural history of colorectal cancer to obtain simulated draws from the joint posterior distribution of model parameters.
Submitted 13 August, 2018; v1 submitted 5 April, 2018;
originally announced April 2018.
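The initial rejection-based ABC step described in the abstract above (draw parameters from the prior, simulate a target, keep draws whose simulated target is near the observed one) can be sketched as follows. This is a minimal illustration of plain rejection ABC, not the IMABC implementation: the toy simulator, the uniform prior, the tolerance, and the observed value are all hypothetical, and the later mixture-sampling and weighting steps are omitted.

```python
import random

def simulate_target(rate, rng, n_people=500):
    """Toy stochastic simulator (hypothetical): fraction of n_people with an event."""
    return sum(rng.random() < rate for _ in range(n_people)) / n_people

def rejection_abc(observed, tolerance, n_draws, seed=0):
    """Keep prior draws whose simulated target lies within `tolerance` of `observed`."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, 1.0)          # draw from an assumed uniform prior
        simulated = simulate_target(rate, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(rate)
    return accepted

accepted = rejection_abc(observed=0.30, tolerance=0.02, n_draws=2000)
posterior_mean = sum(accepted) / len(accepted)
```

Only a small fraction of prior draws is accepted here, which hints at why plain rejection sampling becomes costly in high-dimensional parameter spaces.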