-
Evaluating infectious disease forecasts with allocation scoring rules
Authors:
Aaron Gerding,
Nicholas G. Reich,
Benjamin Rogers,
Evan L. Ray
Abstract:
Recent years have seen increasing efforts to forecast infectious disease burdens, with a primary goal being to help public health workers make informed policy decisions. However, there has only been limited discussion of how predominant forecast evaluation metrics might indicate the success of policies based in part on those forecasts. We explore one possible tether between forecasts and policy: the allocation of limited medical resources so as to minimize unmet need. We use probabilistic forecasts of disease burden in each of several regions to determine optimal resource allocations, and then we score forecasts according to how much unmet need their associated allocations would have allowed. We illustrate with forecasts of COVID-19 hospitalizations in the US, and we find that the forecast skill ranking given by this allocation scoring rule can vary substantially from the ranking given by the weighted interval score. We see this as evidence that the allocation scoring rule detects forecast value that is missed by traditional accuracy measures and that the general strategy of designing scoring rules that are directly linked to policy performance is a promising direction for epidemic forecast evaluation.
Submitted 4 March, 2024; v1 submitted 23 December, 2023;
originally announced December 2023.
-
Comparison of Combination Methods to Create Calibrated Ensemble Forecasts for Seasonal Influenza in the U.S
Authors:
Nutcha Wattanachit,
Evan L. Ray,
Thomas C. McAndrew,
Nicholas G. Reich
Abstract:
The characteristics of influenza seasons vary substantially from year to year, posing challenges for public health preparation and response. Influenza forecasting is used to inform seasonal outbreak response, which can in turn potentially reduce the societal impact of an epidemic. The United States Centers for Disease Control and Prevention, in collaboration with external researchers, has run an annual prospective influenza forecasting exercise, known as the FluSight challenge. A subset of participating teams has worked together to produce a collaborative multi-model ensemble, the FluSight Network ensemble. Uniting theoretical results from the forecasting literature with domain-specific forecasts from influenza outbreaks, we applied parametric forecast combination methods that simultaneously optimize individual model weights and calibrate the ensemble via a beta transformation. We used the beta-transformed linear pool and the finite beta mixture model to produce ensemble forecasts retrospectively for the 2016/2017 to 2018/2019 influenza seasons in the U.S. We compared their performance to methods currently used in the FluSight challenge, namely the equally weighted linear pool and the linear pool. Ensemble forecasts produced from methods with a beta transformation were shown to outperform those from the equally weighted linear pool and the linear pool for all week-ahead targets across the test seasons based on average log scores. We observed improvements in overall accuracy despite the beta-transformed linear pool or beta mixture methods' modest under-prediction across all targets and seasons. Combination techniques that explicitly adjust for known calibration issues in linear pooling should be considered to improve ensemble probabilistic scores in outbreak settings.
Submitted 15 March, 2022; v1 submitted 23 February, 2022;
originally announced February 2022.
-
Comparing trained and untrained probabilistic ensemble forecasts of COVID-19 cases and deaths in the United States
Authors:
Evan L. Ray,
Logan C. Brooks,
Jacob Bien,
Matthew Biggerstaff,
Nikos I. Bosse,
Johannes Bracher,
Estee Y. Cramer,
Sebastian Funk,
Aaron Gerding,
Michael A. Johansson,
Aaron Rumack,
Yi** Wang,
Martha Zorn,
Ryan J. Tibshirani,
Nicholas G. Reich
Abstract:
The U.S. COVID-19 Forecast Hub aggregates forecasts of the short-term burden of COVID-19 in the United States from many contributing teams. We study methods for building an ensemble that combines forecasts from these teams. These experiments have informed the ensemble methods used by the Hub. To be most useful to policy makers, ensemble forecasts must have stable performance in the presence of two key characteristics of the component forecasts: (1) occasional misalignment with the reported data, and (2) instability in the relative performance of component forecasters over time. Our results indicate that in the presence of these challenges, an untrained and robust approach to ensembling using an equally weighted median of all component forecasts is a good choice to support public health decision makers. In settings where some contributing forecasters have a stable record of good performance, trained ensembles that give those forecasters higher weight can also be helpful.
Submitted 7 June, 2022; v1 submitted 28 January, 2022;
originally announced January 2022.
-
The Zoltar forecast archive: a tool to facilitate standardization and storage of interdisciplinary prediction research
Authors:
Nicholas G Reich,
Matthew Cornell,
Evan L Ray,
Katie House,
Khoa Le
Abstract:
Forecasting has emerged as an important component of informed, data-driven decision-making in a wide array of fields. We introduce a new data model for probabilistic predictions that encompasses a wide range of forecasting settings. This framework clearly defines the constituent parts of a probabilistic forecast and proposes one approach for representing these data elements. The data model is implemented in Zoltar, a new software application that stores forecasts using the data model and provides standardized API access to the data. In one real-time case study, an instance of the Zoltar web application was used to store, provide access to, and evaluate real-time forecast data on the order of 10$^7$ rows, provided by over 20 international research teams from academia and industry making forecasts of the COVID-19 outbreak in the US. Tools and data infrastructure for probabilistic forecasts, such as those introduced here, will play an increasingly important role in ensuring that future forecasting research adheres to a strict set of rigorous and reproducible standards.
Submitted 6 June, 2020;
originally announced June 2020.
-
Infectious Disease Forecasting for Public Health
Authors:
Stephen A Lauer,
Alexandria C Brown,
Nicholas G Reich
Abstract:
Forecasting transmission of infectious diseases, especially for vector-borne diseases, poses unique challenges for researchers. Behaviors of and interactions between viruses, vectors, hosts, and the environment each play a part in determining the transmission of a disease. Public health surveillance systems and other sources provide valuable data that can be used to accurately forecast disease incidence. However, many aspects of common infectious disease surveillance data are imperfect: cases may be reported with a delay or in some cases not at all, data on vectors may not be available, and case data may not be available at high geographical or temporal resolution. In the face of these challenges, researchers must make assumptions to either account for these underlying processes in a mechanistic model or to justify their exclusion altogether in a statistical model. Whether a model is mechanistic or statistical, researchers should evaluate their model using accepted best practices from the emerging field of infectious disease forecasting while adopting conventions from other fields that have been developing forecasting methods for decades. Accounting for assumptions and properly evaluating models will allow researchers to generate forecasts that have the potential to provide valuable insights for public health officials. This chapter provides a background to the practice of forecasting in general, discusses the biological and statistical models used for infectious disease forecasting, presents technical details about making and evaluating forecasting models, and explores the issues in communicating forecasting results in a public health context.
Submitted 29 May, 2020;
originally announced June 2020.
-
Evaluating epidemic forecasts in an interval format
Authors:
Johannes Bracher,
Evan L. Ray,
Tilmann Gneiting,
Nicholas G. Reich
Abstract:
For practical reasons, many forecasts of case, hospitalization and death counts in the context of the current COVID-19 pandemic are issued in the form of central predictive intervals at various levels. This is also the case for the forecasts collected in the COVID-19 Forecast Hub (https://covid19forecasthub.org/). Forecast evaluation metrics like the logarithmic score, which has been applied in several infectious disease forecasting challenges, are then not available as they require full predictive distributions. This article provides an overview of how established methods for the evaluation of quantile and interval forecasts can be applied to epidemic forecasts in this format. Specifically, we discuss the computation and interpretation of the weighted interval score, which is a proper score that approximates the continuous ranked probability score. It can be interpreted as a generalization of the absolute error to probabilistic forecasts and allows for a decomposition into a measure of sharpness and penalties for over- and underprediction.
Submitted 8 January, 2021; v1 submitted 26 May, 2020;
originally announced May 2020.
-
Aggregating predictions from experts: a scoping review of statistical methods, experiments, and applications
Authors:
Thomas McAndrew,
Nutcha Wattanachit,
G. Casey Gibson,
Nicholas G. Reich
Abstract:
Forecasts support decision making in a variety of applications. Statistical models can produce accurate forecasts given abundant training data, but when data is sparse, rapidly changing, or unavailable, statistical models may not be able to make accurate predictions. Expert judgmental forecasts---models that combine expert-generated predictions into a single forecast---can make predictions when training data is limited by relying on expert intuition to take the place of concrete training data. Researchers have proposed a wide array of algorithms to combine expert predictions into a single forecast, but there is no consensus on an optimal aggregation model. This scoping review surveyed recent literature on aggregating expert-elicited predictions. We gathered common terminology, aggregation methods, and forecasting performance metrics, and offer guidance to strengthen future work that is growing at an accelerated pace.
Submitted 16 May, 2020; v1 submitted 24 December, 2019;
originally announced December 2019.
-
The covariate-adjusted residual estimator and its use in both randomized trials and observational settings
Authors:
Stephen A. Lauer,
Nicholas G. Reich,
Laura B. Balzer
Abstract:
We often seek to estimate the causal effect of an exposure on a particular outcome in both randomized and observational settings. One such estimation method is the covariate-adjusted residuals estimator, which was designed for individually or cluster randomized trials. In this manuscript, we study the properties of this estimator and develop a new estimator that utilizes both covariate adjustment and inverse probability weighting. We support our theoretical results with a simulation study and an application in an infectious disease setting. The covariate-adjusted residuals estimator is an efficient and unbiased estimator of the average treatment effect in randomized trials; however, it is not guaranteed to be unbiased in observational studies. Our novel estimator, the covariate-adjusted residuals estimator with inverse probability weighting, is unbiased in randomized and observational settings, under a reasonable set of assumptions. Furthermore, when these assumptions hold, it provides efficiency gains over inverse probability weighting in observational studies. The covariate-adjusted residuals estimator is valid for use in randomized trials, but should not be used in observational studies. The covariate-adjusted residuals estimator with inverse probability weighting provides an efficient alternative for use in randomized and observational settings.
Submitted 24 October, 2019;
originally announced October 2019.
-
Adaptively stacking ensembles for influenza forecasting with incomplete data
Authors:
Thomas McAndrew,
Nicholas G. Reich
Abstract:
Seasonal influenza infects between 10 and 50 million people in the United States every year, overburdening hospitals during weeks of peak incidence. Named by the CDC as an important tool to fight the damaging effects of these epidemics, accurate forecasts of influenza and influenza-like illness (ILI) forewarn public health officials about when, and where, seasonal influenza outbreaks will hit hardest. Multi-model ensemble forecasts---weighted combinations of component models---have shown positive results in forecasting. Ensemble forecasts of influenza outbreaks have been static, training on all past ILI data at the beginning of a season, generating a set of optimal weights for each model in the ensemble, and keeping the weights constant. We propose an adaptive ensemble forecast that (i) changes model weights week-by-week throughout the influenza season, (ii) only needs the current influenza season's data to make predictions, and (iii) by introducing a prior distribution, shrinks weights toward the reference equal weighting approach and adjusts for observed ILI percentages that are subject to future revisions. We investigate the prior's ability to impact adaptive ensemble performance and, after finding an optimal prior via a cross-validation approach, compare our adaptive ensemble's performance to equal-weighted and static ensembles. Applied to forecasts of short-term ILI incidence at the regional and national level in the US, our adaptive model outperforms a naive equal-weighted ensemble, and has performance similar to or better than the static ensemble, which requires multiple years of training data. Adaptive ensembles are able to quickly train and forecast during epidemics, and provide a practical tool to public health officials looking for forecasts that can conform to unique features of a specific season.
Submitted 16 May, 2020; v1 submitted 26 July, 2019;
originally announced August 2019.
-
Prediction of infectious disease epidemics via weighted density ensembles
Authors:
Evan L. Ray,
Nicholas G. Reich
Abstract:
Accurate and reliable predictions of infectious disease dynamics can be valuable to public health organizations that plan interventions to decrease or prevent disease transmission. A great variety of models have been developed for this task, using different model structures, covariates, and targets for prediction. Experience has shown that the performance of these models varies; some tend to do better or worse in different seasons or at different points within a season. Ensemble methods combine multiple models to obtain a single prediction that leverages the strengths of each model. We considered a range of ensemble methods that each form a predictive density for a target of interest as a weighted sum of the predictive densities from component models. In the simplest case, equal weight is assigned to each component model; in the most complex case, the weights vary with the region, prediction target, week of the season when the predictions are made, a measure of component model uncertainty, and recent observations of disease incidence. We applied these methods to predict measures of influenza season timing and severity in the United States, both at the national and regional levels, using three component models. We trained the models on retrospective predictions from 14 seasons (1997/1998 - 2010/2011) and evaluated each model's prospective, out-of-sample performance in the five subsequent influenza seasons. In this test phase, the ensemble methods showed overall performance that was similar to the best of the component models, but offered more consistent performance across seasons than the component models. Ensemble methods offer the potential to deliver more reliable predictions to public health decision makers.
Submitted 31 March, 2017;
originally announced March 2017.
-
Enriching students' conceptual understanding of confidence intervals: An interactive trivia-based classroom activity
Authors:
Xiaofei Wang,
Nicholas G. Reich,
Nicholas J. Horton
Abstract:
Confidence intervals provide a way to determine plausible values for a population parameter. They are omnipresent in research articles involving statistical analyses. Appropriately, a key statistical literacy learning objective is the ability to interpret and understand confidence intervals in a wide range of settings. As instructors, we devote a considerable amount of time and effort to ensure that students master this topic in introductory courses and beyond. Yet, studies continue to find that confidence intervals are commonly misinterpreted and that even experts have trouble calibrating their individual confidence levels. In this article, we present a ten-minute trivia game-based activity that addresses these misconceptions by exposing students to confidence intervals from a personal perspective. We describe how the activity can be integrated into a statistics course as a one-time activity or with repetition at intervals throughout a course, discuss results of using the activity in class, and present possible extensions.
Submitted 29 January, 2017;
originally announced January 2017.
-
Infrastructure and methods for real-time predictions of the 2014 dengue fever season in Thailand
Authors:
Nicholas G. Reich,
Stephen A. Lauer,
Krzysztof Sakrejda,
Sopon Iamsirithaworn,
Soawapak Hinjoy,
Paphanij Suangtho,
Suthanun Suthachana,
Hannah E. Clapham,
Henrik Salje,
Derek A. T. Cummings,
Justin Lessler
Abstract:
Epidemics of communicable diseases place a huge burden on public health infrastructures across the world. Producing accurate and actionable forecasts of infectious disease incidence at short and long time scales will improve public health response to outbreaks. However, scientists and public health officials face many obstacles in trying to create accurate and actionable real-time forecasts of infectious disease incidence. Dengue is a mosquito-borne virus that annually infects over 400 million people worldwide. We developed a real-time forecasting model for dengue hemorrhagic fever in the 77 provinces of Thailand. We created an operational and computational infrastructure that generated multi-step predictions of dengue incidence in Thai provinces every two weeks throughout 2014. These predictions show mixed performance across provinces, out-performing naïve seasonal models in over half of provinces at a 1.5 month horizon. Additionally, to assess the degree to which delays in case reporting make long-range prediction a challenging task, we compared the performance of our real-time predictions with predictions made with fully reported data. This paper provides valuable lessons for the implementation of real-time predictions in the context of public health decision making.
Submitted 15 November, 2015;
originally announced November 2015.