-
Bayesian Methods for Modeling Cumulative Exposure to Extensive Environmental Health Hazards
Authors:
Rob Trangucci,
Jesse Contreras,
Jon Zelner,
Joseph N. S. Eisenberg,
Yang Chen
Abstract:
Measuring the impact of an environmental point source exposure on the risk of disease, like cancer or childhood asthma, is well-developed. Modeling the impact of an environmental health hazard that is extensive in space, like a wastewater canal, is not. We propose a novel Bayesian generative semiparametric model for characterizing the cumulative spatial exposure to an environmental health hazard that is not well-represented by a single point in space. The model couples a dose-response model with a log-Gaussian Cox process integrated against a distance kernel with an unknown length-scale. We show that this model is a well-defined Bayesian inverse model, namely that the posterior exists under a Gaussian process prior for the log-intensity of exposure, and that a simple integral approximation adequately controls the computational error. We quantify the finite-sample properties and the computational tractability of the discretization scheme in a simulation study. Finally, we apply the model to survey data on household risk of childhood diarrheal illness from exposure to a system of wastewater canals in Mezquital Valley, Mexico.
Submitted 5 April, 2024;
originally announced April 2024.
-
Identified vaccine efficacy for binary post-infection outcomes under misclassification without monotonicity
Authors:
Rob Trangucci,
Yang Chen,
Jon Zelner
Abstract:
In order to meet regulatory approval, pharmaceutical companies often must demonstrate that new vaccines reduce the total risk of a post-infection outcome like transmission, symptomatic disease, severe illness, or death in randomized, placebo-controlled trials. Given that infection is a necessary precondition for a post-infection outcome, one can use principal stratification to partition the total causal effect of vaccination into two causal effects: vaccine efficacy against infection, and the principal effect of vaccine efficacy against a post-infection outcome in the patients that would be infected under both placebo and vaccination. Despite the importance of such principal effects to policymakers, these estimands are generally unidentifiable, even under strong assumptions that are rarely satisfied in real-world trials. We develop a novel method to nonparametrically point identify these principal effects while eliminating the monotonicity assumption and allowing for measurement error. Furthermore, our results allow for multiple treatments, and are general enough to be applicable outside of vaccine efficacy. Our method relies on the fact that many vaccine trials are run at geographically disparate health centers, and measure biologically-relevant categorical pretreatment covariates. We show that our method can be applied to a variety of clinical trial settings where vaccine efficacy against infection and a post-infection outcome can be jointly inferred. This can yield new insights from existing vaccine efficacy trial data and will aid researchers in designing new multi-arm clinical trials.
Submitted 20 December, 2022; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Modeling racial/ethnic differences in COVID-19 incidence with covariates subject to non-random missingness
Authors:
Rob Trangucci,
Yang Chen,
Jon Zelner
Abstract:
Characterizing the cumulative burden of COVID-19 by race/ethnicity is of the utmost importance for public health researchers and policy makers in order to design effective mitigation measures. This analysis is hampered, however, by surveillance case data with substantial missingness in race and ethnicity covariates. Worse yet, this missingness likely depends on the values of these missing covariates, i.e., they are not missing at random (NMAR). We propose a Bayesian parametric model that leverages joint information on spatial variation in the disease and covariate missingness processes and can accommodate both MAR and NMAR missingness. We show that the model is locally identifiable when the spatial distribution of the population covariates is known and observed cases can be associated with a spatial unit of observation. We also use a simulation study to investigate the model's finite-sample performance. We compare our model's performance on NMAR data against complete-case analysis and multiple imputation (MI), both of which are commonly used by public health researchers when confronted with missing categorical covariates. Finally, we model spatial variation in cumulative COVID-19 incidence in Wayne County, Michigan using data from the Michigan Department of Health and Human Services. The analysis suggests that population relative risk estimates by race during the early part of the COVID-19 pandemic in Michigan were understated for non-white residents compared to white residents when cases missing race were dropped or had these values imputed using MI.
Submitted 16 August, 2022; v1 submitted 16 June, 2022;
originally announced June 2022.
-
Quantifying Observed Prior Impact
Authors:
David E Jones,
Robert N Trangucci,
Yang Chen
Abstract:
We distinguish two questions: (i) how much information does the prior contain? and (ii) what is the effect of the prior? Several measures have been proposed for quantifying effective prior sample size, for example Clarke [1996] and Morita et al. [2008]. However, these measures typically ignore the likelihood for the inference currently at hand, and therefore address (i) rather than (ii). Since in practice (ii) is of great concern, Reimherr et al. [2014] introduced a new class of effective prior sample size measures based on prior-likelihood discordance. We take this idea further towards its natural Bayesian conclusion by proposing measures of effective prior sample size that incorporate not only the general mathematical form of the likelihood but also the specific data at hand. Thus, our measures do not average across datasets from the working model, but condition on the current observed data. Consequently, our measures can be highly variable, but we demonstrate that this is because the impact of a prior can be highly variable. Our measures are Bayes estimates of meaningful quantities and communicate well the extent to which inference is determined by the prior, or, framed differently, the amount of effort saved due to having prior information. We illustrate our ideas through a number of examples including a Gaussian conjugate model (continuous observations), a Beta-Binomial model (discrete observations), and a linear regression model (two unknown parameters). Future work on further developments of the methodology and an application to astronomy are discussed at the end.
Submitted 28 January, 2020;
originally announced January 2020.
-
Voting patterns in 2016: Exploration using multilevel regression and poststratification (MRP) on pre-election polls
Authors:
Rob Trangucci,
Imad Ali,
Andrew Gelman,
Doug Rivers
Abstract:
We analyzed 2012 and 2016 YouGov pre-election polls in order to understand how different population groups voted in the 2012 and 2016 elections. We broke the data down by demographics and state. We display our findings with a series of graphs and maps. The R code associated with this project is available at https://github.com/rtrangucci/mrp_2016_election/.
Submitted 14 March, 2018; v1 submitted 2 February, 2018;
originally announced February 2018.
-
Bayesian hierarchical weighting adjustment and survey inference
Authors:
Yajuan Si,
Rob Trangucci,
Jonah Sol Gabry,
Andrew Gelman
Abstract:
We combine Bayesian prediction and weighted inference as a unified approach to survey inference. The general principles of Bayesian analysis imply that models for survey outcomes should be conditional on all variables that affect the probability of inclusion. We incorporate the weighting variables under the framework of multilevel regression and poststratification, as a byproduct generating model-based weights after smoothing. We investigate deep interactions and introduce structured prior distributions for smoothing and stability of estimates. The computation is done via Stan and implemented in the open source R package "rstanarm" ready for public use. Simulation studies illustrate that model-based prediction and weighting inference outperform classical weighting. We apply the proposal to the New York Longitudinal Study of Wellbeing. The new approach generates robust weights and increases efficiency for finite population inference, especially for subsets of the population.
Submitted 23 June, 2020; v1 submitted 25 July, 2017;
originally announced July 2017.
-
A Hierarchical Bayes Approach to Adjust for Selection Bias in Before-After Analyses of Vision Zero Policies
Authors:
Jonathan Auerbach,
Christopher Eshleman,
Rob Trangucci
Abstract:
American cities devote significant resources to the implementation of traffic safety countermeasures that prevent pedestrian fatalities. However, the before-after comparisons typically used to evaluate the success of these countermeasures often suffer from selection bias. This paper motivates the tendency for selection bias to overestimate the benefits of traffic safety policy, using New York City's Vision Zero strategy as an example. The NASS General Estimates System, Fatality Analysis Reporting System and other databases are combined into a Bayesian hierarchical model to calculate a more realistic before-after comparison. The results confirm the before-after analysis of New York City's Vision Zero policy did in fact overestimate the effect of the policy, and a more realistic estimate is roughly two-thirds the size.
Submitted 7 September, 2018; v1 submitted 30 May, 2017;
originally announced May 2017.