Search | arXiv e-print repository

Bayesian Methods for Modeling Cumulative Exposure to Extensive Environmental Health Hazards

Authors: Rob Trangucci, Jesse Contreras, Jon Zelner, Joseph N. S. Eisenberg, Yang Chen

Abstract: Measuring the impact of an environmental point source exposure on the risk of disease, like cancer or childhood asthma, is well-developed. Modeling the impact of an environmental health hazard that is extensive in space, like a wastewater canal, is not. We propose a novel Bayesian generative semiparametric model for characterizing the cumulative spatial exposure to an environmental health hazard that is not well-represented by a single point in space. The model couples a dose-response model with a log-Gaussian Cox process integrated against a distance kernel with an unknown length-scale. We show that this model is a well-defined Bayesian inverse model, namely that the posterior exists under a Gaussian process prior for the log-intensity of exposure, and that a simple integral approximation adequately controls the computational error. We quantify the finite-sample properties and the computational tractability of the discretization scheme in a simulation study. Finally, we apply the model to survey data on household risk of childhood diarrheal illness from exposure to a system of wastewater canals in Mezquital Valley, Mexico.

Submitted 5 April, 2024; originally announced April 2024.

arXiv:2211.16502 [pdf, ps, other]

Identified vaccine efficacy for binary post-infection outcomes under misclassification without monotonicity

Authors: Rob Trangucci, Yang Chen, Jon Zelner

Abstract: In order to meet regulatory approval, pharmaceutical companies often must demonstrate that new vaccines reduce the total risk of a post-infection outcome like transmission, symptomatic disease, severe illness, or death in randomized, placebo-controlled trials. Given that infection is a necessary precondition for a post-infection outcome, one can use principal stratification to partition the total causal effect of vaccination into two causal effects: vaccine efficacy against infection, and the principal effect of vaccine efficacy against a post-infection outcome in the patients that would be infected under both placebo and vaccination. Despite the importance of such principal effects to policymakers, these estimands are generally unidentifiable, even under strong assumptions that are rarely satisfied in real-world trials. We develop a novel method to nonparametrically point identify these principal effects while eliminating the monotonicity assumption and allowing for measurement error. Furthermore, our results allow for multiple treatments, and are general enough to be applicable outside of vaccine efficacy. Our method relies on the fact that many vaccine trials are run at geographically disparate health centers, and measure biologically-relevant categorical pretreatment covariates. We show that our method can be applied to a variety of clinical trial settings where vaccine efficacy against infection and a post-infection outcome can be jointly inferred. This can yield new insights from existing vaccine efficacy trial data and will aid researchers in designing new multi-arm clinical trials.

Submitted 20 December, 2022; v1 submitted 29 November, 2022; originally announced November 2022.

arXiv:2206.08161 [pdf, other]

doi 10.1214/22-AOAS1711

Modeling racial/ethnic differences in COVID-19 incidence with covariates subject to non-random missingness

Authors: Rob Trangucci, Yang Chen, Jon Zelner

Abstract: Characterizing the cumulative burden of COVID-19 by race/ethnicity is of the utmost importance for public health researchers and policy makers in order to design effective mitigation measures. This analysis is hampered, however, by surveillance case data with substantial missingness in race and ethnicity covariates. Worse yet, this missingness likely depends on the values of these missing covariates, i.e. they are not missing at random (NMAR). We propose a Bayesian parametric model that leverages joint information on spatial variation in the disease and covariate missingness processes and can accommodate both MAR and NMAR missingness. We show that the model is locally identifiable when the spatial distribution of the population covariates is known and observed cases can be associated with a spatial unit of observation. We also use a simulation study to investigate the model's finite-sample performance. We compare our model's performance on NMAR data against complete-case analysis and multiple imputation (MI), both of which are commonly used by public health researchers when confronted with missing categorical covariates. Finally, we model spatial variation in cumulative COVID-19 incidence in Wayne County, Michigan using data from the Michigan Department of Health and Human Services. The analysis suggests that population relative risk estimates by race during the early part of the COVID-19 pandemic in Michigan were understated for non-white residents compared to white residents when cases missing race were dropped or had these values imputed using MI.

Submitted 16 August, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

arXiv:2001.10664 [pdf, other]

Quantifying Observed Prior Impact

Authors: David E Jones, Robert N Trangucci, Yang Chen

Abstract: We distinguish two questions (i) how much information does the prior contain? and (ii) what is the effect of the prior? Several measures have been proposed for quantifying effective prior sample size, for example Clarke [1996] and Morita et al. [2008]. However, these measures typically ignore the likelihood for the inference currently at hand, and therefore address (i) rather than (ii). Since in practice (ii) is of great concern, Reimherr et al. [2014] introduced a new class of effective prior sample size measures based on prior-likelihood discordance. We take this idea further towards its natural Bayesian conclusion by proposing measures of effective prior sample size that not only incorporate the general mathematical form of the likelihood but also the specific data at hand. Thus, our measures do not average across datasets from the working model, but condition on the current observed data. Consequently, our measures can be highly variable, but we demonstrate that this is because the impact of a prior can be highly variable. Our measures are Bayes estimates of meaningful quantities and well communicate the extent to which inference is determined by the prior, or framed differently, the amount of effort saved due to having prior information. We illustrate our ideas through a number of examples including a Gaussian conjugate model (continuous observations), a Beta-Binomial model (discrete observations), and a linear regression model (two unknown parameters). Future work on further developments of the methodology and an application to astronomy are discussed at the end.

Submitted 28 January, 2020; originally announced January 2020.
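
As a rough illustration of question (i) in the abstract above, the following sketch computes the textbook "nominal" prior sample size in a Beta-Binomial model, where a Beta(a, b) prior behaves like a + b pseudo-observations. This is not the observed-data measure the paper proposes; the class name, function names, and numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPrior:
    """Beta(a, b) prior on a binomial success probability (illustrative only)."""
    a: float
    b: float

    @property
    def mean(self) -> float:
        return self.a / (self.a + self.b)

    @property
    def nominal_sample_size(self) -> float:
        # Textbook convention: a + b acts like a count of pseudo-observations.
        return self.a + self.b


def posterior_mean(prior: BetaPrior, successes: int, trials: int) -> float:
    """Posterior mean under the conjugate update Beta(a + s, b + n - s)."""
    return (prior.a + successes) / (prior.a + prior.b + trials)


def prior_weight(prior: BetaPrior, trials: int) -> float:
    """Share of the posterior mean that comes from the prior mean."""
    n0 = prior.nominal_sample_size
    return n0 / (n0 + trials)


# Hypothetical numbers: Beta(2, 8) prior, 30 successes in 40 trials.
prior = BetaPrior(a=2.0, b=8.0)
w = prior_weight(prior, trials=40)
post = posterior_mean(prior, successes=30, trials=40)
# The posterior mean is a weighted average of prior mean and sample proportion.
blend = w * prior.mean + (1.0 - w) * (30 / 40)
```

Note that the weight here depends only on the prior and the number of trials, not on the observed successes, which is the kind of data-independent summary the abstract contrasts with its own approach.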

arXiv:1802.00842 [pdf, other]

Voting patterns in 2016: Exploration using multilevel regression and poststratification (MRP) on pre-election polls

Authors: Rob Trangucci, Imad Ali, Andrew Gelman, Doug Rivers

Abstract: We analyzed 2012 and 2016 YouGov pre-election polls in order to understand how different population groups voted in the 2012 and 2016 elections. We broke the data down by demographics and state. We display our findings with a series of graphs and maps. The R code associated with this project is available at https://github.com/rtrangucci/mrp_2016_election/.

Submitted 14 March, 2018; v1 submitted 2 February, 2018; originally announced February 2018.
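
The poststratification step named in the title above can be sketched as a population-weighted average of cell-level estimates. This is a minimal sketch, not the authors' R code: in MRP the cell estimates would come from a fitted multilevel regression, whereas here they are hypothetical constants, and the cell labels and population counts are invented for illustration.

```python
from typing import Dict


def poststratify(cell_estimates: Dict[str, float],
                 cell_counts: Dict[str, int]) -> float:
    """Weight each cell's estimate by its population count and average."""
    total = sum(cell_counts[cell] for cell in cell_estimates)
    if total <= 0:
        raise ValueError("population counts must sum to a positive number")
    weighted = sum(cell_estimates[cell] * cell_counts[cell]
                   for cell in cell_estimates)
    return weighted / total


# Hypothetical cell-level support estimates (stand-ins for model output).
estimates = {"under_45": 0.6, "45_and_over": 0.4}
# Hypothetical population counts for the same cells (e.g. from a census table).
counts = {"under_45": 100, "45_and_over": 300}

population_estimate = poststratify(estimates, counts)
```

Because the weights come from population counts rather than from the sample, a group that is over-represented among poll respondents does not dominate the final estimate.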

arXiv:1707.08220 [pdf, other]

Bayesian hierarchical weighting adjustment and survey inference

Authors: Yajuan Si, Rob Trangucci, Jonah Sol Gabry, Andrew Gelman

Abstract: We combine Bayesian prediction and weighted inference as a unified approach to survey inference. The general principles of Bayesian analysis imply that models for survey outcomes should be conditional on all variables that affect the probability of inclusion. We incorporate the weighting variables under the framework of multilevel regression and poststratification, as a byproduct generating model-based weights after smoothing. We investigate deep interactions and introduce structured prior distributions for smoothing and stability of estimates. The computation is done via Stan and implemented in the open source R package "rstanarm" ready for public use. Simulation studies illustrate that model-based prediction and weighting inference outperform classical weighting. We apply the proposal to the New York Longitudinal Study of Wellbeing. The new approach generates robust weights and increases efficiency for finite population inference, especially for subsets of the population.

Submitted 23 June, 2020; v1 submitted 25 July, 2017; originally announced July 2017.

arXiv:1705.10876 [pdf, other]

A Hierarchical Bayes Approach to Adjust for Selection Bias in Before-After Analyses of Vision Zero Policies

Authors: Jonathan Auerbach, Christopher Eshleman, Rob Trangucci

Abstract: American cities devote significant resources to the implementation of traffic safety countermeasures that prevent pedestrian fatalities. However, the before-after comparisons typically used to evaluate the success of these countermeasures often suffer from selection bias. This paper motivates the tendency for selection bias to overestimate the benefits of traffic safety policy, using New York City's Vision Zero strategy as an example. The NASS General Estimates System, Fatality Analysis Reporting System and other databases are combined into a Bayesian hierarchical model to calculate a more realistic before-after comparison. The results confirm the before-after analysis of New York City's Vision Zero policy did in fact overestimate the effect of the policy, and a more realistic estimate is roughly two-thirds the size.

Submitted 7 September, 2018; v1 submitted 30 May, 2017; originally announced May 2017.

MSC Class: 62P25

Showing 1–7 of 7 results for author: Trangucci, R