-
A Bayesian Spatial Berkson error approach to estimate small area opioid mortality rates accounting for population-at-risk uncertainty
Authors:
Emily N Peterson,
Rachel C. Nethery,
Jarvis T. Chen,
Loni P. Tabb,
Brent A. Coull,
Frederic B. Piel,
Lance A Waller
Abstract:
Monitoring small-area geographical population trends in opioid mortality has large-scale implications for informing preventative resource allocation. A common approach to obtain small area estimates of opioid mortality is to use a standard disease mapping approach in which population-at-risk estimates are treated as fixed and known. Assuming fixed populations ignores the uncertainty surrounding small area population estimates, which may bias risk estimates and under-estimate their associated uncertainties. We present a Bayesian Spatial Berkson Error (BSBE) model to incorporate population-at-risk uncertainty within a disease mapping model. We compare the BSBE approach to the naive approach (treating denominators as fixed) using simulation studies to illustrate potential bias resulting from this assumption. We show the application of the BSBE model to obtain 2020 opioid mortality risk estimates for 159 counties in GA accounting for population-at-risk uncertainty. Utilizing our proposed approach will help to inform interventions in opioid related public health responses, policies, and resource allocation. Additionally, we provide a general framework to improve the estimation and mapping of health indicators.
Submitted 20 December, 2023;
originally announced December 2023.
-
Utilizing a Capture-Recapture Strategy to Accelerate Infectious Disease Surveillance
Authors:
Lin Ge,
Yuzi Zhang,
Lance A. Waller,
Robert H. Lyles
Abstract:
Monitoring key elements of disease dynamics (e.g., prevalence, case counts) is of great importance in infectious disease prevention and control, as emphasized during the COVID-19 pandemic. To facilitate this effort, we propose a new capture-recapture (CRC) analysis strategy that takes misclassification into account from easily-administered, imperfect diagnostic test kits, such as the Rapid Antigen Test-kits or saliva tests. Our method is based on a recently proposed "anchor stream" design, whereby an existing voluntary surveillance data stream is augmented by a smaller and judiciously drawn random sample. It incorporates manufacturer-specified sensitivity and specificity parameters to account for imperfect diagnostic results in one or both data streams. For inference to accompany case count estimation, we improve upon traditional Wald-type confidence intervals by developing an adapted Bayesian credible interval for the CRC estimator that yields favorable frequentist coverage properties. When feasible, the proposed design and analytic strategy provides a more efficient solution than traditional CRC methods or random sampling-based bias-corrected estimation to monitor disease prevalence while accounting for misclassification. We demonstrate the benefits of this approach through simulation studies that underscore its potential utility in practice for economical disease monitoring among a registered closed population.
Submitted 30 June, 2023;
originally announced July 2023.
-
On some pitfalls of the log-linear modeling framework for capture-recapture studies in disease surveillance
Authors:
Yuzi Zhang,
Lin Ge,
Lance A. Waller,
Robert H. Lyles
Abstract:
In epidemiological studies, the capture-recapture (CRC) method is a powerful tool that can be used to estimate the number of diseased cases or potentially disease prevalence based on data from overlapping surveillance systems. Estimators derived from log-linear models are widely applied by epidemiologists when analyzing CRC data. The popularity of the log-linear model framework is largely associated with its accessibility and the fact that interaction terms can allow for certain types of dependency among data streams. In this work, we shed new light on significant pitfalls associated with the log-linear model framework in the context of CRC using real data examples and simulation studies. First, we demonstrate that the log-linear model paradigm is highly exclusionary. That is, it can exclude, by design, many possible estimates that are potentially consistent with the observed data. Second, we clarify the ways in which regularly used model selection metrics (e.g., information criteria) are fundamentally deceiving in the effort to select a best model in this setting. By focusing attention on these important cautionary points and on the fundamental untestable dependency assumption made when fitting a log-linear model to CRC data, we hope to improve the quality of and transparency associated with subsequent surveillance-based CRC estimates of case counts.
Submitted 18 June, 2023;
originally announced June 2023.
-
Enhanced Inference for Finite Population Sampling-Based Prevalence Estimation with Misclassification Errors
Authors:
Lin Ge,
Yuzi Zhang,
Lance A. Waller,
Robert H. Lyles
Abstract:
Epidemiologic screening programs often make use of tests with small, but non-zero probabilities of misdiagnosis. In this article, we assume the target population is finite with a fixed number of true cases, and that we apply an imperfect test with known sensitivity and specificity to a sample of individuals from the population. In this setting, we propose an enhanced inferential approach for use in conjunction with sampling-based bias-corrected prevalence estimation. While ignoring the finite nature of the population can yield markedly conservative estimates, direct application of a standard finite population correction (FPC) conversely leads to underestimation of variance. We uncover a way to leverage the typical FPC indirectly toward valid statistical inference. In particular, we derive a readily estimable extra variance component induced by misclassification in this specific but arguably common diagnostic testing scenario. Our approach yields a standard error estimate that properly captures the sampling variability of the usual bias-corrected maximum likelihood estimator of disease prevalence. Finally, we develop an adapted Bayesian credible interval for the true prevalence that offers improved frequentist properties (i.e., coverage and width) relative to a Wald-type confidence interval. We report the simulation results to demonstrate the enhanced performance of the proposed inferential methods.
Submitted 13 August, 2023; v1 submitted 7 February, 2023;
originally announced February 2023.
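As a point of reference for the "sampling-based bias-corrected prevalence estimation" mentioned in the abstract above, the following is a minimal sketch of the standard misclassification correction (often called the Rogan-Gladen estimator) for a test with known sensitivity and specificity. It is an illustration only: the function name and arguments are hypothetical, the truncation to [0, 1] is a common convention assumed here, and the sketch does not implement the paper's finite population correction or its extra variance component.

```python
def bias_corrected_prevalence(n_positive, n_tested, sensitivity, specificity):
    """Correct an apparent (test-positive) proportion for known test error.

    Hypothetical helper for illustration; not the authors' code.
    """
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    # Sensitivity + specificity - 1 must be positive for the correction to be usable.
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("test must be better than chance (Se + Sp > 1)")
    apparent = n_positive / n_tested
    corrected = (apparent + specificity - 1.0) / youden
    # Assumed convention: truncate to the valid range for a proportion.
    return min(1.0, max(0.0, corrected))


# Example: 120 positives among 1000 tested, Se = 0.90, Sp = 0.95.
estimate = bias_corrected_prevalence(120, 1000, 0.90, 0.95)
```

With a perfect test (sensitivity and specificity both 1) the corrected value reduces to the apparent proportion, which is a quick sanity check on the formula.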
-
A Design and Analytic Strategy for Monitoring Disease Positivity and Case Characteristics in Accessible Closed Populations
Authors:
Robert H. Lyles,
Yuzi Zhang,
Lin Ge,
Lance A. Waller
Abstract:
We propose a monitoring strategy for efficient and robust estimation of disease prevalence and case numbers within closed and enumerated populations such as schools, workplaces, or retirement communities. The proposed design relies largely on voluntary testing, notoriously biased (e.g., in the case of COVID-19) due to non-representative sampling. The approach yields unbiased and comparatively precise estimates with no assumptions about factors underlying selection of individuals for voluntary testing, building on the strength of what can be a small random sampling component. This component unlocks a previously proposed "anchor stream" estimator, a well-calibrated alternative to classical capture-recapture (CRC) estimators based on two data streams. We show here that this estimator is equivalent to a direct standardization based on "capture", i.e., selection (or not) by the voluntary testing program, made possible by means of a key parameter identified by design. This equivalency simultaneously allows for novel two-stream CRC-like estimation of general means (e.g., of continuous variables such as antibody levels or biomarkers). For inference, we propose adaptations of a Bayesian credible interval when estimating case counts and bootstrapping when estimating means of continuous variables. We use simulations to demonstrate significant precision benefits relative to random sampling alone.
Submitted 9 December, 2022;
originally announced December 2022.
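For readers unfamiliar with the "classical capture-recapture (CRC) estimators based on two data streams" that the abstract above contrasts with, here is a minimal sketch of two textbook versions, the Lincoln-Petersen estimator and Chapman's small-sample adjustment. Both assume the two streams capture cases independently. This is not the "anchor stream" estimator proposed by the authors, and the function and argument names are hypothetical.

```python
def lincoln_petersen(n_stream1, n_stream2, n_both):
    """Classical two-stream estimate of the total case count: n1 * n2 / m."""
    if n_both <= 0:
        raise ValueError("estimator is undefined when no cases appear in both streams")
    return n_stream1 * n_stream2 / n_both


def chapman(n_stream1, n_stream2, n_both):
    """Chapman's adjustment: (n1 + 1)(n2 + 1) / (m + 1) - 1, defined even when m = 0."""
    return (n_stream1 + 1) * (n_stream2 + 1) / (n_both + 1) - 1


# Example: 200 cases in stream 1, 150 in stream 2, 30 found by both.
lp_estimate = lincoln_petersen(200, 150, 30)
chapman_estimate = chapman(200, 150, 30)
```

The two values are close when the overlap is moderately large; they diverge, and Lincoln-Petersen becomes unstable, as the overlap count approaches zero.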
-
Tailoring Capture-Recapture Methods to Estimate Registry-Based Case Counts Based on Error-Prone Diagnostic Signals
Authors:
Lin Ge,
Yuzi Zhang,
Kevin C. Ward,
Timothy L. Lash,
Lance A. Waller,
Robert H. Lyles
Abstract:
Surveillance research is of great importance for effective and efficient epidemiological monitoring of case counts and disease prevalence. Taking specific motivation from ongoing efforts to identify recurrent cases based on the Georgia Cancer Registry, we extend recently proposed "anchor stream" sampling design and estimation methodology. Our approach offers a more efficient and defensible alternative to traditional capture-recapture (CRC) methods by leveraging a relatively small random sample of participants whose recurrence status is obtained through a principled application of medical records abstraction. This sample is combined with one or more existing signaling data streams, which may yield data based on arbitrarily non-representative subsets of the full registry population. The key extension developed here accounts for the common problem of false positive or negative diagnostic signals from the existing data stream(s). In particular, we show that the design only requires documentation of positive signals in these non-anchor surveillance streams, and permits valid estimation of the true case count based on an estimable positive predictive value (PPV) parameter. We borrow ideas from the multiple imputation paradigm to provide accompanying standard errors, and develop an adapted Bayesian credible interval approach that yields favorable frequentist coverage properties. We demonstrate the benefits of the proposed methods through simulation studies, and provide a data example targeting estimation of the breast cancer recurrence case count among Metro Atlanta area patients from the Georgia Cancer Registry-based Cancer Recurrence Information and Surveillance Program (CRISP) database.
Submitted 24 November, 2022;
originally announced November 2022.
-
Impacts of Census Differential Privacy for Small-Area Disease Mapping to Monitor Health Inequities
Authors:
Yanran Li,
Brent A. Coull,
Nancy Krieger,
Emily Peterson,
Lance A. Waller,
Jarvis T. Chen,
Rachel C. Nethery
Abstract:
The US Census Bureau will implement a new privacy-preserving disclosure avoidance system (DAS), which includes application of differential privacy, on the public-release 2020 census data. There are concerns that the DAS may bias small-area and demographically-stratified population counts, which play a critical role in public health research and policy, serving as denominators in estimation of disease/mortality rates. Employing three DAS demonstration products, we quantify errors attributable to reliance on DAS-protected denominators in standard small-area disease mapping models for characterizing health inequities. We conduct simulation studies and real data analyses of inequities in premature mortality at the census tract level in Massachusetts. Results show that overall patterns of inequity by racialized group and economic deprivation level are not compromised by the DAS. While early versions of DAS induce errors in mortality rate estimation that are larger for Black than for non-Hispanic white populations, this issue is ameliorated in newer DAS versions.
Submitted 29 March, 2023; v1 submitted 9 September, 2022;
originally announced September 2022.
-
A Bayesian hierarchical small-area population model accounting for data source specific methodologies from American Community Survey, Population Estimates Program, and Decennial Census data
Authors:
Emily N Peterson,
Rachel C Nethery,
Tullia Padellini,
Jarvis T Chen,
Brent A Coull,
Frederic B Piel,
Jon Wakefield,
Marta Blangiardo,
Lance A Waller
Abstract:
Small area estimates of population are necessary for many epidemiological studies, yet their quality and accuracy are often not assessed. In the United States, small area estimates of population counts are published by the United States Census Bureau (USCB) in the form of the Decennial census counts, Intercensal population projections (PEP), and American Community Survey (ACS) estimates. Although there are significant relationships between these data sources, there are important contrasts in data collection and processing methodologies, such that each set of estimates may be subject to different sources and magnitudes of error. Additionally, these data sources do not report identical small area population counts due to post-survey adjustments specific to each data source. Resulting small area disease/mortality rates may differ depending on which data source is used for population counts (denominator data). To accurately capture annual small area population counts, and associated uncertainties, we present a Bayesian population model (B-Pop), which fuses information from all three USCB sources, accounting for data source specific methodologies and associated errors. The main features of our framework are: 1) a single model integrating multiple data sources, 2) accounting for data source specific data generating mechanisms, and specifically accounting for data source specific errors, and 3) prediction of estimates for years without USCB reported data. We focus our study on the 159 counties of Georgia, and produce estimates for years 2005-2021.
Submitted 17 December, 2021;
originally announced December 2021.
-
Multi-Objective Allocation of COVID-19 Testing Centers: Improving Coverage and Equity in Access
Authors:
Zhen Zhong,
Ribhu Sengupta,
Kamran Paynabar,
Lance A. Waller
Abstract:
At the time of this article, COVID-19 has been transmitted to more than 42 million people and resulted in more than 673,000 deaths across the United States. Throughout this pandemic, public health authorities have monitored the results of diagnostic testing to identify hotspots of transmission. Such information can help reduce or block transmission paths of COVID-19 and help infected patients receive early treatment. However, most current schemes of test site allocation have been based on experience or convenience, often resulting in low efficiency and non-optimal allocation. In addition, the historical sociodemographic patterns of populations within cities can result in measurable inequities in access to testing between various racial and income groups. To address these pressing issues, we propose a novel test site allocation scheme to (a) maximize population coverage, (b) minimize prediction uncertainties associated with projections of outbreak trajectories, and (c) reduce inequities in access. We illustrate our approach with case studies comparing our allocation scheme with recorded allocation of testing sites in Georgia, revealing increases in both population coverage and improvements in equity of access over current practice.
Submitted 20 September, 2021;
originally announced October 2021.
-
An integrated abundance model for estimating county-level prevalence of opioid misuse in Ohio
Authors:
Staci A. Hepler,
David Kline,
Andrea Bonny,
Erin McKnight,
Lance A. Waller
Abstract:
Opioid misuse is a national epidemic and a significant drug related threat to the United States. While the scale of the problem is undeniable, estimates of the local prevalence of opioid misuse are lacking, despite their importance to policy-making and resource allocation. This is due, in part, to the challenge of directly measuring opioid misuse at a local level. In this paper, we develop a Bayesian hierarchical spatio-temporal abundance model that integrates indirect county-level data on opioid-related outcomes with state-level survey estimates on prevalence of opioid misuse to estimate the latent county-level prevalence and counts of people who misuse opioids. A simulation study shows that our integrated model accurately recovers the latent counts and prevalence. We apply our model to county-level surveillance data on opioid overdose deaths and treatment admissions from the state of Ohio. Our proposed framework can be applied to other applications of small area estimation for hard to reach populations, which is a common occurrence with many health conditions such as those related to illicit behaviors.
Submitted 12 January, 2022; v1 submitted 4 January, 2021;
originally announced January 2021.
-
Memory-efficient Learning for Large-scale Computational Imaging
Authors:
Michael Kellman,
Kevin Zhang,
Jon Tamir,
Emrah Bostan,
Michael Lustig,
Laura Waller
Abstract:
Critical aspects of computational imaging systems, such as experimental design and image priors, can be optimized through deep networks formed by the unrolled iterations of classical model-based reconstructions (termed physics-based networks). However, for real-world large-scale inverse problems, computing gradients via backpropagation is infeasible due to memory limitations of graphics processing units. In this work, we propose a memory-efficient learning procedure that exploits the reversibility of the network's layers to enable data-driven design for large-scale computational imaging systems. We demonstrate our method on a small-scale compressed sensing example, as well as two large-scale real-world systems: multi-channel magnetic resonance imaging and super-resolution optical microscopy.
Submitted 11 March, 2020;
originally announced March 2020.
-
A Bayesian Downscaler Model to Estimate Daily PM2.5 levels in the Continental US
Authors:
Yikai Wang,
Xuefei Hu,
Howard Chang,
Lance Waller,
Jessica Belle,
Yang Liu
Abstract:
There has been growing interest in extending the coverage of ground PM2.5 monitoring networks based on satellite remote sensing data. With broad spatial and temporal coverage, a satellite-based monitoring network has a strong potential to complement the ground monitor system in terms of the spatial-temporal availability of the air quality data. However, most existing calibration models focused on a relatively small spatial domain and cannot be generalized to nationwide studies. In this paper, we propose a statistically reliable and interpretable national modeling framework based on Bayesian downscaling methods, with application to the calibration of daily ground PM2.5 concentrations across the Continental U.S. using satellite-retrieved aerosol optical depth (AOD) and other ancillary predictors in 2011. Our approach flexibly models the relationship between PM2.5 and AOD, together with potentially related geographical factors, varying across the climate regions, and yields spatially and temporally specific parameters to enhance model interpretability. Moreover, our model accurately predicted the national PM2.5 with an R2 of 70% and generates reliable annual and seasonal PM2.5 concentration maps with their SD. Overall, this modeling framework can be applied to national-scale PM2.5 exposure assessments and can also quantify the prediction errors.
Submitted 6 August, 2018;
originally announced August 2018.
-
GraphVar 2.0: A user-friendly toolbox for machine learning on functional connectivity measures
Authors:
Lea Waller,
Anastasia Brovkin,
Lena Dorfschmidt,
Danilo Bzdok,
Henrik Walter,
Johann Daniel Kruschwitz
Abstract:
Background: We previously presented GraphVar as a user-friendly MATLAB toolbox for comprehensive graph analyses of functional brain connectivity. Here we introduce a comprehensive extension of the toolbox allowing users to seamlessly explore easily customizable decoding models across functional connectivity measures as well as additional features.
New Method: GraphVar 2.0 provides machine learning (ML) model construction, validation and exploration. Machine learning can be performed across any combination of network measures and additional variables, allowing for a flexibility in neuroimaging applications.
Results: In addition to previously integrated functionalities, such as network construction and graph-theoretical analyses of brain connectivity with a high-speed general linear model (GLM), users can now perform customizable ML across connectivity matrices, network metrics and additionally imported variables. The new extension also provides parametric and nonparametric testing of classifier and regressor performance, data export, figure generation and high quality export.
Comparison with existing methods: Compared to other existing toolboxes, GraphVar 2.0 offers (1) comprehensive customization, (2) an all-in-one user friendly interface, (3) customizable model design and manual hyperparameter entry, (4) interactive results exploration and data export, (5) automated cueing for modelling multiple outcome variables within the same session, (6) an easy to follow introductory review.
Conclusions: GraphVar 2.0 allows comprehensive, user-friendly exploration of encoding (GLM) and decoding (ML) modelling approaches on functional connectivity measures making big data neuroscience readily accessible to a broader audience of neuroimaging investigators.
Submitted 6 July, 2018; v1 submitted 28 February, 2018;
originally announced March 2018.
-
Data Integration Model for Air Quality: A Hierarchical Approach to the Global Estimation of Exposures to Ambient Air Pollution
Authors:
Gavin Shaddick,
Matthew L. Thomas,
Amelia Jobling,
Michael Brauer,
Aaron van Donkelaar,
Rick Burnett,
Howard Chang,
Aaron Cohen,
Rita Van Dingenen,
Carlos Dora,
Sophie Gumy,
Yang Liu,
Randall Martin,
Lance A. Waller,
Jason West,
James V. Zidek,
Annette Prüss-Ustün
Abstract:
Air pollution is a major risk factor for global health, with both ambient and household air pollution contributing substantial components of the overall global disease burden. One of the key drivers of adverse health effects is fine particulate matter ambient pollution (PM$_{2.5}$) to which an estimated 3 million deaths can be attributed annually. The primary source of information for estimating exposures has been measurements from ground monitoring networks but, although coverage is increasing, there remain regions in which monitoring is limited. Ground monitoring data therefore needs to be supplemented with information from other sources, such as satellite retrievals of aerosol optical depth and chemical transport models. A hierarchical modelling approach for integrating data from multiple sources is proposed allowing spatially-varying relationships between ground measurements and other factors that estimate air quality. Set within a Bayesian framework, the resulting Data Integration Model for Air Quality (DIMAQ) is used to estimate exposures, together with associated measures of uncertainty, on a high resolution grid covering the entire world. Bayesian analysis on this scale can be computationally challenging and here approximate Bayesian inference is performed using Integrated Nested Laplace Approximations. Model selection and assessment is performed by cross-validation with the final model offering substantial increases in predictive accuracy, particularly in regions where there is sparse ground monitoring, when compared to current approaches: root mean square error (RMSE) reduced from 17.1 to 10.7, and population weighted RMSE from 23.1 to 12.1 $μ$gm$^{-3}$. Based on summaries of the posterior distributions for each grid cell, it is estimated that 92% of the world's population reside in areas exceeding the World Health Organization's Air Quality Guidelines.
Submitted 26 September, 2016; v1 submitted 1 September, 2016;
originally announced September 2016.
-
Hierarchical multivariate space-time methods for modeling counts with an application to stroke mortality data
Authors:
Harrison Quick,
Lance A. Waller,
Michele Casper
Abstract:
Geographic patterns in stroke mortality have been studied as far back as the 1960s, when a region of the southeastern United States became known as the "stroke belt" due to its unusually high rates of stroke mortality. While stroke mortality rates are known to increase exponentially with age, an investigation of spatiotemporal trends by age group at the county-level is daunting due to the preponderance of small population sizes and/or few stroke events by age group. Here, we harness the power of a complex, nonseparable multivariate space-time model which borrows strength across space, time, and age group to obtain reliable estimates of yearly county-level mortality rates from US counties between 1973 and 2013 for those aged 65+. Furthermore, we propose an alternative metric for measuring changes in event rates over time which accounts for the full trajectory of a county's event rates, as opposed to simply comparing the rates at the beginning and end of the study period. In our analysis of the stroke data, we identify differing spatiotemporal trends in mortality rates across age groups, shed light on the gains achieved in the Deep South, and provide evidence that a separable model is inappropriate for these data.
Submitted 14 February, 2016;
originally announced February 2016.
-
Automatic Region-wise Spatially Varying Coefficient Regression Model: an Application to National Cardiovascular Disease Mortality and Air Pollution Association Study
Authors:
Shuo Chen,
Chengsheng Jiang,
Lance Waller
Abstract:
Motivated by analyzing a national database of annual air pollution and cardiovascular disease mortality rates for 3100 counties in the U.S. (areal data), we develop a novel statistical framework to automatically detect spatially varying region-wise associations between air pollution exposures and health outcomes. The automatic region-wise spatially varying coefficient model consists of three parts: we first compute the similarity matrix between the exposure-health outcome associations of all spatial units, then segment the whole map into a set of disjoint regions based on the adjacency matrix with constraints that all spatial units within a region are contiguous and have similar association, and lastly estimate the region specific associations between exposure and health outcome. We implement the framework by using regression and spectral graph techniques. We develop goodness of fit criteria for model assessment and model selection. The simulation study confirms the satisfactory performance of our model. We further employ our method to investigate the association between airborne particulate matter smaller than 2.5 microns (PM 2.5) and cardiovascular disease mortality. The results successfully identify regions with distinct associations between the mortality rate and PM 2.5 that may provide insightful guidance for environmental health research.
Submitted 18 November, 2015;
originally announced November 2015.
-
A Nonseparable Multivariate Space-Time Model for Analyzing County-Level Heart Disease Death Rates by Race and Gender
Authors:
Harrison Quick,
Lance A. Waller,
Michele Casper
Abstract:
While death rates due to diseases of the heart have experienced a sharp decline over the past 50 years, these diseases continue to be the leading cause of death in the United States, and the rate of decline varies by geographic location, race, and gender. We look to harness the power of hierarchical Bayesian methods to obtain a clearer picture of the declines from county-level, temporally varying heart disease death rates for men and women of different races in the US. Specifically, we propose a nonseparable multivariate spatio-temporal Bayesian model which allows for group-specific temporal correlations and temporally-evolving covariance structures in the multivariate spatio-temporal component of the model. After verifying the effectiveness of our model via simulation, we apply our model to a dataset of over 200,000 county-level heart disease death rates. In addition to yielding a superior fit than other common approaches for handling such data, the richness of our model provides insight into racial, gender, and geographic disparities underlying heart disease death rates in the US which are not permitted by more restrictive models.
Submitted 9 July, 2015;
originally announced July 2015.
-
Discussion of "Spatial accessibility of pediatric primary healthcare: Measurement and inference"
Authors:
Lance A. Waller
Abstract:
Discussion of "Spatial accessibility of pediatric primary healthcare: Measurement and inference" by Mallory Nobles, Nicoleta Serban and Julie Swann [arXiv:1501.03626].
Submitted 16 January, 2015;
originally announced January 2015.