Showing 1–15 of 15 results for author: Williams, M R

  1. arXiv:2310.01575  [pdf, other]

    stat.ME stat.AP

    Derivation of outcome-dependent dietary patterns for low-income women obtained from survey data using a Supervised Weighted Overfitted Latent Class Analysis

    Authors: Stephanie M. Wu, Matthew R. Williams, Terrance D. Savitsky, Briana J. K. Stephenson

    Abstract: Poor diet quality is a key modifiable risk factor for hypertension and disproportionately impacts low-income women. Analyzing diet-driven hypertensive outcomes in this demographic is challenging due to the complexity of dietary data and selection bias when the data come from surveys, a main data source for understanding diet-disease relationships in understudied populations. Supervised Bayesia…

    Submitted 28 June, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 16 pages, 8 tables, 7 figures

  2. arXiv:2308.06845  [pdf, other]

    stat.CO stat.AP

    csSampling: An R Package for Bayesian Models for Complex Survey Data

    Authors: Ryan Hornby, Matthew R. Williams, Terrance D. Savitsky, Mahmoud Elkasabi

    Abstract: We present csSampling, an R package for estimation of Bayesian models for data collected from complex survey samples. csSampling combines functionality from the probabilistic programming language Stan (via the rstan and brms R packages) and the handling of complex survey data from the survey R package. Under this approach, the user creates a survey-weighted model in brms or provides a custom weigh…

    Submitted 13 August, 2023; originally announced August 2023.

    Comments: 22 pages, 5 figures

  3. arXiv:2208.14541  [pdf, other]

    stat.ME stat.CO

    Methods for Combining Probability and Nonprobability Samples Under Unknown Overlaps

    Authors: Terrance D. Savitsky, Matthew R. Williams, Julie Gershunskaya, Vladislav Beresovsky, Nels G. Johnson

    Abstract: Nonprobability (convenience) samples are increasingly sought to reduce the estimation variance for one or more population variables of interest that are estimated using a randomized survey (reference) sample by increasing the effective sample size. Estimation of a population quantity derived from a convenience sample will typically result in bias since the distribution of variables of interest in…

    Submitted 9 June, 2023; v1 submitted 30 August, 2022; originally announced August 2022.

    Comments: 37 pages, 11 figures. arXiv admin note: substantial text overlap with arXiv:2204.02271

  4. arXiv:2205.05003  [pdf, other]

    stat.ME

    Mechanisms for Global Differential Privacy under Bayesian Data Synthesis

    Authors: Jingchen Hu, Matthew R. Williams, Terrance D. Savitsky

    Abstract: This paper introduces a new method that embeds any Bayesian model used to generate synthetic data and converts it into a differentially private (DP) mechanism. We propose an alteration of the model synthesizer to utilize a censored likelihood that induces upper and lower bounds of $[\exp(-ε/2), \exp(ε/2)]$, where $ε$ denotes the level of the DP guarantee. This censoring mechanism equipped with a…

    Submitted 3 August, 2023; v1 submitted 10 May, 2022; originally announced May 2022.

  5. arXiv:2204.02271   

    stat.ME

    Methods for Combining Probability and Nonprobability Samples Under Unknown Overlaps

    Authors: Terrance D. Savitsky, Matthew R. Williams, Julie Gershunskaya, Vladislav Beresovsky, Nels G. Johnson

    Abstract: Nonprobability (convenience) samples are increasingly sought to stabilize estimations for one or more population variables of interest that are performed using a randomized survey (reference) sample by increasing the effective sample size. Estimation of a population quantity derived from a convenience sample will typically result in bias since the distribution of variables of interest in the conve…

    Submitted 9 June, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Duplication with arXiv:2208.14541

  6. arXiv:2101.06188  [pdf, other]

    stat.ME stat.AP

    Private Tabular Survey Data Products through Synthetic Microdata Generation

    Authors: Jingchen Hu, Terrance D. Savitsky, Matthew R. Williams

    Abstract: We propose two synthetic microdata approaches to generate private tabular survey data products for public release. We adapt a pseudo posterior mechanism that downweights by-record likelihood contributions with weights $\in [0,1]$ based on their identification disclosure risks to producing tabular products for survey data. Our method applied to an observed survey database achieves an asymptotic glo…

    Submitted 3 March, 2022; v1 submitted 15 January, 2021; originally announced January 2021.

  7. arXiv:2006.01230  [pdf, other]

    stat.ME stat.AP

    Re-weighting of Vector-weighted Mechanisms for Utility Maximization under Differential Privacy

    Authors: Terrance D. Savitsky, Jingchen Hu, Matthew R. Williams

    Abstract: We address practical implementation of a risk-weighted pseudo posterior synthesizer for microdata dissemination with a new re-weighting strategy that maximizes utility of released synthetic data at any level of formal privacy guarantee. Our re-weighting strategy applies to any vector-weighted pseudo posterior mechanism under which a vector of observation-indexed weights are used to downweigh…

    Submitted 28 April, 2022; v1 submitted 1 June, 2020; originally announced June 2020.

  8. arXiv:2004.06191  [pdf, other]

    stat.ME

    Pseudo Bayesian Estimation of One-way ANOVA Model in Complex Surveys

    Authors: Terrance D. Savitsky, Matthew R. Williams, Sanvesh Srivastava

    Abstract: We devise survey-weighted pseudo posterior distribution estimators under two-stage informative sampling of both primary clusters and secondary nested units for a one-way analysis of variance (ANOVA) population generating model as a simple canonical case where population model random effects are defined to be coincident with the primary clusters, for example student performance based on a survey of…

    Submitted 12 May, 2023; v1 submitted 13 April, 2020; originally announced April 2020.

    Comments: 45 pages, 12 figures

    MSC Class: 62D05; 62F15; 62J05

  9. arXiv:1909.11796  [pdf, other]

    stat.ME

    Bayesian Pseudo Posterior Mechanism under Asymptotic Differential Privacy

    Authors: Terrance D. Savitsky, Matthew R. Williams, Jingchen Hu

    Abstract: We propose a Bayesian pseudo posterior mechanism to generate record-level synthetic databases equipped with an $(ε,δ)$-probabilistic differential privacy (pDP) guarantee, where $δ$ denotes the probability that any observed database exceeds $ε$. The pseudo posterior mechanism employs a data record-indexed, risk-based weight vector with weight values $\in [0, 1]$ that surgically downweight the like…

    Submitted 13 August, 2021; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: 35 pages, 7 figures, 2 tables

  10. arXiv:1908.07639  [pdf, other]

    stat.ME stat.AP

    Risk-Efficient Bayesian Data Synthesis for Privacy Protection

    Authors: Jingchen Hu, Terrance D. Savitsky, Matthew R. Williams

    Abstract: Statistical agencies utilize models to synthesize respondent-level data for release to the public for privacy protection. In this work, we efficiently induce privacy protection into any Bayesian synthesis model by employing a pseudo likelihood that exponentiates each likelihood contribution by an observation record-indexed weight in [0, 1], defined to be inversely proportional to the identificatio…

    Submitted 8 February, 2021; v1 submitted 20 August, 2019; originally announced August 2019.

    Journal ref: Journal of Survey Statistics and Methodology, 2021

  11. arXiv:1904.07680  [pdf, other]

    stat.ME

    Pseudo Bayesian Mixed Models under Informative Sampling

    Authors: Terrance D. Savitsky, Matthew R. Williams

    Abstract: When random effects are correlated with sample design variables, the usual approach of employing individual survey weights (constructed to be inversely proportional to the unit survey inclusion probabilities) to form a pseudo-likelihood no longer produces asymptotically unbiased inference. We construct a weight-exponentiated formulation for the random effects distribution that achieves unbiased in…

    Submitted 24 August, 2021; v1 submitted 16 April, 2019; originally announced April 2019.

    Comments: 31 pages, 6 figures, 2 tables

    MSC Class: 62F15; 62D05

  12. arXiv:1901.03791  [pdf, other]

    stat.CO stat.ME

    Optimization of Survey Weights under a Large Number of Conflicting Constraints

    Authors: Matthew R. Williams, Terrance D. Savitsky

    Abstract: In the analysis of survey data, sampling weights are needed for consistent estimation of the population. However, the original inverse probability weights from the survey sample design are typically modified to account for non-response, to increase efficiency by incorporating auxiliary population information, and to reduce the variability in estimates due to extreme weights. It is often the case t…

    Submitted 11 January, 2019; originally announced January 2019.

    Comments: 23 pages, 2 figures, 3 tables

  13. Bayesian Uncertainty Estimation Under Complex Sampling

    Authors: Matthew R. Williams, Terrance D. Savitsky

    Abstract: Social and economic studies are often implemented as complex survey designs. For example, multistage, unequal probability sampling designs utilized by federal statistical agencies are typically constructed to maximize the efficiency of the target domain level estimator (e.g., indexed by geographic area) within cost constraints for survey administration. Such designs may induce dependence between t…

    Submitted 29 July, 2019; v1 submitted 31 July, 2018; originally announced July 2018.

    Comments: 45 pages, 4 figures, 1 table

    MSC Class: 62D05; 62F15; 62F12

    Journal ref: International Statistical Review 2020

  14. Bayesian Estimation Under Informative Sampling with Unattenuated Dependence

    Authors: Matthew R. Williams, Terrance D. Savitsky

    Abstract: An informative sampling design leads to unit inclusion probabilities that are correlated with the response variable of interest. However, multistage sampling designs may also induce higher order dependencies, which are typically ignored in the literature when establishing consistency of estimators for survey data under a condition requiring asymptotic independence among the unit inclusion probabil…

    Submitted 12 July, 2018; originally announced July 2018.

    Comments: 35 pages, 5 figures. arXiv admin note: text overlap with arXiv:1710.10102

    Journal ref: Bayesian Anal., advance publication, 4 January 2019

  15. Bayesian Pairwise Estimation Under Dependent Informative Sampling

    Authors: Matthew R. Williams, Terrance D. Savitsky

    Abstract: An informative sampling design leads to the selection of units whose inclusion probabilities are correlated with the response variable of interest. Model inference performed on the resulting observed sample will be biased for the population generative model. One approach that produces asymptotically unbiased inference employs marginal inclusion probabilities to form sampling weights used to expone…

    Submitted 27 October, 2017; originally announced October 2017.

    Comments: 35 pages, 9 figures

    MSC Class: 62D05; 62G20

    Journal ref: Electron. J. Statist. Volume 12, Number 1 (2018), 1631-1661