Showing 1–4 of 4 results for author: Raab, G M

  1. arXiv:2406.16826

    stat.AP

    Practical privacy metrics for synthetic data

    Authors: Gillian M Raab, Beata Nowok, Chris Dibben

    Abstract: This paper explains how the synthpop package for R has been extended to include functions to calculate measures of identity and attribute disclosure risk for synthetic data that measure risks for the records used to create the synthetic data. The basic function, disclosure, calculates identity disclosure for a set of quasi-identifiers (keys) and attribute disclosure for one variable specified as a…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: 12 pages, 2 figures, plus 7 more pages with references and appendices

  2. arXiv:2206.01362

    stat.AP

    Utility and Disclosure Risk for Differentially Private Synthetic Categorical Data

    Authors: Gillian M Raab

    Abstract: This paper introduces two methods of creating differentially private (DP) synthetic data that are now incorporated into the synthpop package for R. Both are suitable for synthesising categorical data, or numeric data grouped into categories. Ten data sets with varying characteristics were used to evaluate the methods. Measures of disclosiveness and of utility were defined and cal…

    Submitted 26 June, 2022; v1 submitted 2 June, 2022; originally announced June 2022.

  3. arXiv:2109.12717

    stat.CO

    Assessing, visualizing and improving the utility of synthetic data

    Authors: Gillian M Raab, Beata Nowok, Chris Dibben

    Abstract: The synthpop package for R https://www.synthpop.org.uk provides tools to allow data custodians to create synthetic versions of confidential microdata that can be distributed with fewer restrictions than the original. The synthesis can be customized to ensure that relationships evident in the real data are reproduced in the synthetic data. A number of measures have been proposed to assess this aspe…

    Submitted 13 November, 2021; v1 submitted 26 September, 2021; originally announced September 2021.

    Comments: Main text and references: 13 pages. Four appendices on pages 14-19. Four figures.

  4. arXiv:1712.04078

    stat.AP

    Guidelines for Producing Useful Synthetic Data

    Authors: Gillian M. Raab, Beata Nowok, Chris Dibben

    Abstract: We report on our experiences of helping staff of the Scottish Longitudinal Study to create synthetic extracts that can be released to users. In particular, we focus on how the synthesis process can be tailored to produce synthetic extracts that will provide users with similar results to those that would be obtained from the original data. We make recommendations for synthesis methods and illustrat…

    Submitted 11 December, 2017; originally announced December 2017.