-
Usable Differential Privacy: A Case Study with PSI
Authors:
Jack Murtagh,
Kathryn Taylor,
George Kellaris,
Salil Vadhan
Abstract:
Differential privacy is a promising framework for addressing the privacy concerns in sharing sensitive datasets for others to analyze. However, differential privacy is a highly technical area, and current deployments often require experts to write code, tune parameters, and optimize the trade-off between the privacy and accuracy of statistical releases. For differential privacy to achieve its potential for wide impact, it is important to design usable systems that enable differential privacy to be used by ordinary data owners and analysts. PSI is a tool that was designed for this purpose, allowing researchers to release useful differentially private statistical information about their datasets without being experts in computer science, statistics, or privacy. We conducted a thorough usability study of PSI to test whether it accomplishes its goal of usability by non-experts. The usability test illuminated which features of PSI are most user-friendly and prompted us to improve aspects of the tool that caused confusion. The test also highlighted some general principles and lessons for designing usable systems for differential privacy, which we discuss in depth.
Submitted 11 September, 2018;
originally announced September 2018.
-
Revealing the Unseen: How to Expose Cloud Usage While Protecting User Privacy
Authors:
Ata Turk,
Mayank Varia,
Georgios Kellaris
Abstract:
Cloud users have little visibility into the performance characteristics and utilization of the physical machines underpinning the virtualized cloud resources they use. This uncertainty forces users and researchers to reverse engineer the inner workings of cloud systems in order to understand and optimize the conditions under which their applications operate. At the Massachusetts Open Cloud (MOC), as a public cloud operator, we would like to expose the utilization of our physical infrastructure to stop this wasteful effort. Mindful that such exposure can be used maliciously to gain insight into other users' workloads, in this position paper we argue for an approach that balances openness of the cloud overall with privacy for each tenant inside it. We believe that this approach can be instantiated via a novel combination of several security and privacy technologies. We discuss the potential benefits, implications of transparency for cloud systems and users, and technical challenges/possibilities.
Submitted 2 October, 2017;
originally announced October 2017.
-
$\mathcal{E}\text{psolute}$: Efficiently Querying Databases While Providing Differential Privacy
Authors:
Dmytro Bogatov,
Georgios Kellaris,
George Kollios,
Kobbi Nissim,
Adam O'Neill
Abstract:
As organizations struggle with processing vast amounts of information, outsourcing sensitive data to third parties becomes a necessity. To protect the data, various cryptographic techniques are used in outsourced database systems to ensure data privacy while allowing efficient querying. A rich collection of attacks on such systems has emerged. Even with strong cryptography, the communication volume or access pattern alone is enough for an adversary to succeed.
In this work we present a model for a differentially private outsourced database system and a concrete construction, $\mathcal{E}\text{psolute}$, that provably conceals the aforementioned leakages while remaining efficient and scalable. In our solution, differential privacy is preserved at the record level even against an untrusted server that controls data and queries. $\mathcal{E}\text{psolute}$ combines Oblivious RAM and differentially private sanitizers to create a generic and efficient construction.
We go further and present a set of improvements that bring the solution to the efficiency and practicality necessary for real-world adoption. We describe how to parallelize the operations, minimize the amount of noise, and reduce the number of network requests while preserving the privacy guarantees. We have run an extensive set of experiments, with dozens of servers processing up to 10 million records, and compiled a detailed analysis of the results demonstrating the efficiency and scalability of our solution. While providing strong security and privacy guarantees, we are less than an order of magnitude slower than the range-query execution of a non-secure, plain-text, optimized RDBMS such as MySQL or PostgreSQL.
Submitted 27 September, 2021; v1 submitted 5 June, 2017;
originally announced June 2017.
-
Engineering Methods for Differentially Private Histograms: Efficiency Beyond Utility
Authors:
Georgios Kellaris,
Stavros Papadopoulos,
Dimitris Papadias
Abstract:
Publishing histograms with $ε$-differential privacy has been studied extensively in the literature. Existing schemes aim at maximizing the utility of the published data, while previous experimental evaluations analyze the privacy/utility trade-off. In this paper we provide the first experimental evaluation of differentially private methods that goes beyond utility, also emphasizing another important aspect, namely efficiency. To this end, we first observe that all existing schemes are composed of a small set of common blocks. We then optimize and choose the best implementation for each block, determine the combinations of blocks that capture the entire literature, and propose novel block combinations. We qualitatively assess the schemes based on the skyline of efficiency and utility, i.e., based on whether a method is dominated on both aspects or not. Using exhaustive experiments on four real datasets with different characteristics, we conclude that there are always trade-offs in terms of utility and efficiency. We demonstrate that the schemes derived from our novel block combinations provide the best trade-offs for time-critical applications. Our work can serve as a guide to help practitioners engineer a differentially private histogram scheme depending on their application requirements.
Submitted 20 April, 2017; v1 submitted 14 April, 2015;
originally announced April 2015.