-
SoK: The Gap Between Data Rights Ideals and Reality
Authors:
Yu** Kwon,
Ella Corren,
Gonzalo Munilla Garrido,
Chris Hoofnagle,
Dawn Song
Abstract:
As information economies burgeon, they unlock innovation and economic wealth while posing novel threats to civil liberties and altering power dynamics between individuals, companies, and governments. Legislatures have reacted with privacy laws designed to empower individuals over their data. These laws typically create rights for "data subjects" (individuals) to make requests of data collectors (c…
▽ More
As information economies burgeon, they unlock innovation and economic wealth while posing novel threats to civil liberties and altering power dynamics between individuals, companies, and governments. Legislatures have reacted with privacy laws designed to empower individuals over their data. These laws typically create rights for "data subjects" (individuals) to make requests of data collectors (companies and governments). The European Union General Data Protection Regulation (GDPR) exemplifies this, granting extensive data rights to data subjects, a model embraced globally. However, the question remains: do these rights-based privacy laws effectively empower individuals over their data? This paper scrutinizes these approaches by reviewing 201 interdisciplinary empirical studies, news articles, and blog posts. We pinpoint 15 key questions concerning the efficacy of rights allocations. The literature often presents conflicting results regarding the effectiveness of rights-based frameworks, but it generally emphasizes their limitations. We offer recommendations to policymakers and Computer Science (CS) groups committed to these frameworks, and suggest alternative privacy regulation approaches.
△ Less
Submitted 3 December, 2023;
originally announced December 2023.
-
SoK: Privacy-Preserving Data Synthesis
Authors:
Yuzheng Hu,
Fan Wu,
Qinbin Li,
Yunhui Long,
Gonzalo Munilla Garrido,
Chang Ge,
Bolin Ding,
David Forsyth,
Bo Li,
Dawn Song
Abstract:
As the prevalence of data analysis grows, safeguarding data privacy has become a paramount concern. Consequently, there has been an upsurge in the development of mechanisms aimed at privacy-preserving data analyses. However, these approaches are task-specific; designing algorithms for new tasks is a cumbersome process. As an alternative, one can create synthetic data that is (ideally) devoid of pr…
▽ More
As the prevalence of data analysis grows, safeguarding data privacy has become a paramount concern. Consequently, there has been an upsurge in the development of mechanisms aimed at privacy-preserving data analyses. However, these approaches are task-specific; designing algorithms for new tasks is a cumbersome process. As an alternative, one can create synthetic data that is (ideally) devoid of private information. This paper focuses on privacy-preserving data synthesis (PPDS) by providing a comprehensive overview, analysis, and discussion of the field. Specifically, we put forth a master recipe that unifies two prominent strands of research in PPDS: statistical methods and deep learning (DL)-based methods. Under the master recipe, we further dissect the statistical methods into choices of modeling and representation, and investigate the DL-based methods by different generative modeling principles. To consolidate our findings, we provide comprehensive reference tables, distill key takeaways, and identify open problems in the existing literature. In doing so, we aim to answer the following questions: What are the design principles behind different PPDS methods? How can we categorize these methods, and what are the advantages and disadvantages associated with each category? Can we provide guidelines for method selection in different real-world scenarios? We proceed to benchmark several prominent DL-based methods on the task of private image synthesis and conclude that DP-MERF is an all-purpose approach. Finally, upon systematizing the work over the past decade, we identify future directions and call for actions from researchers.
△ Less
Submitted 5 August, 2023; v1 submitted 5 July, 2023;
originally announced July 2023.
-
SoK: Data Privacy in Virtual Reality
Authors:
Gonzalo Munilla Garrido,
Vivek Nair,
Dawn Song
Abstract:
The adoption of virtual reality (VR) technologies has rapidly gained momentum in recent years as companies around the world begin to position the so-called "metaverse" as the next major medium for accessing and interacting with the internet. While consumers have become accustomed to a degree of data harvesting on the web, the real-time nature of data sharing in the metaverse indicates that privacy…
▽ More
The adoption of virtual reality (VR) technologies has rapidly gained momentum in recent years as companies around the world begin to position the so-called "metaverse" as the next major medium for accessing and interacting with the internet. While consumers have become accustomed to a degree of data harvesting on the web, the real-time nature of data sharing in the metaverse indicates that privacy concerns are likely to be even more prevalent in the new "Web 3.0." Research into VR privacy has demonstrated that a plethora of sensitive personal information is observable by various would-be adversaries from just a few minutes of telemetry data. On the other hand, we have yet to see VR parallels for many privacy-preserving tools aimed at mitigating threats on conventional platforms. This paper aims to systematize knowledge on the landscape of VR privacy threats and countermeasures by proposing a comprehensive taxonomy of data attributes, protections, and adversaries based on the study of 68 collected publications. We complement our qualitative discussion with a statistical analysis of the risk associated with various data sources inherent to VR in consideration of the known attacks and defenses. By focusing on highlighting the clear outstanding opportunities, we hope to motivate and guide further research into this increasingly important field.
△ Less
Submitted 18 May, 2023; v1 submitted 14 January, 2023;
originally announced January 2023.
-
Lessons Learned: Surveying the Practicality of Differential Privacy in the Industry
Authors:
Gonzalo Munilla Garrido,
Xiaoyuan Liu,
Florian Matthes,
Dawn Song
Abstract:
Since its introduction in 2006, differential privacy has emerged as a predominant statistical tool for quantifying data privacy in academic works. Yet despite the plethora of research and open-source utilities that have accompanied its rise, with limited exceptions, differential privacy has failed to achieve widespread adoption in the enterprise domain. Our study aims to shed light on the fundamen…
▽ More
Since its introduction in 2006, differential privacy has emerged as a predominant statistical tool for quantifying data privacy in academic works. Yet despite the plethora of research and open-source utilities that have accompanied its rise, with limited exceptions, differential privacy has failed to achieve widespread adoption in the enterprise domain. Our study aims to shed light on the fundamental causes underlying this academic-industrial utilization gap through detailed interviews of 24 privacy practitioners across 9 major companies. We analyze the results of our survey to provide key findings and suggestions for companies striving to improve privacy protection in their data workflows and highlight the necessary and missing requirements of existing differential privacy tools, with the goal of guiding researchers working towards the broader adoption of differential privacy. Our findings indicate that analysts suffer from lengthy bureaucratic processes for requesting access to sensitive data, yet once granted, only scarcely-enforced privacy policies stand between rogue practitioners and misuse of private information. We thus argue that differential privacy can significantly improve the processes of requesting and conducting data exploration across silos, and conclude that with a few of the improvements suggested herein, the practical use of differential privacy across the enterprise is within striking distance.
△ Less
Submitted 7 November, 2022;
originally announced November 2022.
-
Exploring privacy-enhancing technologies in the automotive value chain
Authors:
Gonzalo Munilla Garrido,
Kaja Schmidt,
Christopher Harth-Kitzerow,
Johannes Klepsch,
Andre Luckow,
Florian Matthes
Abstract:
Privacy-enhancing technologies (PETs) are becoming increasingly crucial for addressing customer needs, security, privacy (e.g., enhancing anonymity and confidentiality), and regulatory requirements. However, applying PETs in organizations requires a precise understanding of use cases, technologies, and limitations. This paper investigates several industrial use cases, their characteristics, and th…
▽ More
Privacy-enhancing technologies (PETs) are becoming increasingly crucial for addressing customer needs, security, privacy (e.g., enhancing anonymity and confidentiality), and regulatory requirements. However, applying PETs in organizations requires a precise understanding of use cases, technologies, and limitations. This paper investigates several industrial use cases, their characteristics, and the potential applicability of PETs to these. We conduct expert interviews to identify and classify uses cases, a gray literature review of relevant open-source PET tools, and discuss how the use case characteristics can be addressed using PETs' capabilities. While we focus mainly on automotive use cases, the results also apply to other use case domains.
△ Less
Submitted 12 September, 2022;
originally announced September 2022.
-
Going Incognito in the Metaverse: Achieving Theoretically Optimal Privacy-Usability Tradeoffs in VR
Authors:
Vivek Nair,
Gonzalo Munilla Garrido,
Dawn Song
Abstract:
Virtual reality (VR) telepresence applications and the so-called "metaverse" promise to be the next major medium of human-computer interaction. However, with recent studies demonstrating the ease at which VR users can be profiled and deanonymized, metaverse platforms carry many of the privacy risks of the conventional internet (and more) while at present offering few of the defensive utilities tha…
▽ More
Virtual reality (VR) telepresence applications and the so-called "metaverse" promise to be the next major medium of human-computer interaction. However, with recent studies demonstrating the ease at which VR users can be profiled and deanonymized, metaverse platforms carry many of the privacy risks of the conventional internet (and more) while at present offering few of the defensive utilities that users are accustomed to having access to. To remedy this, we present the first known method of implementing an "incognito mode" for VR. Our technique leverages local differential privacy to quantifiably obscure sensitive user data attributes, with a focus on intelligently adding noise when and where it is needed most to maximize privacy while minimizing usability impact. Our system is capable of flexibly adapting to the unique needs of each VR application to further optimize this trade-off. We implement our solution as a universal Unity (C#) plugin that we then evaluate using several popular VR applications. Upon faithfully replicating the most well-known VR privacy attack studies, we show a significant degradation of attacker capabilities when using our solution.
△ Less
Submitted 23 October, 2023; v1 submitted 10 August, 2022;
originally announced August 2022.
-
Exploring the Privacy Risks of Adversarial VR Game Design
Authors:
Vivek Nair,
Gonzalo Munilla Garrido,
Dawn Song,
James F. O'Brien
Abstract:
Fifty study participants playtested an innocent-looking "escape room" game in virtual reality (VR). Within just a few minutes, an adversarial program had accurately inferred over 25 of their personal data attributes, from anthropometrics like height and wingspan to demographics like age and gender. As notoriously data-hungry companies become increasingly involved in VR development, this experiment…
▽ More
Fifty study participants playtested an innocent-looking "escape room" game in virtual reality (VR). Within just a few minutes, an adversarial program had accurately inferred over 25 of their personal data attributes, from anthropometrics like height and wingspan to demographics like age and gender. As notoriously data-hungry companies become increasingly involved in VR development, this experimental scenario may soon represent a typical VR user experience. Since the Cambridge Analytica scandal of 2018, adversarially designed gamified elements have been known to constitute a significant privacy threat in conventional social platforms. In this work, we present a case study of how metaverse environments can similarly be adversarially constructed to covertly infer dozens of personal data attributes from seemingly anonymous users. While existing VR privacy research largely focuses on passive observation, we argue that because individuals subconsciously reveal personal information via their motion in response to specific stimuli, active attacks pose an outsized risk in VR environments.
△ Less
Submitted 13 December, 2023; v1 submitted 26 July, 2022;
originally announced July 2022.
-
Mitigating Sovereign Data Exchange Challenges: A Map** to Apply Privacy- and Authenticity-Enhancing Technologies
Authors:
Kaja Schmidt,
Gonzalo Munilla Garrido,
Alexander Mühle,
Christoph Meinel
Abstract:
Harmful repercussions from sharing sensitive or personal data can hamper institutions' willingness to engage in data exchange. Thus, institutions consider Authenticity Enhancing Technologies (AETs) and Privacy-Enhancing Technologies (PETs) to engage in Sovereign Data Exchange (SDE), i.e., sharing data with third parties without compromising their own or their users' data sovereignty. However, thes…
▽ More
Harmful repercussions from sharing sensitive or personal data can hamper institutions' willingness to engage in data exchange. Thus, institutions consider Authenticity Enhancing Technologies (AETs) and Privacy-Enhancing Technologies (PETs) to engage in Sovereign Data Exchange (SDE), i.e., sharing data with third parties without compromising their own or their users' data sovereignty. However, these technologies are often technically complex, which impedes their adoption. To support practitioners select PETs and AETs for SDE use cases and highlight SDE challenges researchers and practitioners should address, this study empirically constructs a challenge-oriented technology map**. First, we compile challenges of SDE by conducting a systematic literature review and expert interviews. Second, we map PETs and AETs to the SDE challenges and identify which technologies can mitigate which challenges. We validate the map** through investigator triangulation. Although the most critical challenge concerns data usage and access control, we find that the majority of PETs and AETs focus on data processing issues.
△ Less
Submitted 20 June, 2022;
originally announced July 2022.
-
Towards Verifiable Differentially-Private Polling
Authors:
Gonzalo Munilla Garrido,
Matthias Babel,
Johannes Sedlmeir
Abstract:
Analyses that fulfill differential privacy provide plausible deniability to individuals while allowing analysts to extract insights from data. However, beyond an often acceptable accuracy tradeoff, these statistical disclosure techniques generally inhibit the verifiability of the provided information, as one cannot check the correctness of the participants' truthful information, the differentially…
▽ More
Analyses that fulfill differential privacy provide plausible deniability to individuals while allowing analysts to extract insights from data. However, beyond an often acceptable accuracy tradeoff, these statistical disclosure techniques generally inhibit the verifiability of the provided information, as one cannot check the correctness of the participants' truthful information, the differentially private mechanism, or the unbiased random number generation. While related work has already discussed this opportunity, an efficient implementation with a precise bound on errors and corresponding proofs of the differential privacy property is so far missing. In this paper, we follow an approach based on zero-knowledge proofs~(ZKPs), in specific succinct non-interactive arguments of knowledge, as a verifiable computation technique to prove the correctness of a differentially private query output. In particular, we ensure the guarantees of differential privacy hold despite the limitations of ZKPs that operate on finite fields and have limited branching capabilities. We demonstrate that our approach has practical performance and discuss how practitioners could employ our primitives to verifiably query individuals' age from their digitally signed ID card in a differentially private manner.
△ Less
Submitted 14 June, 2022;
originally announced June 2022.
-
Verifying Outsourced Computation in an Edge Computing Marketplace
Authors:
Christopher Harth-Kitzerow,
Gonzalo Munilla Garrido
Abstract:
An edge computing marketplace could enable IoT devices (Outsourcers) to outsource computation to any participating node (Contractors) in their proximity. In return, these nodes receive a reward for providing computation resources. In this work, we propose a scheme that verifies the integrity of arbitrary deterministic functions and is resistant to both dishonest Outsourcers and Contractors who try…
▽ More
An edge computing marketplace could enable IoT devices (Outsourcers) to outsource computation to any participating node (Contractors) in their proximity. In return, these nodes receive a reward for providing computation resources. In this work, we propose a scheme that verifies the integrity of arbitrary deterministic functions and is resistant to both dishonest Outsourcers and Contractors who try to maximize their expected payoff. We tested our verification scheme with state-of-the-art pre-trained Convolutional Neural Network models designed for object detection. On all devices, our verification scheme causes less than 1ms computational overhead and a negligible network bandwidth overhead of at most 84 bytes per frame. Our implementation can also perform our verification scheme's tasks parallel to the object detection to eliminate any latency overhead. Compared to other proposed verification schemes, our scheme resists a comprehensive set of protocol violations without sacrificing performance.
△ Less
Submitted 23 March, 2022;
originally announced March 2022.
-
Exponential Randomized Response: Boosting Utility in Differentially Private Selection
Authors:
Gonzalo Munilla Garrido,
Florian Matthes
Abstract:
A differentially private selection algorithm outputs from a finite set the item that approximately maximizes a data-dependent quality function. The most widely adopted mechanisms tackling this task are the pioneering exponential mechanism and permute-and-flip, which can offer utility improvements of up to a factor of two over the exponential mechanism. This work introduces a new differentially pri…
▽ More
A differentially private selection algorithm outputs from a finite set the item that approximately maximizes a data-dependent quality function. The most widely adopted mechanisms tackling this task are the pioneering exponential mechanism and permute-and-flip, which can offer utility improvements of up to a factor of two over the exponential mechanism. This work introduces a new differentially private mechanism for private selection and conducts theoretical and empirical comparisons with the above mechanisms. For reasonably common scenarios, our mechanism can provide utility improvements of factors significantly larger than two over the exponential and permute-and-flip mechanisms. Because the utility can deteriorate in niche scenarios, we recommend our mechanism to analysts who can tolerate lower utility for some datasets.
△ Less
Submitted 3 August, 2022; v1 submitted 11 January, 2022;
originally announced January 2022.
-
Do I Get the Privacy I Need? Benchmarking Utility in Differential Privacy Libraries
Authors:
Gonzalo Munilla Garrido,
Joseph Near,
Aitsam Muhammad,
Warren He,
Roman Matzutt,
Florian Matthes
Abstract:
An increasing number of open-source libraries promise to bring differential privacy to practice, even for non-experts. This paper studies five libraries that offer differentially private analytics: Google DP, SmartNoise, diffprivlib, diffpriv, and Chorus. We compare these libraries qualitatively (capabilities, features, and maturity) and quantitatively (utility and scalability) across four analyti…
▽ More
An increasing number of open-source libraries promise to bring differential privacy to practice, even for non-experts. This paper studies five libraries that offer differentially private analytics: Google DP, SmartNoise, diffprivlib, diffpriv, and Chorus. We compare these libraries qualitatively (capabilities, features, and maturity) and quantitatively (utility and scalability) across four analytics queries (count, sum, mean, and variance) executed on synthetic and real-world datasets. We conclude that these libraries provide similar utility (except in some notable scenarios). However, there are significant differences in the features provided, and we find that no single library excels in all areas. Based on our results, we provide guidance for practitioners to help in choosing a suitable library, guidance for library designers to enhance their software, and guidance for researchers on open challenges in differential privacy tools for non-experts.
△ Less
Submitted 22 September, 2021;
originally announced September 2021.
-
Revealing the Landscape of Privacy-Enhancing Technologies in the Context of Data Markets for the IoT: A Systematic Literature Review
Authors:
Gonzalo Munilla Garrido,
Johannes Sedlmeir,
Ömer Uludağ,
Ilias Soto Alaoui,
Andre Luckow,
Florian Matthes
Abstract:
IoT data markets in public and private institutions have become increasingly relevant in recent years because of their potential to improve data availability and unlock new business models. However, exchanging data in markets bears considerable challenges related to disclosing sensitive information. Despite considerable research focused on different aspects of privacy-enhancing data markets for th…
▽ More
IoT data markets in public and private institutions have become increasingly relevant in recent years because of their potential to improve data availability and unlock new business models. However, exchanging data in markets bears considerable challenges related to disclosing sensitive information. Despite considerable research focused on different aspects of privacy-enhancing data markets for the IoT, none of the solutions proposed so far seems to find a practical adoption. Thus, this study aims to organize the state-of-the-art solutions, analyze and scope the technologies that have been suggested in this context, and structure the remaining challenges to determine areas where future research is required. To accomplish this goal, we conducted a systematic literature review on privacy enhancement in data markets for the IoT, covering 50 publications dated up to July 2020, and provided updates with 24 publications dated up to May 2022. Our results indicate that most research in this area has emerged only recently, and no IoT data market architecture has established itself as canonical. Existing solutions frequently lack the required combination of anonymization and secure computation technologies. Furthermore, there is no consensus on the appropriate use of blockchain technology for IoT data markets and a low degree of leveraging existing libraries or reusing generic data market architectures. We also identified significant challenges remaining, such as the copy problem and the recursive enforcement problem that-while solutions have been suggested to some extent-are often not sufficiently addressed in proposed designs. We conclude that privacy-enhancing technologies need further improvements to positively impact data markets so that, ultimately, the value of data is preserved through data scarcity and users' privacy and businesses-critical information are protected.
△ Less
Submitted 12 July, 2022; v1 submitted 25 July, 2021;
originally announced July 2021.