-
An applied Perspective: Estimating the Differential Identifiability Risk of an Exemplary SOEP Data Set
Authors:
Jonas Allmann,
Saskia Nuñez von Voigt,
Florian Tschorsch
Abstract:
Using real-world study data usually requires contractual agreements where research results may only be published in anonymized form. Requiring formal privacy guarantees, such as differential privacy, could be helpful for data-driven projects to comply with data protection. However, deploying differential privacy in consumer use cases raises the need to explain its underlying mechanisms and the resulting privacy guarantees. In this paper, we thoroughly review and extend an existing privacy metric. We show how to compute this risk metric efficiently for a set of basic statistical queries. Our empirical analysis based on an extensive, real-world scientific data set expands the knowledge on how to compute risks under realistic conditions, while presenting more challenges than solutions.
Submitted 4 July, 2024;
originally announced July 2024.
-
From Theory to Comprehension: A Comparative Study of Differential Privacy and $k$-Anonymity
Authors:
Saskia Nuñez von Voigt,
Luise Mehner,
Florian Tschorsch
Abstract:
The notion of $\varepsilon$-differential privacy is a widely used concept of providing quantifiable privacy to individuals. However, it is unclear how to explain the level of privacy protection provided by a differential privacy mechanism with a set $\varepsilon$. In this study, we focus on users' comprehension of the privacy protection provided by a differential privacy mechanism. To do so, we study three variants of explaining the privacy protection provided by differential privacy: (1) the original mathematical definition; (2) $\varepsilon$ translated into a specific privacy risk; and (3) an explanation using the randomized response technique. We compare users' comprehension of privacy protection employing these explanatory models with their comprehension of privacy protection of $k$-anonymity as baseline comprehensibility. Our findings suggest that participants' comprehension of differential privacy protection is enhanced by the privacy risk model and the randomized response-based model. Moreover, our results confirm our intuition that privacy protection provided by $k$-anonymity is more comprehensible.
Submitted 5 April, 2024;
originally announced April 2024.
-
Redefining Recon: Bridging Gaps with UAVs, 360 degree Cameras, and Neural Radiance Fields
Authors:
Hartmut Surmann,
Niklas Digakis,
Jan-Nicklas Kremer,
Julien Meine,
Max Schulte,
Niklas Voigt
Abstract:
In the realm of digital situational awareness during disaster situations, accurate digital representations, like 3D models, play an indispensable role. To ensure the safety of rescue teams, robotic platforms are often deployed to generate these models. In this paper, we introduce an innovative approach that synergizes the capabilities of compact Unmanned Aerial Vehicles (UAVs), smaller than 30 cm, equipped with 360 degree cameras and the advances of Neural Radiance Fields (NeRFs). A NeRF, a specialized neural network, can deduce a 3D representation of any scene using 2D images and then synthesize it from various angles upon request. This method is especially tailored for urban environments which have experienced significant destruction, where the structural integrity of buildings is compromised to the point of barring entry, as commonly observed after earthquakes and severe fires. We have tested our approach through a recent post-fire scenario, underlining the efficacy of NeRFs even in challenging outdoor environments characterized by water, snow, varying light conditions, and reflective surfaces.
Submitted 30 November, 2023;
originally announced January 2024.
-
Towards Standardized Mobility Reports with User-Level Privacy
Authors:
Alexandra Kapp,
Saskia Nuñez von Voigt,
Helena Mihaljević,
Florian Tschorsch
Abstract:
The importance of human mobility analyses is growing in both research and practice, especially as applications for urban planning and mobility rely on them. Aggregate statistics and visualizations play an essential role as building blocks of data explorations and summary reports, the latter being increasingly released to third parties such as municipal administrations or in the context of citizen participation. However, such explorations already pose a threat to privacy as they reveal potentially sensitive location information, and thus should not be shared without further privacy measures.
There is a substantial gap between state-of-the-art research on privacy methods and their utilization in practice. We thus conceptualize a standardized mobility report with differential privacy guarantees and implement it as open-source software to enable a privacy-preserving exploration of key aspects of mobility data in an easily accessible way. Moreover, we evaluate the benefits of limiting user contributions using three data sets relevant to research and practice. Our results show that even a strong limit on user contribution alters the original geospatial distribution only within a comparatively small range, while significantly reducing the error introduced by adding noise to achieve privacy guarantees.
Submitted 19 September, 2022;
originally announced September 2022.
-
"Am I Private and If So, how Many?" - Communicating Privacy Guarantees of Differential Privacy with Risk Communication Formats
Authors:
Daniel Franzen,
Saskia Nuñez von Voigt,
Peter Sörries,
Florian Tschorsch,
Claudia Müller-Birn
Abstract:
Decisions about sharing personal information are not trivial, since there are many legitimate and important purposes for such data collection, but often the collected data can reveal sensitive information about individuals. Privacy-preserving technologies, such as differential privacy (DP), can be employed to protect the privacy of individuals and, furthermore, provide mathematically sound guarantees on the maximum privacy risk. However, they can only support informed privacy decisions if individuals understand the provided privacy guarantees. This article proposes a novel approach for communicating privacy guarantees to support individuals in their privacy decisions when sharing data. For this, we adopt risk communication formats from the medical domain in conjunction with a model for privacy guarantees of DP to create quantitative privacy risk notifications. We conducted a crowd-sourced study with 343 participants to evaluate how well our notifications conveyed the privacy risk information and how confident participants were about their own understanding of the privacy risk. Our findings suggest that these new notifications can communicate the objective information similarly well to currently used qualitative notifications, but left individuals less confident in their understanding. We also discovered that several of our notifications and the currently used qualitative notification disadvantage individuals with low numeracy: these individuals appear overconfident compared to their actual understanding of the associated privacy risks and are, therefore, less likely to seek the needed additional information before an informed decision. The promising results allow for multiple directions in future research, for example, adding visual aids or tailoring privacy risk communication to characteristics of the individuals.
Submitted 23 August, 2022;
originally announced August 2022.
-
"Am I Private and If So, how Many?" -- Using Risk Communication Formats for Making Differential Privacy Understandable
Authors:
Daniel Franzen,
Saskia Nuñez von Voigt,
Peter Sörries,
Florian Tschorsch,
Claudia Müller-Birn
Abstract:
Mobility data is essential for cities and communities to identify areas for necessary improvement. Data collected by mobility providers already contains all the information necessary, but privacy of the individuals needs to be preserved. Differential privacy (DP) defines a mathematical property which guarantees that certain limits of privacy are preserved while sharing such data, but its functionality and privacy protection are difficult to explain to laypeople. In this paper, we adapt risk communication formats in conjunction with a model for the privacy risks of DP. The results are privacy notifications that explain the risk to an individual's privacy when using DP, rather than DP's functionality. We evaluate these novel privacy communication formats in a crowdsourced study. We find that they perform similarly to the best-performing DP communications currently in use in terms of objective understanding, but did not make our participants as confident in their understanding. We also discovered an influence, similar to the Dunning-Kruger effect, of statistical numeracy on the effectiveness of some of our privacy communication formats and the DP communication format currently in use. These results generate hypotheses in multiple directions, for example, toward the use of risk visualization to improve the understandability of our formats or toward adaptive user interfaces which tailor the risk communication to the characteristics of the reader.
Submitted 22 June, 2023; v1 submitted 8 April, 2022;
originally announced April 2022.
-
Self-Determined Reciprocal Recommender System with Strong Privacy Guarantees
Authors:
S. Nuñez von Voigt,
E. Daniel,
F. Tschorsch
Abstract:
Recommender systems are widely used. Usually, recommender systems are based on a centralized client-server architecture. However, this approach implies drawbacks regarding the privacy of users. In this paper, we propose a distributed reciprocal recommender system with strong, self-determined privacy guarantees, i.e., local differential privacy. More precisely, users randomize their profiles locally and exchange them via a peer-to-peer network. Recommendations are then computed and ranked locally by estimating similarities between profiles. We evaluate recommendation accuracy of a job recommender system and demonstrate that our method provides acceptable utility under strong privacy requirements.
Submitted 14 July, 2021;
originally announced July 2021.
-
Privacy and Confidentiality in Process Mining -- Threats and Research Challenges
Authors:
Gamal Elkoumy,
Stephan A. Fahrenkrog-Petersen,
Mohammadreza Fani Sani,
Agnes Koschmider,
Felix Mannhardt,
Saskia Nuñez von Voigt,
Majid Rafiei,
Leopold von Waldthausen
Abstract:
Privacy and confidentiality are very important prerequisites for applying process mining in order to comply with regulations and keep company secrets. This paper provides a foundation for future research on privacy-preserving and confidential process mining techniques. Main threats are identified and related to a motivating application scenario in a hospital context as well as to the current body of work on privacy and confidentiality in process mining. A newly developed conceptual model structures the discussion that existing techniques leave room for improvement. This results in a number of important research challenges that should be addressed by future process mining research.
Submitted 1 June, 2021;
originally announced June 2021.
-
Every Query Counts: Analyzing the Privacy Loss of Exploratory Data Analyses
Authors:
Saskia Nuñez von Voigt,
Mira Pauli,
Johanna Reichert,
Florian Tschorsch
Abstract:
An exploratory data analysis is an essential step for every data analyst to gain insights, evaluate data quality and (if required) select a machine learning model for further processing. While privacy-preserving machine learning is on the rise, more often than not this initial analysis is not counted towards the privacy budget. In this paper, we quantify the privacy loss for basic statistical functions and highlight the importance of taking it into account when calculating the privacy-loss budget of a machine learning approach.
Submitted 27 August, 2020;
originally announced August 2020.
-
Quantifying the Re-identification Risk of Event Logs for Process Mining
Authors:
S. Nuñez von Voigt,
S. A. Fahrenkrog-Petersen,
D. Janssen,
A. Koschmider,
F. Tschorsch,
F. Mannhardt,
O. Landsiedel,
M. Weidlich
Abstract:
Event logs recorded during the execution of business processes constitute a valuable source of information. Applying process mining techniques to them, event logs may reveal the actual process execution and enable reasoning on quantitative or qualitative process properties. However, event logs often contain sensitive information that could be related to individual process stakeholders through background information and cross-correlation. We therefore argue that, when publishing event logs, the risk of such re-identification attacks must be considered. In this paper, we show how to quantify the re-identification risk with measures for the individual uniqueness in event logs. We also report on a large-scale study that explored the individual uniqueness in a collection of publicly available event logs. Our results suggest that potentially up to all of the cases in an event log may be re-identified, which highlights the importance of privacy-preserving techniques in process mining.
Submitted 19 June, 2020; v1 submitted 24 March, 2020;
originally announced March 2020.
-
ZnO Nanocrystal Networks Near the Insulator-Metal Transition: Tuning Contact Radius and Electron Density with Intense Pulsed Light
Authors:
Benjamin L. Greenberg,
Zachary L. Robinson,
K. V. Reich,
Claudia Gorynski,
Bryan N. Voigt,
Lorraine F. Francis,
B. I. Shklovskii,
Eray S. Aydil,
Uwe R. Kortshagen
Abstract:
Networks of ligand-free semiconductor nanocrystals (NCs) offer a valuable combination of high carrier mobility and optoelectronic properties tunable via quantum confinement. In principle, maximizing carrier mobility entails crossing the insulator-metal transition (IMT), where carriers become delocalized. A recent theoretical study predicted that this transition occurs at nρ^3 ~ 0.3, where n is the carrier density and ρ is the interparticle contact radius. In this work, we satisfy this criterion in networks of plasma-synthesized ZnO NCs by using intense pulsed light (IPL) annealing to tune n and ρ independently. IPL applied to as-deposited NCs increases ρ by inducing sintering, and IPL applied after the NCs are coated with Al2O3 by atomic layer deposition increases n by removing electron-trapping surface hydroxyls. This procedure does not substantially alter NC size or composition and is potentially applicable to a wide variety of nanomaterials. As we increase nρ^3 to at least twice the predicted critical value, we observe conductivity scaling consistent with arrival at the critical region of a continuous quantum phase transition. This allows us to determine the critical behavior of the dielectric constant and electron localization length at the IMT. However, our samples remain on the insulating side of the critical region, which suggests that the critical value of nρ^3 may in fact be significantly higher than 0.3.
Submitted 23 January, 2018;
originally announced January 2018.