
An applied Perspective: Estimating the Differential Identifiability Risk of an Exemplary SOEP Data Set

Authors: Jonas Allmann, Saskia Nuñez von Voigt, Florian Tschorsch

Abstract: Using real-world study data usually requires contractual agreements where research results may only be published in anonymized form. Requiring formal privacy guarantees, such as differential privacy, could be helpful for data-driven projects to comply with data protection. However, deploying differential privacy in consumer use cases raises the need to explain its underlying mechanisms and the resulting privacy guarantees. In this paper, we thoroughly review and extend an existing privacy metric. We show how to compute this risk metric efficiently for a set of basic statistical queries. Our empirical analysis based on an extensive, real-world scientific data set expands the knowledge on how to compute risks under realistic conditions, while presenting more challenges than solutions.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Accepted on IWPE 2024

arXiv:2404.04006 [pdf, other]

From Theory to Comprehension: A Comparative Study of Differential Privacy and $k$-Anonymity

Authors: Saskia Nuñez von Voigt, Luise Mehner, Florian Tschorsch

Abstract: The notion of $\varepsilon$-differential privacy is a widely used concept of providing quantifiable privacy to individuals. However, it is unclear how to explain the level of privacy protection provided by a differential privacy mechanism with a set $\varepsilon$. In this study, we focus on users' comprehension of the privacy protection provided by a differential privacy mechanism. To do so, we study three variants of explaining the privacy protection provided by differential privacy: (1) the original mathematical definition; (2) $\varepsilon$ translated into a specific privacy risk; and (3) an explanation using the randomized response technique. We compare users' comprehension of privacy protection employing these explanatory models with their comprehension of privacy protection of $k$-anonymity as baseline comprehensibility. Our findings suggest that participants' comprehension of differential privacy protection is enhanced by the privacy risk model and the randomized response-based model. Moreover, our results confirm our intuition that privacy protection provided by $k$-anonymity is more comprehensible.

Submitted 5 April, 2024; originally announced April 2024.

Comments: Accepted to ACM CODASPY'24, 19-21 June 2024, Porto, Portugal

arXiv:2401.06143 [pdf, other]

Redefining Recon: Bridging Gaps with UAVs, 360 degree Cameras, and Neural Radiance Fields

Authors: Hartmut Surmann, Niklas Digakis, Jan-Nicklas Kremer, Julien Meine, Max Schulte, Niklas Voigt

Abstract: In the realm of digital situational awareness during disaster situations, accurate digital representations, like 3D models, play an indispensable role. To ensure the safety of rescue teams, robotic platforms are often deployed to generate these models. In this paper, we introduce an innovative approach that synergizes the capabilities of compact Unmanned Aerial Vehicles (UAVs), smaller than 30 cm, equipped with 360 degree cameras and the advances of Neural Radiance Fields (NeRFs). A NeRF, a specialized neural network, can deduce a 3D representation of any scene using 2D images and then synthesize it from various angles upon request. This method is especially tailored for urban environments which have experienced significant destruction, where the structural integrity of buildings is compromised to the point of barring entry, as is commonly observed after earthquakes and severe fires. We have tested our approach in a recent post-fire scenario, underlining the efficacy of NeRFs even in challenging outdoor environments characterized by water, snow, varying light conditions, and reflective surfaces.

Submitted 30 November, 2023; originally announced January 2024.

Comments: 6 pages, published at IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR 2023) in Fukushima, November 13-15, 2023

arXiv:2209.08921 [pdf, other]

doi 10.1080/17489725.2022.2148008

Towards Standardized Mobility Reports with User-Level Privacy

Authors: Alexandra Kapp, Saskia Nuñez von Voigt, Helena Mihaljević, Florian Tschorsch

Abstract: The importance of human mobility analyses is growing in both research and practice, especially as applications for urban planning and mobility rely on them. Aggregate statistics and visualizations play an essential role as building blocks of data explorations and summary reports, the latter being increasingly released to third parties such as municipal administrations or in the context of citizen participation. However, such explorations already pose a threat to privacy as they reveal potentially sensitive location information, and thus should not be shared without further privacy measures. There is a substantial gap between state-of-the-art research on privacy methods and their utilization in practice. We thus conceptualize a standardized mobility report with differential privacy guarantees and implement it as open-source software to enable a privacy-preserving exploration of key aspects of mobility data in an easily accessible way. Moreover, we evaluate the benefits of limiting user contributions using three data sets relevant to research and practice. Our results show that even a strong limit on user contribution alters the original geospatial distribution only within a comparatively small range, while significantly reducing the error introduced by adding noise to achieve privacy guarantees.

Submitted 19 September, 2022; originally announced September 2022.

Journal ref: Journal of Location Based Services, 2022

arXiv:2208.10820 [pdf, other]

doi 10.1145/3548606.3560693

"Am I Private and If So, how Many?" - Communicating Privacy Guarantees of Differential Privacy with Risk Communication Formats

Authors: Daniel Franzen, Saskia Nuñez von Voigt, Peter Sörries, Florian Tschorsch, Claudia Müller-Birn

Abstract: Decisions about sharing personal information are not trivial, since there are many legitimate and important purposes for such data collection, but often the collected data can reveal sensitive information about individuals. Privacy-preserving technologies, such as differential privacy (DP), can be employed to protect the privacy of individuals and, furthermore, provide mathematically sound guarantees on the maximum privacy risk. However, they can only support informed privacy decisions if individuals understand the provided privacy guarantees. This article proposes a novel approach for communicating privacy guarantees to support individuals in their privacy decisions when sharing data. For this, we adopt risk communication formats from the medical domain in conjunction with a model for privacy guarantees of DP to create quantitative privacy risk notifications. We conducted a crowd-sourced study with 343 participants to evaluate how well our notifications conveyed the privacy risk information and how confident participants were about their own understanding of the privacy risk. Our findings suggest that these new notifications can communicate the objective information similarly well to currently used qualitative notifications, but left individuals less confident in their understanding. We also discovered that several of our notifications and the currently used qualitative notification disadvantage individuals with low numeracy: these individuals appear overconfident compared to their actual understanding of the associated privacy risks and are, therefore, less likely to seek the needed additional information before an informed decision. The promising results allow for multiple directions in future research, for example, adding visual aids or tailoring privacy risk communication to characteristics of the individuals.

Submitted 23 August, 2022; originally announced August 2022.

Comments: Accepted to ACM CCS 2022. arXiv admin note: substantial text overlap with arXiv:2204.04061

arXiv:2204.04061

"Am I Private and If So, how Many?" -- Using Risk Communication Formats for Making Differential Privacy Understandable

Authors: Daniel Franzen, Saskia Nuñez von Voigt, Peter Sörries, Florian Tschorsch, Claudia Müller-Birn

Abstract: Mobility data is essential for cities and communities to identify areas for necessary improvement. Data collected by mobility providers already contains all the information necessary, but privacy of the individuals needs to be preserved. Differential privacy (DP) defines a mathematical property which guarantees that certain limits of privacy are preserved while sharing such data, but its functionality and privacy protection are difficult to explain to laypeople. In this paper, we adapt risk communication formats in conjunction with a model for the privacy risks of DP. The results are privacy notifications which explain the risk to an individual's privacy when using DP, rather than DP's functionality. We evaluate these novel privacy communication formats in a crowdsourced study. We find that they perform similarly to the best performing DP communications used currently in terms of objective understanding, but did not make our participants as confident in their understanding. We also discovered an influence, similar to the Dunning-Kruger effect, of the statistical numeracy on the effectiveness of some of our privacy communication formats and the DP communication format used currently. These results generate hypotheses in multiple directions, for example, toward the use of risk visualization to improve the understandability of our formats or toward adaptive user interfaces which tailor the risk communication to the characteristics of the reader.

Submitted 22 June, 2023; v1 submitted 8 April, 2022; originally announced April 2022.

Comments: A newer version of this article was submitted: arXiv:2208.10820

arXiv:2107.06590 [pdf, other]

doi 10.1145/3465481.3465769

Self-Determined Reciprocal Recommender System with Strong Privacy Guarantees

Authors: S. Nuñez von Voigt, E. Daniel, F. Tschorsch

Abstract: Recommender systems are widely used. Usually, recommender systems are based on a centralized client-server architecture. However, this approach implies drawbacks regarding the privacy of users. In this paper, we propose a distributed reciprocal recommender system with strong, self-determined privacy guarantees, i.e., local differential privacy. More precisely, users randomize their profiles locally and exchange them via a peer-to-peer network. Recommendations are then computed and ranked locally by estimating similarities between profiles. We evaluate recommendation accuracy of a job recommender system and demonstrate that our method provides acceptable utility under strong privacy requirements.

Submitted 14 July, 2021; originally announced July 2021.

Comments: Accepted at The 16th International Conference on Availability, Reliability and Security (ARES 2021)

arXiv:2106.00388 [pdf, ps, other]

doi 10.1145/3468877

Privacy and Confidentiality in Process Mining -- Threats and Research Challenges

Authors: Gamal Elkoumy, Stephan A. Fahrenkrog-Petersen, Mohammadreza Fani Sani, Agnes Koschmider, Felix Mannhardt, Saskia Nuñez von Voigt, Majid Rafiei, Leopold von Waldthausen

Abstract: Privacy and confidentiality are very important prerequisites for applying process mining in order to comply with regulations and keep company secrets. This paper provides a foundation for future research on privacy-preserving and confidential process mining techniques. Main threats are identified and related to a motivating application scenario in a hospital context as well as to the current body of work on privacy and confidentiality in process mining. A newly developed conceptual model structures the discussion that existing techniques leave room for improvement. This results in a number of important research challenges that should be addressed by future process mining research.

Submitted 1 June, 2021; originally announced June 2021.

Comments: Accepted for publication in ACM Transactions on Management Information Systems

arXiv:2008.12282 [pdf, other]

doi 10.1007/978-3-030-66172-4_17

Every Query Counts: Analyzing the Privacy Loss of Exploratory Data Analyses

Authors: Saskia Nuñez von Voigt, Mira Pauli, Johanna Reichert, Florian Tschorsch

Abstract: An exploratory data analysis is an essential step for every data analyst to gain insights, evaluate data quality and (if required) select a machine learning model for further processing. While privacy-preserving machine learning is on the rise, more often than not this initial analysis is not counted towards the privacy budget. In this paper, we quantify the privacy loss for basic statistical functions and highlight the importance of taking it into account when calculating the privacy-loss budget of a machine learning approach.

Submitted 27 August, 2020; originally announced August 2020.

Comments: Accepted paper for DPM 2020, co-located with ESORICS 2020

arXiv:2003.10707 [pdf, other]

doi 10.1007/978-3-030-49435-3_16

Quantifying the Re-identification Risk of Event Logs for Process Mining

Authors: S. Nuñez von Voigt, S. A. Fahrenkrog-Petersen, D. Janssen, A. Koschmider, F. Tschorsch, F. Mannhardt, O. Landsiedel, M. Weidlich

Abstract: Event logs recorded during the execution of business processes constitute a valuable source of information. Applying process mining techniques to them, event logs may reveal the actual process execution and enable reasoning on quantitative or qualitative process properties. However, event logs often contain sensitive information that could be related to individual process stakeholders through background information and cross-correlation. We therefore argue that, when publishing event logs, the risk of such re-identification attacks must be considered. In this paper, we show how to quantify the re-identification risk with measures for the individual uniqueness in event logs. We also report on a large-scale study that explored the individual uniqueness in a collection of publicly available event logs. Our results suggest that potentially up to all of the cases in an event log may be re-identified, which highlights the importance of privacy-preserving techniques in process mining.

Submitted 19 June, 2020; v1 submitted 24 March, 2020; originally announced March 2020.

Comments: Accepted to CAiSE-2020

Journal ref: CAiSE 2020: Advanced Information Systems Engineering pp 252-267

arXiv:1801.07502 [pdf]

doi 10.1021/acs.nanolett.7b01078

ZnO Nanocrystal Networks Near the Insulator-Metal Transition: Tuning Contact Radius and Electron Density with Intense Pulsed Light

Authors: Benjamin L. Greenberg, Zachary L. Robinson, K. V. Reich, Claudia Gorynski, Bryan N. Voigt, Lorraine F. Francis, B. I. Shklovskii, Eray S. Aydil, Uwe R. Kortshagen

Abstract: Networks of ligand-free semiconductor nanocrystals (NCs) offer a valuable combination of high carrier mobility and optoelectronic properties tunable via quantum confinement. In principle, maximizing carrier mobility entails crossing the insulator-metal transition (IMT), where carriers become delocalized. A recent theoretical study predicted that this transition occurs at nρ^3 ~ 0.3, where n is the carrier density and ρ is the interparticle contact radius. In this work, we satisfy this criterion in networks of plasma-synthesized ZnO NCs by using intense pulsed light (IPL) annealing to tune n and ρ independently. IPL applied to as-deposited NCs increases ρ by inducing sintering, and IPL applied after the NCs are coated with Al2O3 by atomic layer deposition increases n by removing electron-trapping surface hydroxyls. This procedure does not substantially alter NC size or composition and is potentially applicable to a wide variety of nanomaterials. As we increase nρ^3 to at least twice the predicted critical value, we observe conductivity scaling consistent with arrival at the critical region of a continuous quantum phase transition. This allows us to determine the critical behavior of the dielectric constant and electron localization length at the IMT. However, our samples remain on the insulating side of the critical region, which suggests that the critical value of nρ^3 may in fact be significantly higher than 0.3.

Submitted 23 January, 2018; originally announced January 2018.

Journal ref: Nano Letters 17 (8), 4634-4642 (2017)

Showing 1–11 of 11 results for author: Voigt, N