-
Secure k-Anonymization over Encrypted Databases
Authors:
Manish Kesarwani,
Akshar Kaul,
Stefano Braghin,
Naoise Holohan,
Spiros Antonatos
Abstract:
Data protection algorithms are becoming increasingly important to support modern business needs for facilitating data sharing and data monetization. Anonymization is an important step before data sharing. Several organizations rely on third parties for storing and managing data. However, third parties are often not trusted to store plaintext personal and sensitive data; data encryption is widely adopted to protect against intentional and unintentional attempts to read personal/sensitive data. Traditional encryption schemes do not support operations over the ciphertexts, and thus anonymizing encrypted datasets is not feasible with current approaches. This paper explores the feasibility and depth of implementing a privacy-preserving data publishing workflow over encrypted datasets by leveraging homomorphic encryption. We demonstrate how we can achieve uniqueness discovery, data masking, differential privacy and k-anonymity over encrypted data requiring zero knowledge about the original values. We prove that the security protocols followed by our approach provide strong guarantees against inference attacks. Finally, we experimentally demonstrate the performance of our data publishing workflow components.
Submitted 10 August, 2021;
originally announced August 2021.
-
AnonTokens: tracing re-identification attacks through decoy records
Authors:
Spiros Antonatos,
Stefano Braghin,
Naoise Holohan,
Pol MacAonghusa
Abstract:
Privacy is of the utmost concern when it comes to releasing data to third parties. Data owners rely on anonymization approaches to safeguard the released datasets against re-identification attacks. However, even with strict anonymization in place, re-identification attacks are still a possibility and in many cases a reality. Prior art has focused on providing better anonymization algorithms with minimal loss of information and on how to prevent data disclosure attacks. Our approach tries to tackle the issue of tracing re-identification attacks based on the concept of honeytokens: decoy or "bait" records intended to lure malicious users. While the concept of honeytokens has been widely used in the security domain, this is the first approach to apply the concept to the data privacy domain. Records with high re-identification risk, called AnonTokens, are inserted into anonymized datasets. This work demonstrates the feasibility, detectability and usability of AnonTokens and provides promising results for data owners who want to apply our approach to real use cases. We evaluated our concept with real large-scale population datasets. The results show that the introduction of decoy tokens is feasible without significant impact on the released dataset.
Submitted 24 June, 2019;
originally announced June 2019.
-
The Bounded Laplace Mechanism in Differential Privacy
Authors:
Naoise Holohan,
Spiros Antonatos,
Stefano Braghin,
Pól Mac Aonghusa
Abstract:
The Laplace mechanism is the workhorse of differential privacy, applied to many instances where numerical data is processed. However, the Laplace mechanism can return semantically impossible values, such as negative counts, due to its infinite support. There are two popular solutions to this: (i) bounding/capping the output values and (ii) bounding the mechanism support. In this paper, we show that bounding the mechanism support, while using the parameters of the pure Laplace mechanism, does not typically preserve differential privacy. We also present a robust method to compute the optimal mechanism parameters to achieve differential privacy in such a setting.
Submitted 30 August, 2018;
originally announced August 2018.
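The two options named in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's method: the function names, the bounds, and the choice of rejection sampling to bound the support are assumptions made for illustration. The resampling variant deliberately reuses the pure Laplace scale, which is the setting the abstract says does not typically preserve differential privacy; the paper's optimal parameter computation is not reproduced here.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Pure Laplace mechanism: infinite support, so any real output can occur."""
    return value + laplace_noise(sensitivity / epsilon, rng)


def clamped_laplace(value, sensitivity, epsilon, lower, upper, rng):
    """Option (i): clamp the noisy output into [lower, upper].

    Clamping is post-processing of a differentially private output.
    """
    noisy = laplace_mechanism(value, sensitivity, epsilon, rng)
    return min(max(noisy, lower), upper)


def resampled_laplace(value, sensitivity, epsilon, lower, upper, rng):
    """Option (ii): bound the support by resampling until the output is in range.

    Assumption: the pure Laplace scale is reused unchanged. Per the abstract,
    this does not typically preserve differential privacy.
    """
    while True:
        noisy = laplace_mechanism(value, sensitivity, epsilon, rng)
        if lower <= noisy <= upper:
            return noisy


if __name__ == "__main__":
    rng = random.Random(0)
    # A true count of 0 can come back negative under the pure mechanism.
    print(laplace_mechanism(0.0, 1.0, 0.5, rng))
    print(clamped_laplace(0.0, 1.0, 0.5, 0.0, 10.0, rng))
    print(resampled_laplace(0.0, 1.0, 0.5, 0.0, 10.0, rng))
```

Clamping piles probability mass onto the bounds, while resampling yields a truncated Laplace distribution over the interval; the two variants therefore behave differently near the edges even with identical parameters.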
-
($k$,$ε$)-Anonymity: $k$-Anonymity with $ε$-Differential Privacy
Authors:
Naoise Holohan,
Spiros Antonatos,
Stefano Braghin,
Pól Mac Aonghusa
Abstract:
The explosion in volume and variety of data offers enormous potential for research and commercial use. Increased availability of personal data is of particular interest in enabling highly customised services tuned to individual needs. Preserving the privacy of individuals against reidentification attacks in this fast-moving ecosystem poses significant challenges for a one-size-fits-all approach to anonymisation.
In this paper we present ($k$,$ε$)-anonymisation, an approach that combines the $k$-anonymisation and $ε$-differential privacy models into a single coherent framework, providing privacy guarantees at least as strong as those offered by the individual models. Linking risks of less than 5% are observed in experimental results, even with modest values of $k$ and $ε$.
Our approach is shown to address well-known limitations of $k$-anonymity and $ε$-differential privacy and is validated in an extensive experimental campaign using openly available datasets.
Submitted 4 October, 2017;
originally announced October 2017.