Search | arXiv e-print repository

Does Black-box Attribute Inference Attacks on Graph Neural Networks Constitute Privacy Risk?

Authors: Iyiola E. Olatunji, Anmar Hizber, Oliver Sihlovec, Megha Khosla

Abstract: Graph neural networks (GNNs) have shown promising results on real-life datasets and applications, including healthcare, finance, and education. However, recent studies have shown that GNNs are highly vulnerable to attacks such as membership inference attack and link reconstruction attack. Surprisingly, attribute inference attacks have received little attention. In this paper, we initiate the first investigation into attribute inference attack where an attacker aims to infer the sensitive user attributes based on her public or non-sensitive attributes. We ask the question whether black-box attribute inference attack constitutes a significant privacy risk for graph-structured data and their corresponding GNN model. We take a systematic approach to launch the attacks by varying the adversarial knowledge and assumptions. Our findings reveal that when an attacker has black-box access to the target model, GNNs generally do not reveal significantly more information compared to missing value estimation techniques. Code is available.

Submitted 1 June, 2023; originally announced June 2023.

arXiv:2206.14724

doi 10.56553/popets-2023-0041

Private Graph Extraction via Feature Explanations

Authors: Iyiola E. Olatunji, Mandeep Rathee, Thorben Funke, Megha Khosla

Abstract: Privacy and interpretability are two important ingredients for achieving trustworthy machine learning. We study the interplay of these two aspects in graph machine learning through graph reconstruction attacks. The goal of the adversary here is to reconstruct the graph structure of the training data given access to model explanations. Based on the different kinds of auxiliary information available to the adversary, we propose several graph reconstruction attacks. We show that additional knowledge of post-hoc feature explanations substantially increases the success rate of these attacks. Further, we investigate in detail the differences between attack performance with respect to three different classes of explanation methods for graph neural networks: gradient-based, perturbation-based, and surrogate model-based methods. While gradient-based explanations reveal the most in terms of the graph structure, we find that these explanations do not always score high in utility. For the other two classes of explanations, privacy leakage increases with an increase in explanation utility. Finally, we propose a defense based on a randomized response mechanism for releasing the explanations, which substantially reduces the attack success rate. Our code is available at https://github.com/iyempissy/graph-stealing-attacks-with-explanation

Submitted 2 November, 2023; v1 submitted 29 June, 2022; originally announced June 2022.

Comments: Accepted at PETS 2023

Journal ref: Proceedings of the 23rd Privacy Enhancing Technologies Symposium (PETS), 2023

arXiv:2109.08907

Releasing Graph Neural Networks with Differential Privacy Guarantees

Authors: Iyiola E. Olatunji, Thorben Funke, Megha Khosla

Abstract: With the increasing popularity of graph neural networks (GNNs) in several sensitive applications like healthcare and medicine, concerns have been raised over the privacy aspects of trained GNNs. More notably, GNNs are vulnerable to privacy attacks, such as membership inference attacks, even if only black-box access to the trained model is granted. We propose PrivGNN, a privacy-preserving framework for releasing GNN models in a centralized setting. Assuming access to a public unlabeled graph, PrivGNN provides a framework to release GNN models trained explicitly on public data along with knowledge obtained from the private data in a privacy-preserving manner. PrivGNN combines the knowledge-distillation framework with two noise mechanisms, random subsampling and noisy labeling, to ensure rigorous privacy guarantees. We theoretically analyze our approach in the Rényi differential privacy framework. Besides, we show the solid experimental performance of our method compared to several baselines adapted for graph-structured data. Our code is available at https://github.com/iyempissy/privGnn.

Submitted 2 November, 2023; v1 submitted 18 September, 2021; originally announced September 2021.

Comments: Published in TMLR 2023

Journal ref: Transactions on Machine Learning Research (TMLR), 2023

arXiv:2104.07938

Achieving differential privacy for $k$-nearest neighbors based outlier detection by data partitioning

Authors: Jens Rauch, Iyiola E. Olatunji, Megha Khosla

Abstract: When applying outlier detection in settings where data is sensitive, mechanisms which guarantee the privacy of the underlying data are needed. The $k$-nearest neighbors ($k$-NN) algorithm is simple and one of the most effective methods for outlier detection. So far, there have been no attempts made to develop a differentially private ($ε$-DP) approach for $k$-NN based outlier detection. Existing approaches often relax the notion of $ε$-DP and employ other methods than $k$-NN. We propose a method for $k$-NN based outlier detection by separating the procedure into a fitting step on reference inlier data and then applying the outlier classifier to new data. We achieve $ε$-DP for both the fitting algorithm and the outlier classifier with respect to the reference data by partitioning the dataset into a uniform grid, which yields low global sensitivity. Our approach yields nearly optimal performance on real-world data with varying dimensions when compared to the non-private versions of $k$-NN.

Submitted 16 April, 2021; originally announced April 2021.

arXiv:2104.06523

doi 10.1089/big.2021.0169

A Review of Anonymization for Healthcare Data

Authors: Iyiola E. Olatunji, Jens Rauch, Matthias Katzensteiner, Megha Khosla

Abstract: Mining health data can lead to faster medical decisions, improvement in the quality of treatment, disease prevention, and reduced cost, and it drives innovative solutions within the healthcare sector. However, health data is highly sensitive and subject to regulations such as the General Data Protection Regulation (GDPR), which aims to ensure patient privacy. Anonymization or removal of patient identifiable information, though the most conventional way, is the first important step to adhere to the regulations and incorporate privacy concerns. In this paper, we review the existing anonymization techniques and their applicability to various types (relational and graph-based) of health data. Besides, we provide an overview of possible attacks on anonymized data. We illustrate via a reconstruction attack that anonymization, though necessary, is not sufficient to address patient privacy and discuss methods for protecting against such attacks. Finally, we discuss tools that can be used to achieve anonymization.

Submitted 13 April, 2021; originally announced April 2021.

Journal ref: Big Data (2022)

arXiv:2101.06570

Membership Inference Attack on Graph Neural Networks

Authors: Iyiola E. Olatunji, Wolfgang Nejdl, Megha Khosla

Abstract: Graph Neural Networks (GNNs), which generalize traditional deep neural networks on graph data, have achieved state-of-the-art performance on several graph analytical tasks. We focus on how trained GNN models could leak information about the member nodes that they were trained on. We introduce two realistic settings for performing a membership inference (MI) attack on GNNs. While choosing the simplest possible attack model that utilizes the posteriors of the trained model (black-box access), we thoroughly analyze the properties of GNNs and the datasets which dictate the differences in their robustness towards MI attack. While in traditional machine learning models, overfitting is considered the main cause of such leakage, we show that in GNNs the additional structural information is the major contributing factor. We support our findings by extensive experiments on four representative GNN models. To prevent MI attacks on GNN, we propose two effective defenses that significantly decrease the attacker's inference by up to 60% without degradation to the target model's performance. Our code is available at https://github.com/iyempissy/rebMIGraph.

Submitted 18 December, 2021; v1 submitted 16 January, 2021; originally announced January 2021.

Comments: Best student paper award, IEEE TPS 21

arXiv:2004.13078

doi 10.1007/978-3-030-42835-8_6

Context-aware Helpfulness Prediction for Online Product Reviews

Authors: Iyiola E. Olatunji, Xin Li, Wai Lam

Abstract: Modeling and prediction of review helpfulness has become more predominant due to the proliferation of e-commerce websites and online shops. Since the functionality of a product cannot be tested before buying, people often rely on different kinds of user reviews to decide whether or not to buy a product. However, quality reviews might be buried deep in the heap of a large amount of reviews. Therefore, recommending reviews to customers based on the review quality is of the essence. Since there is no direct indication of review quality, most reviews use the information that "X out of Y" users found the review helpful for obtaining the review quality. However, this approach undermines helpfulness prediction because not all reviews have statistically abundant votes. In this paper, we propose a neural deep learning model that predicts the helpfulness score of a review. This model is based on a convolutional neural network (CNN) and a context-aware encoding mechanism which can directly capture relationships between words irrespective of their distance in a long sequence. We validated our model on a human-annotated dataset, and the results show that our model significantly outperforms existing models for helpfulness prediction.

Submitted 27 April, 2020; originally announced April 2020.

Comments: Published as a proceeding paper in AIRS 2019

arXiv:1807.00139

Harnessing constrained resources in service industry via video analytics

Authors: Chun-Hung Cheng, Iyiola E. Olatunji

Abstract: Service industries contribute significantly to many developed and developing economies. As their business activities expand rapidly, many service companies struggle to maintain customer satisfaction due to sluggish service response caused by resource shortages. Anticipating resource shortages and proffering solutions before they happen is an effective way of reducing the adverse effect on operations. However, this proactive approach is very expensive in terms of capacity and labor costs. Many companies fall into a productivity conundrum as they fail to find sufficiently strong arguments to justify the cost of a new technology yet cannot afford not to invest in new technologies to match up with competitors. The question is whether there is an innovative solution to maximally utilize available resources and drastically reduce the effect that the shortages of resources may cause while still achieving a high level of service quality at a low cost. Through a practical analysis of a trolley tracking system we designed and deployed at Hong Kong International Airport (HKIA), this work demonstrates how video analytics helps achieve management's goal of satisfying customers' needs via real-time detection and prevention of problems they may encounter during the service consumption process, using existing video technology rather than adopting new technologies. This paper presents the integration of a commercial video surveillance system with deep learning algorithms for video analytics. We show that our system can make accurate decisions when faced with total or partial occlusion and that it significantly improves daily operation. It is envisioned that this work will heighten the appreciation of integrative technologies for resource management within the service industries and as a measure for real-time customer assistance.

Submitted 30 June, 2018; originally announced July 2018.

Comments: Accepted to appear in Archives of Industrial Engineering Journal

arXiv:1801.07633

doi 10.1088/1742-6596/1069/1/012148

Human Activity Recognition for Mobile Robot

Authors: Iyiola E. Olatunji

Abstract: Due to the increasing number of mobile robots, including domestic robots for cleaning and maintenance in developed countries, human activity recognition is inevitable for congruent human-robot interaction. Although this is a challenging task for robots, it is expedient for autonomous mobile robots (AMR) to learn human activities in order to navigate in an uncontrolled environment without any guidance. Building a correct classifier for complex human action is non-trivial since simple actions can be combined to recognize a complex human activity. In this paper, we trained a model for human activity recognition using a convolutional neural network. We trained and validated the model using the Vicon physical action dataset and also tested the model on our generated dataset (VMCUHK). Our experiment shows that our method performs the human activity recognition task with high accuracy on both the Vicon physical action dataset and the VMCUHK dataset.

Submitted 23 January, 2018; originally announced January 2018.

Showing 1–9 of 9 results for author: Olatunji, I E