-
Gransformer: Transformer-based Graph Generation
Authors:
Ahmad Khajenezhad,
Seyed Ali Osia,
Mahmood Karimian,
Hamid Beigy
Abstract:
Transformers have become widely used in various tasks, such as natural language processing and machine vision. This paper proposes Gransformer, an algorithm based on Transformer for generating graphs. We modify the Transformer encoder to exploit the structural information of the given graph. The attention mechanism is adapted to consider the presence or absence of edges between each pair of nodes.…
▽ More
Transformers have become widely used in various tasks, such as natural language processing and machine vision. This paper proposes Gransformer, an algorithm based on Transformer for generating graphs. We modify the Transformer encoder to exploit the structural information of the given graph. The attention mechanism is adapted to consider the presence or absence of edges between each pair of nodes. We also introduce a graph-based familiarity measure between node pairs that applies to both the attention and the positional encoding. This measure of familiarity is based on message-passing algorithms and contains structural information about the graph. Also, this measure is autoregressive, which allows our model to acquire the necessary conditional probabilities in a single forward pass. In the output layer, we also use a masked autoencoder for density estimation to efficiently model the sequential generation of dependent edges connected to each node. In addition, we propose a technique to prevent the model from generating isolated nodes without connection to preceding nodes by using BFS node orderings. We evaluate this method using synthetic and real-world datasets and compare it with related ones, including recurrent models and graph convolutional networks. Experimental results show that the proposed method performs comparatively to these methods.
△ Less
Submitted 30 May, 2024; v1 submitted 25 March, 2022;
originally announced March 2022.
-
RC-RNN: Reconfigurable Cache Architecture for Storage Systems Using Recurrent Neural Networks
Authors:
Shahriar Ebrahimi,
Reza Salkhordeh,
Seyed Ali Osia,
Ali Taheri,
Hamid Reza Rabiee,
Hossein Asadi
Abstract:
Solid-State Drives (SSDs) have significant performance advantages over traditional Hard Disk Drives (HDDs) such as lower latency and higher throughput. Significantly higher price per capacity and limited lifetime, however, prevents designers to completely substitute HDDs by SSDs in enterprise storage systems. SSD-based caching has recently been suggested for storage systems to benefit from higher…
▽ More
Solid-State Drives (SSDs) have significant performance advantages over traditional Hard Disk Drives (HDDs) such as lower latency and higher throughput. Significantly higher price per capacity and limited lifetime, however, prevents designers to completely substitute HDDs by SSDs in enterprise storage systems. SSD-based caching has recently been suggested for storage systems to benefit from higher performance of SSDs while minimizing the overall cost. While conventional caching algorithms such as Least Recently Used (LRU) provide high hit ratio in processors, due to the highly random behavior of Input/Output (I/O) workloads, they hardly provide the required performance level for storage systems. In addition to poor performance, inefficient algorithms also shorten SSD lifetime with unnecessary cache replacements. Such shortcomings motivate us to benefit from more complex non-linear algorithms to achieve higher cache performance and extend SSD lifetime. In this paper, we propose RC-RNN, the first reconfigurable SSD-based cache architecture for storage systems that utilizes machine learning to identify performance-critical data pages for I/O caching. The proposed architecture uses Recurrent Neural Networks (RNN) to characterize ongoing workloads and optimize itself towards higher cache performance while improving SSD lifetime. RC-RNN attempts to learn characteristics of the running workload to predict its behavior and then uses the collected information to identify performance-critical data pages to fetch into the cache. Experimental results show that RC-RNN characterizes workloads with an accuracy up to 94.6% for SNIA I/O workloads. RC-RNN can perform similarly to the optimal cache algorithm by an accuracy of 95% on average, and outperforms previous SSD caching architectures by providing up to 7x higher hit ratio and decreasing cache replacements by up to 2x.
△ Less
Submitted 5 November, 2021;
originally announced November 2021.
-
Privacy Against Brute-Force Inference Attacks
Authors:
Seyed Ali Osia,
Borzoo Rassouli,
Hamed Haddadi,
Hamid R. Rabiee,
Deniz Gündüz
Abstract:
Privacy-preserving data release is about disclosing information about useful data while retaining the privacy of sensitive data. Assuming that the sensitive data is threatened by a brute-force adversary, we define Guessing Leakage as a measure of privacy, based on the concept of guessing. After investigating the properties of this measure, we derive the optimal utility-privacy trade-off via a line…
▽ More
Privacy-preserving data release is about disclosing information about useful data while retaining the privacy of sensitive data. Assuming that the sensitive data is threatened by a brute-force adversary, we define Guessing Leakage as a measure of privacy, based on the concept of guessing. After investigating the properties of this measure, we derive the optimal utility-privacy trade-off via a linear program with any $f$-information adopted as the utility measure, and show that the optimal utility is a concave and piece-wise linear function of the privacy-leakage budget.
△ Less
Submitted 1 February, 2019;
originally announced February 2019.
-
Deep Private-Feature Extraction
Authors:
Seyed Ali Osia,
Ali Taheri,
Ali Shahin Shamsabadi,
Kleomenis Katevas,
Hamed Haddadi,
Hamid R. Rabiee
Abstract:
We present and evaluate Deep Private-Feature Extractor (DPFE), a deep model which is trained and evaluated based on information theoretic constraints. Using the selective exchange of information between a user's device and a service provider, DPFE enables the user to prevent certain sensitive information from being shared with a service provider, while allowing them to extract approved information…
▽ More
We present and evaluate Deep Private-Feature Extractor (DPFE), a deep model which is trained and evaluated based on information theoretic constraints. Using the selective exchange of information between a user's device and a service provider, DPFE enables the user to prevent certain sensitive information from being shared with a service provider, while allowing them to extract approved information using their model. We introduce and utilize the log-rank privacy, a novel measure to assess the effectiveness of DPFE in removing sensitive information and compare different models based on their accuracy-privacy tradeoff. We then implement and evaluate the performance of DPFE on smartphones to understand its complexity, resource demands, and efficiency tradeoffs. Our results on benchmark image datasets demonstrate that under moderate resource utilization, DPFE can achieve high accuracy for primary tasks while preserving the privacy of sensitive features.
△ Less
Submitted 28 February, 2018; v1 submitted 9 February, 2018;
originally announced February 2018.
-
Privacy-Preserving Deep Inference for Rich User Data on The Cloud
Authors:
Seyed Ali Osia,
Ali Shahin Shamsabadi,
Ali Taheri,
Kleomenis Katevas,
Hamid R. Rabiee,
Nicholas D. Lane,
Hamed Haddadi
Abstract:
Deep neural networks are increasingly being used in a variety of machine learning applications applied to rich user data on the cloud. However, this approach introduces a number of privacy and efficiency challenges, as the cloud operator can perform secondary inferences on the available data. Recently, advances in edge processing have paved the way for more efficient, and private, data processing…
▽ More
Deep neural networks are increasingly being used in a variety of machine learning applications applied to rich user data on the cloud. However, this approach introduces a number of privacy and efficiency challenges, as the cloud operator can perform secondary inferences on the available data. Recently, advances in edge processing have paved the way for more efficient, and private, data processing at the source for simple tasks and lighter models, though they remain a challenge for larger, and more complicated models. In this paper, we present a hybrid approach for breaking down large, complex deep models for cooperative, privacy-preserving analytics. We do this by breaking down the popular deep architectures and fine-tune them in a particular way. We then evaluate the privacy benefits of this approach based on the information exposed to the cloud service. We also asses the local inference cost of different layers on a modern handset for mobile applications. Our evaluations show that by using certain kind of fine-tuning and embedding techniques and at a small processing costs, we can greatly reduce the level of information available to unintended tasks applied to the data feature on the cloud, and hence achieving the desired tradeoff between privacy and performance.
△ Less
Submitted 11 October, 2017; v1 submitted 4 October, 2017;
originally announced October 2017.
-
A Hybrid Deep Learning Architecture for Privacy-Preserving Mobile Analytics
Authors:
Seyed Ali Osia,
Ali Shahin Shamsabadi,
Sina Sajadmanesh,
Ali Taheri,
Kleomenis Katevas,
Hamid R. Rabiee,
Nicholas D. Lane,
Hamed Haddadi
Abstract:
Internet of Things (IoT) devices and applications are being deployed in our homes and workplaces. These devices often rely on continuous data collection to feed machine learning models. However, this approach introduces several privacy and efficiency challenges, as the service operator can perform unwanted inferences on the available data. Recently, advances in edge processing have paved the way f…
▽ More
Internet of Things (IoT) devices and applications are being deployed in our homes and workplaces. These devices often rely on continuous data collection to feed machine learning models. However, this approach introduces several privacy and efficiency challenges, as the service operator can perform unwanted inferences on the available data. Recently, advances in edge processing have paved the way for more efficient, and private, data processing at the source for simple tasks and lighter models, though they remain a challenge for larger, and more complicated models. In this paper, we present a hybrid approach for breaking down large, complex deep neural networks for cooperative, privacy-preserving analytics. To this end, instead of performing the whole operation on the cloud, we let an IoT device to run the initial layers of the neural network, and then send the output to the cloud to feed the remaining layers and produce the final result. In order to ensure that the user's device contains no extra information except what is necessary for the main task and preventing any secondary inference on the data, we introduce Siamese fine-tuning. We evaluate the privacy benefits of this approach based on the information exposed to the cloud service. We also assess the local inference cost of different layers on a modern handset. Our evaluations show that by using Siamese fine-tuning and at a small processing cost, we can greatly reduce the level of unnecessary, potentially sensitive information in the personal data, and thus achieving the desired trade-off between utility, privacy, and performance.
△ Less
Submitted 26 December, 2019; v1 submitted 8 March, 2017;
originally announced March 2017.
-
Kissing Cuisines: Exploring Worldwide Culinary Habits on the Web
Authors:
Sina Sajadmanesh,
Sina Jafarzadeh,
Seyed Ali Osia,
Hamid R. Rabiee,
Hamed Haddadi,
Yelena Mejova,
Mirco Musolesi,
Emiliano De Cristofaro,
Gianluca Stringhini
Abstract:
Food and nutrition occupy an increasingly prevalent space on the web, and dishes and recipes shared online provide an invaluable mirror into culinary cultures and attitudes around the world. More specifically, ingredients, flavors, and nutrition information become strong signals of the taste preferences of individuals and civilizations. However, there is little understanding of these palate variet…
▽ More
Food and nutrition occupy an increasingly prevalent space on the web, and dishes and recipes shared online provide an invaluable mirror into culinary cultures and attitudes around the world. More specifically, ingredients, flavors, and nutrition information become strong signals of the taste preferences of individuals and civilizations. However, there is little understanding of these palate varieties. In this paper, we present a large-scale study of recipes published on the web and their content, aiming to understand cuisines and culinary habits around the world. Using a database of more than 157K recipes from over 200 different cuisines, we analyze ingredients, flavors, and nutritional values which distinguish dishes from different regions, and use this knowledge to assess the predictability of recipes from different cuisines. We then use country health statistics to understand the relation between these factors and health indicators of different nations, such as obesity, diabetes, migration, and health expenditure. Our results confirm the strong effects of geographical and cultural similarities on recipes, health indicators, and culinary preferences across the globe.
△ Less
Submitted 25 April, 2017; v1 submitted 26 October, 2016;
originally announced October 2016.