-
When Machine Learning Meets Privacy: A Survey and Outlook
Authors:
Bo Liu,
Ming Ding,
Sina Shaham,
Wenny Rahayu,
Farhad Farokhi,
Zihuai Lin
Abstract:
The newly emerged machine learning (e.g. deep learning) methods have become a strong driving force to revolutionize a wide range of industries, such as smart healthcare, financial technology, and surveillance systems. Meanwhile, privacy has emerged as a big concern in this machine learning-based artificial intelligence era. It is important to note that the problem of privacy preservation in the co…
▽ More
The newly emerged machine learning (e.g. deep learning) methods have become a strong driving force to revolutionize a wide range of industries, such as smart healthcare, financial technology, and surveillance systems. Meanwhile, privacy has emerged as a big concern in this machine learning-based artificial intelligence era. It is important to note that the problem of privacy preservation in the context of machine learning is quite different from that in traditional data privacy protection, as machine learning can act as both friend and foe. Currently, the work on the preservation of privacy and machine learning (ML) is still in an infancy stage, as most existing solutions only focus on privacy problems during the machine learning process. Therefore, a comprehensive study on the privacy preservation problems and machine learning is required. This paper surveys the state of the art in privacy issues and solutions for machine learning. The survey covers three categories of interactions between privacy and machine learning: (i) private machine learning, (ii) machine learning aided privacy protection, and (iii) machine learning-based privacy attack and corresponding protection schemes. The current research progress in each category is reviewed and the key challenges are identified. Finally, based on our in-depth analysis of the area of privacy and machine learning, we point out future research directions in this field.
△ Less
Submitted 23 November, 2020;
originally announced November 2020.
-
Distributed Training of Deep Learning Models: A Taxonomic Perspective
Authors:
Matthias Langer,
Zhen He,
Wenny Rahayu,
Yanbo Xue
Abstract:
Distributed deep learning systems (DDLS) train deep neural network models by utilizing the distributed resources of a cluster. Developers of DDLS are required to make many decisions to process their particular workloads in their chosen environment efficiently. The advent of GPU-based deep learning, the ever-increasing size of datasets and deep neural network models, in combination with the bandwid…
▽ More
Distributed deep learning systems (DDLS) train deep neural network models by utilizing the distributed resources of a cluster. Developers of DDLS are required to make many decisions to process their particular workloads in their chosen environment efficiently. The advent of GPU-based deep learning, the ever-increasing size of datasets and deep neural network models, in combination with the bandwidth constraints that exist in cluster environments require developers of DDLS to be innovative in order to train high quality models quickly. Comparing DDLS side-by-side is difficult due to their extensive feature lists and architectural deviations. We aim to shine some light on the fundamental principles that are at work when training deep neural networks in a cluster of independent machines by analyzing the general properties associated with training deep learning models and how such workloads can be distributed in a cluster to achieve collaborative model training. Thereby we provide an overview of the different techniques that are used by contemporary DDLS and discuss their influence and implications on the training process. To conceptualize and compare DDLS, we group different techniques into categories, thus establishing a taxonomy of distributed deep learning systems.
△ Less
Submitted 8 July, 2020;
originally announced July 2020.
-
A Policy Model and Framework for Context-Aware Access Control to Information Resources
Authors:
A. S. M. Kayes,
Jun Han,
Wenny Rahayu,
Md. Saiful Islam,
Alan Colman
Abstract:
In today's dynamic ICT environments, the ability to control users' access to resources becomes ever important. On the one hand, it should adapt to the users' changing needs; on the other hand, it should not be compromised. Therefore, it is essential to have a flexible access control model, incorporating dynamically changing context information. Towards this end, this paper introduces a policy fram…
▽ More
In today's dynamic ICT environments, the ability to control users' access to resources becomes ever important. On the one hand, it should adapt to the users' changing needs; on the other hand, it should not be compromised. Therefore, it is essential to have a flexible access control model, incorporating dynamically changing context information. Towards this end, this paper introduces a policy framework for context-aware access control (CAAC) applications that extends the role-based access control model with both dynamic associations of user-role and role-permission capabilities. We first present a formal model of CAAC policies for our framework. Using this model, we then introduce an ontology-based approach and a software prototype for modelling and enforcing CAAC policies. In addition, we evaluate our policy ontology model and framework by considering (i) the completeness of the ontology concepts, specifying different context-aware user-role and role-permission assignment policies from the healthcare scenarios; (ii) the correctness and consistency of the ontology semantics, assessing the core and domain-specific ontologies through the healthcare case study; and (iii) the performance of the framework by means of response time. The evaluation results demonstrate the feasibility of our framework and quantify the performance overhead of achieving context-aware access control to information resources.
△ Less
Submitted 6 March, 2017;
originally announced March 2017.
-
Frequent Query Matching in Dynamic Data Warehousing
Authors:
Charles H. Goonetilleke,
J. Wenny Rahayu,
Md. Saiful Islam
Abstract:
With the need for flexible and on-demand decision support, Dynamic Data Warehouses (DDW) provide benefits over traditional data warehouses due to their dynamic characteristics in structuring and access mechanism. A DDW is a data framework that accommodates data source changes easily to allow seamless querying to users. Materialized Views (MV) are proven to be an effective methodology to enhance th…
▽ More
With the need for flexible and on-demand decision support, Dynamic Data Warehouses (DDW) provide benefits over traditional data warehouses due to their dynamic characteristics in structuring and access mechanism. A DDW is a data framework that accommodates data source changes easily to allow seamless querying to users. Materialized Views (MV) are proven to be an effective methodology to enhance the process of retrieving data from a DDW as results are pre-computed and stored in it. However, due to the static nature of materialized views, the level of dynamicity that can be provided at the MV access layer is restricted. As a result, the collection of materialized views is not compatible with ever-changing reporting requirements. It is important that the MV collection is consistent with current and upcoming queries. The solution to the above problem must consider the following aspects: (a) MV must be matched against an OLAP query in order to recognize whether the MV can answer the query, (b) enable scalability in the MV collection, an intuitive mechanism to prune it and retrieve closely matching MVs must be incorporated, (c) MV collection must be able to evolve in correspondence to the regularly changing user query patterns. Therefore, the primary objective of this paper is to explore these aspects and provide a well-rounded solution for the MV access layer to remove the mismatch between the MV collection and reporting requirements. Our contribution to solve the problem includes a Query Matching Technique, a Domain Matching Technique and Maintenance of the MV collection. We developed an experimental platform using real data-sets to evaluate the effectiveness in terms of performance and precision of the proposed techniques.
△ Less
Submitted 6 March, 2017;
originally announced March 2017.
-
Computing Influence of a Product through Uncertain Reverse Skyline
Authors:
Md. Saiful Islam,
Wenny Rahayu,
Chengfei Liu,
Tarique Anwar,
Bela Stantic
Abstract:
Understanding the influence of a product is crucially important for making informed business decisions. This paper introduces a new type of skyline queries, called uncertain reverse skyline, for measuring the influence of a probabilistic product in uncertain data settings. More specifically, given a dataset of probabilistic products P and a set of customers C, an uncertain reverse skyline of a pro…
▽ More
Understanding the influence of a product is crucially important for making informed business decisions. This paper introduces a new type of skyline queries, called uncertain reverse skyline, for measuring the influence of a probabilistic product in uncertain data settings. More specifically, given a dataset of probabilistic products P and a set of customers C, an uncertain reverse skyline of a probabilistic product q retrieves all customers c in C which include q as one of their preferred products. We present efficient pruning ideas and techniques for processing the uncertain reverse skyline query of a probabilistic product using R-Tree data index. We also present an efficient parallel approach to compute the uncertain reverse skyline and influence score of a probabilistic product. Our approach significantly outperforms the baseline approach derived from the existing literature. The efficiency of our approach is demonstrated by conducting extensive experiments with both real and synthetic datasets.
△ Less
Submitted 21 February, 2017;
originally announced February 2017.