Search | arXiv e-print repository

arXiv:2011.11819 [pdf, other]

When Machine Learning Meets Privacy: A Survey and Outlook

Authors: Bo Liu, Ming Ding, Sina Shaham, Wenny Rahayu, Farhad Farokhi, Zihuai Lin

Abstract: The newly emerged machine learning (e.g. deep learning) methods have become a strong driving force to revolutionize a wide range of industries, such as smart healthcare, financial technology, and surveillance systems. Meanwhile, privacy has emerged as a big concern in this machine learning-based artificial intelligence era. It is important to note that the problem of privacy preservation in the co… ▽ More The newly emerged machine learning (e.g. deep learning) methods have become a strong driving force to revolutionize a wide range of industries, such as smart healthcare, financial technology, and surveillance systems. Meanwhile, privacy has emerged as a big concern in this machine learning-based artificial intelligence era. It is important to note that the problem of privacy preservation in the context of machine learning is quite different from that in traditional data privacy protection, as machine learning can act as both friend and foe. Currently, the work on the preservation of privacy and machine learning (ML) is still in an infancy stage, as most existing solutions only focus on privacy problems during the machine learning process. Therefore, a comprehensive study on the privacy preservation problems and machine learning is required. This paper surveys the state of the art in privacy issues and solutions for machine learning. The survey covers three categories of interactions between privacy and machine learning: (i) private machine learning, (ii) machine learning aided privacy protection, and (iii) machine learning-based privacy attack and corresponding protection schemes. The current research progress in each category is reviewed and the key challenges are identified. Finally, based on our in-depth analysis of the area of privacy and machine learning, we point out future research directions in this field. △ Less

Submitted 23 November, 2020; originally announced November 2020.

Comments: This work is accepted by ACM Computing Surveys

arXiv:2007.03970 [pdf]

doi 10.1109/TPDS.2020.3003307

Distributed Training of Deep Learning Models: A Taxonomic Perspective

Authors: Matthias Langer, Zhen He, Wenny Rahayu, Yanbo Xue

Abstract: Distributed deep learning systems (DDLS) train deep neural network models by utilizing the distributed resources of a cluster. Developers of DDLS are required to make many decisions to process their particular workloads in their chosen environment efficiently. The advent of GPU-based deep learning, the ever-increasing size of datasets and deep neural network models, in combination with the bandwid… ▽ More Distributed deep learning systems (DDLS) train deep neural network models by utilizing the distributed resources of a cluster. Developers of DDLS are required to make many decisions to process their particular workloads in their chosen environment efficiently. The advent of GPU-based deep learning, the ever-increasing size of datasets and deep neural network models, in combination with the bandwidth constraints that exist in cluster environments require developers of DDLS to be innovative in order to train high quality models quickly. Comparing DDLS side-by-side is difficult due to their extensive feature lists and architectural deviations. We aim to shine some light on the fundamental principles that are at work when training deep neural networks in a cluster of independent machines by analyzing the general properties associated with training deep learning models and how such workloads can be distributed in a cluster to achieve collaborative model training. Thereby we provide an overview of the different techniques that are used by contemporary DDLS and discuss their influence and implications on the training process. To conceptualize and compare DDLS, we group different techniques into categories, thus establishing a taxonomy of distributed deep learning systems. △ Less

Submitted 8 July, 2020; originally announced July 2020.

Journal ref: IEEE Transactions on Parallel and Distributed Systems, 2020, Volume: 31, Issue: 12, Pages: 2802-2818

arXiv:1703.02162 [pdf, other]

A Policy Model and Framework for Context-Aware Access Control to Information Resources

Authors: A. S. M. Kayes, Jun Han, Wenny Rahayu, Md. Saiful Islam, Alan Colman

Abstract: In today's dynamic ICT environments, the ability to control users' access to resources becomes ever important. On the one hand, it should adapt to the users' changing needs; on the other hand, it should not be compromised. Therefore, it is essential to have a flexible access control model, incorporating dynamically changing context information. Towards this end, this paper introduces a policy fram… ▽ More In today's dynamic ICT environments, the ability to control users' access to resources becomes ever important. On the one hand, it should adapt to the users' changing needs; on the other hand, it should not be compromised. Therefore, it is essential to have a flexible access control model, incorporating dynamically changing context information. Towards this end, this paper introduces a policy framework for context-aware access control (CAAC) applications that extends the role-based access control model with both dynamic associations of user-role and role-permission capabilities. We first present a formal model of CAAC policies for our framework. Using this model, we then introduce an ontology-based approach and a software prototype for modelling and enforcing CAAC policies. In addition, we evaluate our policy ontology model and framework by considering (i) the completeness of the ontology concepts, specifying different context-aware user-role and role-permission assignment policies from the healthcare scenarios; (ii) the correctness and consistency of the ontology semantics, assessing the core and domain-specific ontologies through the healthcare case study; and (iii) the performance of the framework by means of response time. The evaluation results demonstrate the feasibility of our framework and quantify the performance overhead of achieving context-aware access control to information resources. △ Less

Submitted 6 March, 2017; originally announced March 2017.

Comments: 26 pages

MSC Class: 68U35

arXiv:1703.01727 [pdf]

Frequent Query Matching in Dynamic Data Warehousing

Authors: Charles H. Goonetilleke, J. Wenny Rahayu, Md. Saiful Islam

Abstract: With the need for flexible and on-demand decision support, Dynamic Data Warehouses (DDW) provide benefits over traditional data warehouses due to their dynamic characteristics in structuring and access mechanism. A DDW is a data framework that accommodates data source changes easily to allow seamless querying to users. Materialized Views (MV) are proven to be an effective methodology to enhance th… ▽ More With the need for flexible and on-demand decision support, Dynamic Data Warehouses (DDW) provide benefits over traditional data warehouses due to their dynamic characteristics in structuring and access mechanism. A DDW is a data framework that accommodates data source changes easily to allow seamless querying to users. Materialized Views (MV) are proven to be an effective methodology to enhance the process of retrieving data from a DDW as results are pre-computed and stored in it. However, due to the static nature of materialized views, the level of dynamicity that can be provided at the MV access layer is restricted. As a result, the collection of materialized views is not compatible with ever-changing reporting requirements. It is important that the MV collection is consistent with current and upcoming queries. The solution to the above problem must consider the following aspects: (a) MV must be matched against an OLAP query in order to recognize whether the MV can answer the query, (b) enable scalability in the MV collection, an intuitive mechanism to prune it and retrieve closely matching MVs must be incorporated, (c) MV collection must be able to evolve in correspondence to the regularly changing user query patterns. Therefore, the primary objective of this paper is to explore these aspects and provide a well-rounded solution for the MV access layer to remove the mismatch between the MV collection and reporting requirements. Our contribution to solve the problem includes a Query Matching Technique, a Domain Matching Technique and Maintenance of the MV collection. We developed an experimental platform using real data-sets to evaluate the effectiveness in terms of performance and precision of the proposed techniques. △ Less

Submitted 6 March, 2017; originally announced March 2017.

Comments: 2 Tables, 29 Figures, submitted to the Elsevier Journal of Systems and Software for possible publication

arXiv:1702.06298 [pdf, other]

Computing Influence of a Product through Uncertain Reverse Skyline

Authors: Md. Saiful Islam, Wenny Rahayu, Chengfei Liu, Tarique Anwar, Bela Stantic

Abstract: Understanding the influence of a product is crucially important for making informed business decisions. This paper introduces a new type of skyline queries, called uncertain reverse skyline, for measuring the influence of a probabilistic product in uncertain data settings. More specifically, given a dataset of probabilistic products P and a set of customers C, an uncertain reverse skyline of a pro… ▽ More Understanding the influence of a product is crucially important for making informed business decisions. This paper introduces a new type of skyline queries, called uncertain reverse skyline, for measuring the influence of a probabilistic product in uncertain data settings. More specifically, given a dataset of probabilistic products P and a set of customers C, an uncertain reverse skyline of a probabilistic product q retrieves all customers c in C which include q as one of their preferred products. We present efficient pruning ideas and techniques for processing the uncertain reverse skyline query of a probabilistic product using R-Tree data index. We also present an efficient parallel approach to compute the uncertain reverse skyline and influence score of a probabilistic product. Our approach significantly outperforms the baseline approach derived from the existing literature. The efficiency of our approach is demonstrated by conducting extensive experiments with both real and synthetic datasets. △ Less

Submitted 21 February, 2017; originally announced February 2017.

Comments: 12 pages, 3 tables, 12 figures, submitted to SSDBM 2017

Showing 1–5 of 5 results for author: Rahayu, W