Search | arXiv e-print repository

PANORAMIA: Privacy Auditing of Machine Learning Models without Retraining

Authors: Mishaal Kazmi, Hadrien Lautraite, Alireza Akbari, Mauricio Soroco, Qiaoyue Tang, Tao Wang, Sébastien Gambs, Mathias Lécuyer

Abstract: We introduce a privacy auditing scheme for ML models that relies on membership inference attacks using generated data as "non-members". This scheme, which we call PANORAMIA, quantifies the privacy leakage for large-scale ML models without control of the training process or model re-training and only requires access to a subset of the training data. To demonstrate its applicability, we evaluate our auditing scheme across multiple ML domains, ranging from image and tabular data classification to large-scale language models.

Submitted 12 February, 2024; originally announced February 2024.

Comments: 19 pages

arXiv:2107.12826 [pdf, other]

Adversarial Stacked Auto-Encoders for Fair Representation Learning

Authors: Patrik Joslin Kenfack, Adil Mehmood Khan, Rasheed Hussain, S. M. Ahsan Kazmi

Abstract: Training machine learning models with accuracy as the only final goal may promote prejudices and discriminatory behaviors embedded in the data. One solution is to learn latent representations that fulfill specific fairness metrics. Different types of learning methods are employed to map data into the fair representational space. The main purpose is to learn a latent representation of data that scores well on a fairness metric while maintaining the usability for the downstream task. In this paper, we propose a new fair representation learning approach that leverages different levels of representation of data to tighten the fairness bounds of the learned representation. Our results show that stacking different auto-encoders and enforcing fairness at different latent spaces result in an improvement of fairness compared to other existing approaches.

Submitted 27 July, 2021; originally announced July 2021.

Comments: ICML2021 ML4data Workshop Paper

arXiv:2104.01577 [pdf, other]

Class-incremental Learning using a Sequence of Partial Implicitly Regularized Classifiers

Authors: Sobirdzhon Bobiev, Adil Khan, Syed Muhammad Ahsan Raza Kazmi

Abstract: In class-incremental learning, the objective is to learn a number of classes sequentially without having access to the whole training data. However, due to a problem known as catastrophic forgetting, neural networks suffer a substantial performance drop in such settings. The problem is often approached by experience replay, a method which stores a limited number of samples to be replayed in future steps to reduce forgetting of the learned classes. When using a pretrained network as a feature extractor, we show that instead of training a single classifier incrementally, it is better to train a number of specialized classifiers which do not interfere with each other yet can cooperatively predict a single class. Our experiments on the CIFAR100 dataset show that the proposed method improves the performance over SOTA by a large margin.

Submitted 30 May, 2021; v1 submitted 4 April, 2021; originally announced April 2021.

arXiv:2103.00950 [pdf, other]

On the Fairness of Generative Adversarial Networks (GANs)

Authors: Patrik Joslin Kenfack, Daniil Dmitrievich Arapov, Rasheed Hussain, S. M. Ahsan Kazmi, Adil Mehmood Khan

Abstract: Generative adversarial networks (GANs) are one of the greatest advances in AI in recent years. With their ability to directly learn the probability distribution of data and then sample realistic synthetic data, many applications have emerged, using GANs to solve classical problems in machine learning, such as data augmentation, class unbalance problems, and fair representation learning. In this paper, we analyze and highlight fairness concerns of GAN models. In this regard, we show empirically that GAN models may inherently prefer certain groups during the training process and therefore they are not able to homogeneously generate data from different groups during the testing phase. Furthermore, we propose solutions to solve this issue by conditioning the GAN model towards samples' group or using an ensemble method (boosting) to allow the GAN model to leverage the distributed structure of data during the training phase and generate groups at an equal rate during the testing phase.

Submitted 21 May, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

Comments: Corrected typos, added results on CelebA dataset

arXiv:2011.14137 [pdf, other]

doi 10.1109/ACCESS.2021.3093481

Short-Term Load Forecasting using Bi-directional Sequential Models and Feature Engineering for Small Datasets

Authors: Abdul Wahab, Muhammad Anas Tahir, Naveed Iqbal, Faisal Shafait, Syed Muhammad Raza Kazmi

Abstract: Electricity load forecasting enables the grid operators to optimally implement the smart grid's most essential features such as demand response and energy efficiency. Electricity demand profiles can vary drastically from one region to another on diurnal, seasonal and yearly scale. Hence, devising a load forecasting technique that can yield the best estimates on diverse datasets, especially when the training data is limited, is a big challenge. This paper presents a deep learning architecture for short-term load forecasting based on bidirectional sequential models in conjunction with feature engineering that extracts the hand-crafted derived features in order to aid the model for better learning and predictions. In the proposed architecture, named Deep Derived Feature Fusion (DeepDeFF), the raw input and hand-crafted features are trained at separate levels and then their respective outputs are combined to make the final prediction. The efficacy of the proposed methodology is evaluated on datasets from five countries with completely different patterns. The results demonstrate that the proposed technique is superior to the existing state of the art.

Submitted 28 November, 2020; originally announced November 2020.

Comments: 8 pages, 13 figures, 5 tables. Submitted to IEEE Transactions on Power Systems, 2020

arXiv:2006.00815 [pdf, other]

Ruin Theory for Energy-Efficient Resource Allocation in UAV-assisted Cellular Networks

Authors: Aunas Manzoor, Kitae Kim, Shashi Raj Pandey, S. M. Ahsan Kazmi, Nguyen H. Tran, Walid Saad, Choong Seon Hong

Abstract: Unmanned aerial vehicles (UAVs) can provide an effective solution for improving the coverage, capacity, and the overall performance of terrestrial wireless cellular networks. In particular, UAV-assisted cellular networks can meet the stringent performance requirements of the fifth generation new radio (5G NR) applications. In this paper, the problem of energy-efficient resource allocation in UAV-assisted cellular networks is studied under the reliability and latency constraints of 5G NR applications. The framework of ruin theory is employed to allow solar-powered UAVs to capture the dynamics of harvested and consumed energies. First, the surplus power of every UAV is modeled, and then it is used to compute the probability of ruin of the UAVs. The probability of ruin denotes the vulnerability of draining out the power of a UAV. Next, the probability of ruin is used for efficient user association with each UAV. Then, power allocation for 5G NR applications is performed to maximize the achievable network rate using the water-filling approach. Simulation results demonstrate that the proposed ruin-based scheme can enhance the flight duration by up to 61% and the number of served users in a UAV flight by up to 58%, compared to a baseline SINR-based scheme.

Submitted 1 June, 2020; originally announced June 2020.

arXiv:2003.11176 [pdf, other]

Contract-based Scheduling of URLLC Packets in Incumbent EMBB Traffic

Authors: Aunas Manzoor, S. M. Ahsan Kazmi, Shashi Raj Pandey, Choong Seon Hong

Abstract: Recently, the coexistence of ultra-reliable and low-latency communication (URLLC) and enhanced mobile broadband (eMBB) services on the same licensed spectrum has gained a lot of attention from both academia and industry. However, the coexistence of these services is not trivial due to the diverse multiple access protocols, contrasting frame distributions in the existing network, and the distinct quality of service requirements posed by these services. Therefore, such coexistence drives towards a challenging resource scheduling problem. To address this problem, in this paper, we first investigate the possibilities of scheduling URLLC packets in incumbent eMBB traffic. In this regard, we formulate an optimization problem for coexistence by dynamically adopting a superposition or puncturing scheme. In particular, the aim is to provide spectrum access to the URLLC users while reducing the intervention on incumbent eMBB users. Next, we apply the one-to-one matching game to find stable URLLC-eMBB pairs that can coexist on the same spectrum. Then, we apply the contract theory framework to design contracts for URLLC users to adopt the superposition scheme. Simulation results reveal that the proposed contract-based scheduling scheme achieves up to 63% of the eMBB rate for the "No URLLC" case compared to the "Puncturing" scheme.

Submitted 31 March, 2020; v1 submitted 24 March, 2020; originally announced March 2020.

Comments: Submitted to IEEE Access

arXiv:2003.06327 [pdf]

doi 10.1109/ICET48972.2019.8994412

Human Activity Recognition using Multi-Head CNN followed by LSTM

Authors: Waqar Ahmad, Misbah Kazmi, Hazrat Ali

Abstract: This study presents a novel method to recognize human physical activities using CNN followed by LSTM. Achieving high accuracy with traditional machine learning algorithms (such as SVM, KNN, and random forest) is a challenging task because the data acquired from wearable sensors like accelerometers and gyroscopes is time-series data. So, to achieve high accuracy, we propose a multi-head CNN model comprising three CNNs to extract features from the data acquired from different sensors; all three CNNs are then merged and followed by an LSTM layer and a dense layer. The configuration of all three CNNs is kept the same so that the same number of features is obtained for every input to the CNN. By using the proposed method, we achieve state-of-the-art accuracy, which is comparable to traditional machine learning algorithms and other deep neural network algorithms.

Submitted 21 February, 2020; originally announced March 2020.

Comments: IEEE ICET 2019

arXiv:1909.08747 [pdf, other]

Edge-Computing-Enabled Smart Cities: A Comprehensive Survey

Authors: Latif U. Khan, Ibrar Yaqoob, Nguyen H. Tran, S. M. Ahsan Kazmi, Tri Nguyen Dang, Choong Seon Hong

Abstract: Recent years have disclosed a remarkable proliferation of compute-intensive applications in smart cities. Such applications continuously generate enormous amounts of data which demand strict latency-aware computational processing capabilities. Although edge computing is an appealing technology to compensate for stringent latency related issues, its deployment engenders new challenges. In this survey, we highlight the role of edge computing in realizing the vision of smart cities. First, we analyze the evolution of edge computing paradigms. Subsequently, we critically review the state-of-the-art literature focusing on edge computing applications in smart cities. Later, we categorize and classify the literature by devising a comprehensive and meticulous taxonomy. Furthermore, we identify and discuss key requirements, and enumerate recently reported synergies of edge computing enabled smart cities. Finally, several indispensable open challenges along with their causes and guidelines are discussed, serving as future research directions.

Submitted 12 October, 2020; v1 submitted 11 September, 2019; originally announced September 2019.

arXiv:1812.04177 [pdf, other]

Ruin Theory for Dynamic Spectrum Allocation in LTE-U Networks

Authors: Aunas Manzoor, Nguyen H. Tran, Walid Saad, S. M. Ahsan Kazmi, Shashi Raj Pandey, Choong Seon Hong

Abstract: LTE in the unlicensed band (LTE-U) is a promising solution to overcome the scarcity of the wireless spectrum. However, to reap the benefits of LTE-U, it is essential to maintain its effective coexistence with WiFi systems. Such a coexistence, hence, constitutes a major challenge for LTE-U deployment. In this paper, the problem of unlicensed spectrum sharing among WiFi and LTE-U system is studied. In particular, a fair time sharing model based on ruin theory is proposed to share redundant spectral resources from the unlicensed band with LTE-U without jeopardizing the performance of the WiFi system. Fairness among both WiFi and LTE-U is maintained by applying the concept of the probability of ruin. In particular, the probability of ruin is used to perform efficient duty-cycle allocation in LTE-U, so as to provide fairness to the WiFi system and maintain certain WiFi performance. Simulation results show that the proposed ruin-based algorithm provides better fairness to the WiFi system as compared to equal duty-cycle sharing among WiFi and LTE-U.

Submitted 10 December, 2018; originally announced December 2018.

Comments: Accepted in IEEE Communications Letters (09-Dec 2018)

arXiv:1706.05171 [pdf, ps, other]

doi 10.1016/j.eswa.2017.06.013

Improving Scalability of Inductive Logic Programming via Pruning and Best-Effort Optimisation

Authors: Mishal Kazmi, Peter Schüller, Yücel Saygın

Abstract: Inductive Logic Programming (ILP) combines rule-based and statistical artificial intelligence methods, by learning a hypothesis comprising a set of rules given background knowledge and constraints for the search space. We focus on extending the XHAIL algorithm for ILP which is based on Answer Set Programming and we evaluate our extensions using the Natural Language Processing application of sentence chunking. With respect to processing natural language, ILP can cater for the constant change in how we use language on a daily basis. At the same time, ILP does not require huge amounts of training examples, as other statistical methods do, and produces interpretable results, that is, a set of rules that can be analysed and tweaked if necessary. As contributions we extend XHAIL with (i) a pruning mechanism within the hypothesis generalisation algorithm which enables learning from larger datasets, (ii) a better usage of modern solver technology using recently developed optimisation methods, and (iii) a time budget that permits the usage of suboptimal results. We evaluate these improvements on the task of sentence chunking using three datasets from a recent SemEval competition. Results show that our improvements allow for learning on bigger datasets with results that are of similar quality to state-of-the-art systems on the same task. Moreover, we compare the hypotheses obtained on datasets to gain insights on the structure of each dataset.

Submitted 16 June, 2017; originally announced June 2017.

Comments: 24 pages, preprint of article accepted at Expert Systems With Applications

Journal ref: Expert Systems With Applications 87, pages 291-303, 2017

Showing 1–11 of 11 results for author: Kazmi, M