Search | arXiv e-print repository

Adversarial Attacks and Dimensionality in Text Classifiers

Authors: Nandish Chattopadhyay, Atreya Goswami, Anupam Chattopadhyay

Abstract: Adversarial attacks on machine learning algorithms have been a key deterrent to the adoption of AI in many real-world use cases. They significantly undermine the ability of high-performance neural networks by forcing misclassifications. These attacks introduce minute and structured perturbations or alterations in the test samples, imperceptible to human annotators in general, but trained neural ne… ▽ More Adversarial attacks on machine learning algorithms have been a key deterrent to the adoption of AI in many real-world use cases. They significantly undermine the ability of high-performance neural networks by forcing misclassifications. These attacks introduce minute and structured perturbations or alterations in the test samples, imperceptible to human annotators in general, but trained neural networks and other models are sensitive to it. Historically, adversarial attacks have been first identified and studied in the domain of image processing. In this paper, we study adversarial examples in the field of natural language processing, specifically text classification tasks. We investigate the reasons for adversarial vulnerability, particularly in relation to the inherent dimensionality of the model. Our key finding is that there is a very strong correlation between the embedding dimensionality of the adversarial samples and their effectiveness on models tuned with input samples with same embedding dimension. We utilize this sensitivity to design an adversarial defense mechanism. We use ensemble models of varying inherent dimensionality to thwart the attacks. This is tested on multiple datasets for its efficacy in providing robustness. We also study the problem of measuring adversarial perturbation using different distance metrics. For all of the aforementioned studies, we have run tests on multiple models with varying dimensionality and used a word-vector level adversarial attack to substantiate the findings. △ Less

Submitted 3 April, 2024; originally announced April 2024.

Comments: This paper is accepted for publication at EURASIP Journal on Information Security in 2024

arXiv:2402.06249 [pdf, other]

Anomaly Unveiled: Securing Image Classification against Adversarial Patch Attacks

Authors: Nandish Chattopadhyay, Amira Guesmi, Muhammad Shafique

Abstract: Adversarial patch attacks pose a significant threat to the practical deployment of deep learning systems. However, existing research primarily focuses on image pre-processing defenses, which often result in reduced classification accuracy for clean images and fail to effectively counter physically feasible attacks. In this paper, we investigate the behavior of adversarial patches as anomalies with… ▽ More Adversarial patch attacks pose a significant threat to the practical deployment of deep learning systems. However, existing research primarily focuses on image pre-processing defenses, which often result in reduced classification accuracy for clean images and fail to effectively counter physically feasible attacks. In this paper, we investigate the behavior of adversarial patches as anomalies within the distribution of image information and leverage this insight to develop a robust defense strategy. Our proposed defense mechanism utilizes a clustering-based technique called DBSCAN to isolate anomalous image segments, which is carried out by a three-stage pipeline consisting of Segmenting, Isolating, and Blocking phases to identify and mitigate adversarial noise. Upon identifying adversarial components, we neutralize them by replacing them with the mean pixel value, surpassing alternative replacement options. Our model-agnostic defense mechanism is evaluated across multiple models and datasets, demonstrating its effectiveness in countering various adversarial patch attacks in image classification tasks. Our proposed approach significantly improves accuracy, increasing from 38.8\% without the defense to 67.1\% with the defense against LaVAN and GoogleAp attacks, surpassing prominent state-of-the-art methods such as LGS (53.86\%) and Jujutsu (60\%) △ Less

Submitted 9 February, 2024; originally announced February 2024.

arXiv:2311.12211 [pdf, other]

DefensiveDR: Defending against Adversarial Patches using Dimensionality Reduction

Authors: Nandish Chattopadhyay, Amira Guesmi, Muhammad Abdullah Hanif, Bassem Ouni, Muhammad Shafique

Abstract: Adversarial patch-based attacks have shown to be a major deterrent towards the reliable use of machine learning models. These attacks involve the strategic modification of localized patches or specific image areas to deceive trained machine learning models. In this paper, we propose \textit{DefensiveDR}, a practical mechanism using a dimensionality reduction technique to thwart such patch-based at… ▽ More Adversarial patch-based attacks have shown to be a major deterrent towards the reliable use of machine learning models. These attacks involve the strategic modification of localized patches or specific image areas to deceive trained machine learning models. In this paper, we propose \textit{DefensiveDR}, a practical mechanism using a dimensionality reduction technique to thwart such patch-based attacks. Our method involves projecting the sample images onto a lower-dimensional space while retaining essential information or variability for effective machine learning tasks. We perform this using two techniques, Singular Value Decomposition and t-Distributed Stochastic Neighbor Embedding. We experimentally tune the variability to be preserved for optimal performance as a hyper-parameter. This dimension reduction substantially mitigates adversarial perturbations, thereby enhancing the robustness of the given machine learning model. Our defense is model-agnostic and operates without assumptions about access to model decisions or model architectures, making it effective in both black-box and white-box settings. Furthermore, it maintains accuracy across various models and remains robust against several unseen patch-based attacks. The proposed defensive approach improves the accuracy from 38.8\% (without defense) to 66.2\% (with defense) when performing LaVAN and GoogleAp attacks, which supersedes that of the prominent state-of-the-art like LGS (53.86\%) and Jujutsu (60\%). △ Less

Submitted 20 November, 2023; originally announced November 2023.

arXiv:2311.12084 [pdf, other]

ODDR: Outlier Detection & Dimension Reduction Based Defense Against Adversarial Patches

Authors: Nandish Chattopadhyay, Amira Guesmi, Muhammad Abdullah Hanif, Bassem Ouni, Muhammad Shafique

Abstract: Adversarial attacks are a major deterrent towards the reliable use of machine learning models. A powerful type of adversarial attacks is the patch-based attack, wherein the adversarial perturbations modify localized patches or specific areas within the images to deceive the trained machine learning model. In this paper, we introduce Outlier Detection and Dimension Reduction (ODDR), a holistic defe… ▽ More Adversarial attacks are a major deterrent towards the reliable use of machine learning models. A powerful type of adversarial attacks is the patch-based attack, wherein the adversarial perturbations modify localized patches or specific areas within the images to deceive the trained machine learning model. In this paper, we introduce Outlier Detection and Dimension Reduction (ODDR), a holistic defense mechanism designed to effectively mitigate patch-based adversarial attacks. In our approach, we posit that input features corresponding to adversarial patches, whether naturalistic or otherwise, deviate from the inherent distribution of the remaining image sample and can be identified as outliers or anomalies. ODDR employs a three-stage pipeline: Fragmentation, Segregation, and Neutralization, providing a model-agnostic solution applicable to both image classification and object detection tasks. The Fragmentation stage parses the samples into chunks for the subsequent Segregation process. Here, outlier detection techniques identify and segregate the anomalous features associated with adversarial perturbations. The Neutralization stage utilizes dimension reduction methods on the outliers to mitigate the impact of adversarial perturbations without sacrificing pertinent information necessary for the machine learning task. Extensive testing on benchmark datasets and state-of-the-art adversarial patches demonstrates the effectiveness of ODDR. Results indicate robust accuracies matching and lying within a small range of clean accuracies (1%-3% for classification and 3%-5% for object detection), with only a marginal compromise of 1%-2% in performance on clean samples, thereby significantly outperforming other defenses. △ Less

Submitted 20 November, 2023; originally announced November 2023.

arXiv:2208.14206 [pdf, other]

FUSION: Fully Unsupervised Test-Time Stain Adaptation via Fused Normalization Statistics

Authors: Nilanjan Chattopadhyay, Shiv Gehlot, Nitin Singhal

Abstract: Staining reveals the micro structure of the aspirate while creating histopathology slides. Stain variation, defined as a chromatic difference between the source and the target, is caused by varying characteristics during staining, resulting in a distribution shift and poor performance on the target. The goal of stain normalization is to match the target's chromatic distribution to that of the sour… ▽ More Staining reveals the micro structure of the aspirate while creating histopathology slides. Stain variation, defined as a chromatic difference between the source and the target, is caused by varying characteristics during staining, resulting in a distribution shift and poor performance on the target. The goal of stain normalization is to match the target's chromatic distribution to that of the source. However, stain normalisation causes the underlying morphology to distort, resulting in an incorrect diagnosis. We propose FUSION, a new method for promoting stain-adaption by adjusting the model to the target in an unsupervised test-time scenario, eliminating the necessity for significant labelling at the target end. FUSION works by altering the target's batch normalization statistics and fusing them with source statistics using a weighting factor. The algorithm reduces to one of two extremes based on the weighting factor. Despite the lack of training or supervision, FUSION surpasses existing equivalent algorithms for classification and dense predictions (segmentation), as demonstrated by comprehensive experiments on two public datasets. △ Less

Submitted 30 August, 2022; originally announced August 2022.

Comments: Accepted in European Conference on Computer Vision (ECCV) 2022 Workshop: AI-enabled medical image analysis (AIMIA)

arXiv:2011.10794 [pdf, other]

Spatially Correlated Patterns in Adversarial Images

Authors: Nandish Chattopadhyay, Lionell Yip En Zhi, Bryan Tan Bing Xing, Anupam Chattopadhyay

Abstract: Adversarial attacks have proved to be the major impediment in the progress on research towards reliable machine learning solutions. Carefully crafted perturbations, imperceptible to human vision, can be added to images to force misclassification by an otherwise high performing neural network. To have a better understanding of the key contributors of such structured attacks, we searched for and stu… ▽ More Adversarial attacks have proved to be the major impediment in the progress on research towards reliable machine learning solutions. Carefully crafted perturbations, imperceptible to human vision, can be added to images to force misclassification by an otherwise high performing neural network. To have a better understanding of the key contributors of such structured attacks, we searched for and studied spatially co-located patterns in the distribution of pixels in the input space. In this paper, we propose a framework for segregating and isolating regions within an input image which are particularly critical towards either classification (during inference), or adversarial vulnerability or both. We assert that during inference, the trained model looks at a specific region in the image, which we call Region of Importance (RoI); and the attacker looks at a region to alter/modify, which we call Region of Attack (RoA). The success of this approach could also be used to design a post-hoc adversarial defence method, as illustrated by our observations. This uses the notion of blocking out (we call neutralizing) that region of the image which is highly vulnerable to adversarial attacks but is not important for the task of classification. We establish the theoretical setup for formalising the process of segregation, isolation and neutralization and substantiate it through empirical analysis on standard benchmarking datasets. The findings strongly indicate that map** features into the input space preserves the significant patterns typically observed in the feature-space while adding major interpretability and therefore simplifies potential defensive mechanisms. △ Less

Submitted 21 November, 2020; originally announced November 2020.

Comments: Submitted for review

arXiv:1401.6951 [pdf, ps, other]

doi 10.1016/j.physa.2014.05.026

Inequality in Societies, Academic Institutions and Science Journals: Gini and k-indices

Authors: Asim Ghosh, Nachiketa Chattopadhyay, Bikas K. Chakrabarti

Abstract: Social inequality is traditionally measured by the Gini-index ($g$). The $g$-index takes values from $0$ to $1$ where $g=0$ represents complete equality and $g=1$ represents complete inequality. Most of the estimates of the income or wealth data indicate the $g$ value to be widely dispersed across the countries of the world: \textit{g} values typically range from $0.30$ to $0.65$ at a particular t… ▽ More Social inequality is traditionally measured by the Gini-index ($g$). The $g$-index takes values from $0$ to $1$ where $g=0$ represents complete equality and $g=1$ represents complete inequality. Most of the estimates of the income or wealth data indicate the $g$ value to be widely dispersed across the countries of the world: \textit{g} values typically range from $0.30$ to $0.65$ at a particular time (year). We estimated similarly the Gini-index for the citations earned by the yearly publications of various academic institutions and the science journals. The ISI web of science data suggests remarkably strong inequality and universality ($g=0.70\pm0.07$) across all the universities and institutions of the world, while for the journals we find $g=0.65\pm0.15$ for any typical year. We define a new inequality measure, namely the $k$-index, saying that the cumulative income or citations of ($1-k$) fraction of people or papers exceed those earned by the fraction ($k$) of the people or publications respectively. We find, while the $k$-index value for income ranges from $0.60$ to $0.75$ for income distributions across the world, it has a value around $0.75\pm0.05$ for different universities and institutions across the world and around $0.77\pm0.10$ for the science journals. Apart from above indices, we also analyze the same institution and journal citation data by measuring Pietra index and median index. △ Less

Submitted 26 May, 2014; v1 submitted 27 January, 2014; originally announced January 2014.

Comments: 5 pages, 3 figures

Journal ref: Physica A 410 (2014) 30-34

Showing 1–7 of 7 results for author: Chattopadhyay, N