-
PASA: Attack Agnostic Unsupervised Adversarial Detection using Prediction & Attribution Sensitivity Analysis
Authors:
Dipkamal Bhusal,
Md Tanvirul Alam,
Monish K. Veerabhadran,
Michael Clifford,
Sara Rampazzi,
Nidhi Rastogi
Abstract:
Deep neural networks for classification are vulnerable to adversarial attacks, where small perturbations to input samples lead to incorrect predictions. This susceptibility, combined with the black-box nature of such networks, limits their adoption in critical applications like autonomous driving. Feature-attribution-based explanation methods provide relevance of input features for model predictio…
▽ More
Deep neural networks for classification are vulnerable to adversarial attacks, where small perturbations to input samples lead to incorrect predictions. This susceptibility, combined with the black-box nature of such networks, limits their adoption in critical applications like autonomous driving. Feature-attribution-based explanation methods provide relevance of input features for model predictions on input samples, thus explaining model decisions. However, we observe that both model predictions and feature attributions for input samples are sensitive to noise. We develop a practical method for this characteristic of model prediction and feature attribution to detect adversarial samples. Our method, PASA, requires the computation of two test statistics using model prediction and feature attribution and can reliably detect adversarial samples using thresholds learned from benign samples. We validate our lightweight approach by evaluating the performance of PASA on varying strengths of FGSM, PGD, BIM, and CW attacks on multiple image and non-image datasets. On average, we outperform state-of-the-art statistical unsupervised adversarial detectors on CIFAR-10 and ImageNet by 14\% and 35\% ROC-AUC scores, respectively. Moreover, our approach demonstrates competitive performance even when an adversary is aware of the defense mechanism.
△ Less
Submitted 12 April, 2024;
originally announced April 2024.
-
An Interactive Human-Machine Learning Interface for Collecting and Learning from Complex Annotations
Authors:
Jonathan Erskine,
Matt Clifford,
Alexander Hepburn,
Raúl Santos-Rodríguez
Abstract:
Human-Computer Interaction has been shown to lead to improvements in machine learning systems by boosting model performance, accelerating learning and building user confidence. In this work, we aim to alleviate the expectation that human annotators adapt to the constraints imposed by traditional labels by allowing for extra flexibility in the form that supervision information is collected. For thi…
▽ More
Human-Computer Interaction has been shown to lead to improvements in machine learning systems by boosting model performance, accelerating learning and building user confidence. In this work, we aim to alleviate the expectation that human annotators adapt to the constraints imposed by traditional labels by allowing for extra flexibility in the form that supervision information is collected. For this, we propose a human-machine learning interface for binary classification tasks which enables human annotators to utilise counterfactual examples to complement standard binary labels as annotations for a dataset. Finally we discuss the challenges in future extensions of this work.
△ Less
Submitted 28 March, 2024;
originally announced March 2024.
-
Invisible Reflections: Leveraging Infrared Laser Reflections to Target Traffic Sign Perception
Authors:
Takami Sato,
Sri Hrushikesh Varma Bhupathiraju,
Michael Clifford,
Takeshi Sugawara,
Qi Alfred Chen,
Sara Rampazzi
Abstract:
All vehicles must follow the rules that govern traffic behavior, regardless of whether the vehicles are human-driven or Connected Autonomous Vehicles (CAVs). Road signs indicate locally active rules, such as speed limits and requirements to yield or stop. Recent research has demonstrated attacks, such as adding stickers or projected colored patches to signs, that cause CAV misinterpretation, resul…
▽ More
All vehicles must follow the rules that govern traffic behavior, regardless of whether the vehicles are human-driven or Connected Autonomous Vehicles (CAVs). Road signs indicate locally active rules, such as speed limits and requirements to yield or stop. Recent research has demonstrated attacks, such as adding stickers or projected colored patches to signs, that cause CAV misinterpretation, resulting in potential safety issues. Humans can see and potentially defend against these attacks. But humans can not detect what they can not observe. We have developed an effective physical-world attack that leverages the sensitivity of filterless image sensors and the properties of Infrared Laser Reflections (ILRs), which are invisible to humans. The attack is designed to affect CAV cameras and perception, undermining traffic sign recognition by inducing misclassification. In this work, we formulate the threat model and requirements for an ILR-based traffic sign perception attack to succeed. We evaluate the effectiveness of the ILR attack with real-world experiments against two major traffic sign recognition architectures on four IR-sensitive cameras. Our black-box optimization methodology allows the attack to achieve up to a 100% attack success rate in indoor, static scenarios and a >80.5% attack success rate in our outdoor, moving vehicle scenarios. We find the latest state-of-the-art certifiable defense is ineffective against ILR attacks as it mis-certifies >33.5% of cases. To address this, we propose a detection strategy based on the physical properties of IR laser reflections which can detect 96% of ILR attacks.
△ Less
Submitted 7 January, 2024;
originally announced January 2024.
-
SoK: Modeling Explainability in Security Analytics for Interpretability, Trustworthiness, and Usability
Authors:
Dipkamal Bhusal,
Rosalyn Shin,
Ajay Ashok Shewale,
Monish Kumar Manikya Veerabhadran,
Michael Clifford,
Sara Rampazzi,
Nidhi Rastogi
Abstract:
Interpretability, trustworthiness, and usability are key considerations in high-stake security applications, especially when utilizing deep learning models. While these models are known for their high accuracy, they behave as black boxes in which identifying important features and factors that led to a classification or a prediction is difficult. This can lead to uncertainty and distrust, especial…
▽ More
Interpretability, trustworthiness, and usability are key considerations in high-stake security applications, especially when utilizing deep learning models. While these models are known for their high accuracy, they behave as black boxes in which identifying important features and factors that led to a classification or a prediction is difficult. This can lead to uncertainty and distrust, especially when an incorrect prediction results in severe consequences. Thus, explanation methods aim to provide insights into the inner working of deep learning models. However, most explanation methods provide inconsistent explanations, have low fidelity, and are susceptible to adversarial manipulation, which can reduce model trustworthiness. This paper provides a comprehensive analysis of explainable methods and demonstrates their efficacy in three distinct security applications: anomaly detection using system logs, malware prediction, and detection of adversarial images. Our quantitative and qualitative analysis reveals serious limitations and concerns in state-of-the-art explanation methods in all three applications. We show that explanation methods for security applications necessitate distinct characteristics, such as stability, fidelity, robustness, and usability, among others, which we outline as the prerequisites for trustworthy explanation methods.
△ Less
Submitted 12 June, 2023; v1 submitted 31 October, 2022;
originally announced October 2022.
-
FAT Forensics: A Python Toolbox for Implementing and Deploying Fairness, Accountability and Transparency Algorithms in Predictive Systems
Authors:
Kacper Sokol,
Alexander Hepburn,
Rafael Poyiadzi,
Matthew Clifford,
Raul Santos-Rodriguez,
Peter Flach
Abstract:
Predictive systems, in particular machine learning algorithms, can take important, and sometimes legally binding, decisions about our everyday life. In most cases, however, these systems and decisions are neither regulated nor certified. Given the potential harm that these algorithms can cause, their qualities such as fairness, accountability and transparency (FAT) are of paramount importance. To…
▽ More
Predictive systems, in particular machine learning algorithms, can take important, and sometimes legally binding, decisions about our everyday life. In most cases, however, these systems and decisions are neither regulated nor certified. Given the potential harm that these algorithms can cause, their qualities such as fairness, accountability and transparency (FAT) are of paramount importance. To ensure high-quality, fair, transparent and reliable predictive systems, we developed an open source Python package called FAT Forensics. It can inspect important fairness, accountability and transparency aspects of predictive algorithms to automatically and objectively report them back to engineers and users of such systems. Our toolbox can evaluate all elements of a predictive pipeline: data (and their features), models and predictions. Published under the BSD 3-Clause open source licence, FAT Forensics is opened up for personal and commercial usage.
△ Less
Submitted 8 September, 2022;
originally announced September 2022.
-
Explaining RADAR features for detecting spoofing attacks in Connected Autonomous Vehicles
Authors:
Nidhi Rastogi,
Sara Rampazzi,
Michael Clifford,
Miriam Heller,
Matthew Bishop,
Karl Levitt
Abstract:
Connected autonomous vehicles (CAVs) are anticipated to have built-in AI systems for defending against cyberattacks. Machine learning (ML) models form the basis of many such AI systems. These models are notorious for acting like black boxes, transforming inputs into solutions with great accuracy, but no explanations support their decisions. Explanations are needed to communicate model performance,…
▽ More
Connected autonomous vehicles (CAVs) are anticipated to have built-in AI systems for defending against cyberattacks. Machine learning (ML) models form the basis of many such AI systems. These models are notorious for acting like black boxes, transforming inputs into solutions with great accuracy, but no explanations support their decisions. Explanations are needed to communicate model performance, make decisions transparent, and establish trust in the models with stakeholders. Explanations can also indicate when humans must take control, for instance, when the ML model makes low confidence decisions or offers multiple or ambiguous alternatives. Explanations also provide evidence for post-incident forensic analysis. Research on explainable ML to security problems is limited, and more so concerning CAVs. This paper surfaces a critical yet under-researched sensor data \textit{uncertainty} problem for training ML attack detection models, especially in highly mobile and risk-averse platforms such as autonomous vehicles. We present a model that explains \textit{certainty} and \textit{uncertainty} in sensor input -- a missing characteristic in data collection. We hypothesize that model explanation is inaccurate for a given system without explainable input data quality. We estimate \textit{uncertainty} and mass functions for features in radar sensor data and incorporate them into the training model through experimental evaluation. The mass function allows the classifier to categorize all spoofed inputs accurately with an incorrect class label.
△ Less
Submitted 28 February, 2022;
originally announced March 2022.