Search | arXiv e-print repository

TransCAB: Transferable Clean-Annotation Backdoor to Object Detection with Natural Trigger in Real-World

Authors: Hua Ma, Yinshan Li, Yansong Gao, Zhi Zhang, Alsharif Abuadbba, Anmin Fu, Said F. Al-Sarawi, Nepal Surya, Derek Abbott

Abstract: Object detection is the foundation of various critical computer-vision tasks such as segmentation, object tracking, and event detection. To train an object detector with satisfactory accuracy, a large amount of data is required. However, due to the intensive workforce involved with annotating large datasets, such a data curation task is often outsourced to a third party or relies on volunteers. This work reveals severe vulnerabilities of such a data curation pipeline. We propose MACAB that crafts clean-annotated images to stealthily implant the backdoor into the object detectors trained on them even when the data curator can manually audit the images. We observe that the backdoor effects of both misclassification and cloaking are robustly achieved in the wild when the backdoor is activated with inconspicuously natural physical triggers. Backdooring non-classification object detection with clean-annotation is challenging compared to backdooring existing image classification tasks with clean-label, owing to the complexity of having multiple objects within each frame, including victim and non-victim objects. The efficacy of MACAB is ensured by constructively (i) abusing the image-scaling function used by the deep learning framework, (ii) incorporating the proposed adversarial clean image replica technique, and (iii) combining poison data selection criteria given a constrained attacking budget. Extensive experiments demonstrate that MACAB exhibits more than 90% attack success rate under various real-world scenes. This includes both cloaking and misclassification backdoor effects, even when restricted to a small attack budget. The poisoned samples cannot be effectively identified by state-of-the-art detection techniques. The comprehensive video demo is at https://youtu.be/MA7L_LpXkp4, which is based on a poison rate of 0.14% for YOLOv4 cloaking backdoor and Faster R-CNN misclassification backdoor.

Submitted 2 September, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

arXiv:2206.04531 [pdf, other]

ECLAD: Extracting Concepts with Local Aggregated Descriptors

Authors: Andres Felipe Posada-Moreno, Nikita Surya, Sebastian Trimpe

Abstract: Convolutional neural networks (CNNs) are increasingly being used in critical systems, where robustness and alignment are crucial. In this context, the field of explainable artificial intelligence has proposed the generation of high-level explanations of the prediction process of CNNs through concept extraction. While these methods can detect whether or not a concept is present in an image, they are unable to determine its location. What is more, a fair comparison of such approaches is difficult due to a lack of proper validation procedures. To address these issues, we propose a novel method for automatic concept extraction and localization based on representations obtained through pixel-wise aggregations of CNN activation maps. Further, we introduce a process for the validation of concept-extraction techniques based on synthetic datasets with pixel-wise annotations of their main components, reducing the need for human intervention. Extensive experimentation on both synthetic and real-world datasets demonstrates that our method outperforms state-of-the-art alternatives.

Submitted 11 August, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

Comments: 34 pages, under review

MSC Class: 68T01 ACM Class: I.2.10; I.2.m

arXiv:2201.08619 [pdf, other]

Dangerous Cloaking: Natural Trigger based Backdoor Attacks on Object Detectors in the Physical World

Authors: Hua Ma, Yinshan Li, Yansong Gao, Alsharif Abuadbba, Zhi Zhang, Anmin Fu, Hyoungshick Kim, Said F. Al-Sarawi, Nepal Surya, Derek Abbott

Abstract: Deep learning models have been shown to be vulnerable to recent backdoor attacks. A backdoored model behaves normally for inputs containing no attacker-secretly-chosen trigger and maliciously for inputs with the trigger. To date, backdoor attacks and countermeasures mainly focus on image classification tasks, and most of them are implemented in the digital world with digital triggers. Besides the classification tasks, object detection systems are also considered as one of the basic foundations of computer vision tasks. However, there is no investigation and understanding of the backdoor vulnerability of the object detector, even in the digital world with digital triggers. For the first time, this work demonstrates that existing object detectors are inherently susceptible to physical backdoor attacks. We use a natural T-shirt bought from a market as a trigger to enable the cloaking effect: the person bounding-box disappears in front of the object detector. We show that such a backdoor can be implanted from two exploitable attack scenarios into the object detector, which is outsourced or fine-tuned through a pretrained model. We have extensively evaluated three popular object detection algorithms: anchor-based Yolo-V3, Yolo-V4, and anchor-free CenterNet. Building upon 19 videos shot in real-world scenes, we confirm that the backdoor attack is robust against various factors: movement, distance, angle, non-rigid deformation, and lighting. Specifically, the attack success rate (ASR) in most videos is 100% or close to it, while the clean data accuracy of the backdoored model is the same as its clean counterpart. The latter implies that it is infeasible to detect the backdoor behavior merely through a validation set. The averaged ASR still remains sufficiently high at 78% in the transfer learning attack scenarios evaluated on CenterNet. See the demo video at https://youtu.be/Q3HOF4OobbY.

Submitted 29 May, 2022; v1 submitted 21 January, 2022; originally announced January 2022.

arXiv:2104.00442 [pdf, other]

Touch-based Curiosity for Sparse-Reward Tasks

Authors: Sai Rajeswar, Cyril Ibrahim, Nitin Surya, Florian Golemo, David Vazquez, Aaron Courville, Pedro O. Pinheiro

Abstract: Robots in many real-world settings have access to force/torque sensors in their gripper and tactile sensing is often necessary in tasks that involve contact-rich motion. In this work, we leverage surprise from mismatches in touch feedback to guide exploration in hard sparse-reward reinforcement learning tasks. Our approach, Touch-based Curiosity (ToC), learns what visible object interactions are supposed to "feel" like. We encourage exploration by rewarding interactions where the expectation and the experience don't match. In our proposed method, an initial task-independent exploration phase is followed by an on-task learning phase, in which the original interactions are relabeled with on-task rewards. We test our approach on a range of touch-intensive robot arm tasks (e.g. pushing objects, opening doors), which we also release as part of this work. Across multiple experiments in a simulated setting, we demonstrate that our method is able to learn these difficult tasks through sparse reward and curiosity alone. We compare our cross-modal approach to single-modality (touch- or vision-only) approaches as well as other curiosity-based methods and find that our method performs better and is more sample-efficient.

Submitted 26 June, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

arXiv:2102.10269 [pdf, other]

SoftTRR: Protect Page Tables Against RowHammer Attacks using Software-only Target Row Refresh

Authors: Zhi Zhang, Yueqiang Cheng, Minghua Wang, Wei He, Wenhao Wang, Nepal Surya, Yansong Gao, Kang Li, Zhe Wang, Chenggang Wu

Abstract: Rowhammer attacks that corrupt level-1 page tables to gain kernel privilege are the most detrimental to system security and hard to mitigate. However, recently proposed software-only mitigations are not effective against such kernel privilege escalation attacks. In this paper, we propose an effective and practical software-only defense, called SoftTRR, to protect page tables from all existing rowhammer attacks on x86. The key idea of SoftTRR is to refresh the rows occupied by page tables when a suspicious rowhammer activity is detected. SoftTRR is motivated by DRAM-chip-based target row refresh (ChipTRR) but eliminates its main security limitation (i.e., ChipTRR tracks a limited number of rows and thus can be bypassed by many-sided hammer). Specifically, SoftTRR protects an unlimited number of page tables by tracking memory accesses to the rows that are in close proximity to page-table rows and refreshing the page-table rows once the tracked access count exceeds a pre-defined threshold. We implement a prototype of SoftTRR as a loadable kernel module, and evaluate its security effectiveness, performance overhead, and memory consumption. The experimental results show that SoftTRR protects page tables from real-world rowhammer attacks and incurs small performance overhead as well as memory cost.

Submitted 12 December, 2021; v1 submitted 20 February, 2021; originally announced February 2021.

arXiv:1801.06619 [pdf, ps, other]

Machine Learning Methods for User Positioning With Uplink RSS in Distributed Massive MIMO

Authors: K. N. R. Surya Vara Prasad, Ekram Hossain, Vijay K. Bhargava

Abstract: We consider a machine learning approach based on Gaussian process regression (GP) to position users in a distributed massive multiple-input multiple-output (MIMO) system with the uplink received signal strength (RSS) data. We focus on the scenario where noise-free RSS is available for training, but only noisy RSS is available for testing purposes. To estimate the test user locations and their 2σ error-bars, we adopt two state-of-the-art GP methods, namely, the conventional GP (CGP) and the numerical approximation GP (NaGP) methods. We find that the CGP method, which treats the noisy test RSS vectors as noise-free, provides unrealistically small 2σ error-bars on the estimated locations. To alleviate this concern, we derive the true predictive distribution for the test user locations and then employ the NaGP method to numerically approximate it as a Gaussian with the same first and second order moments. We also derive a Bayesian Cramer-Rao lower bound (BCRLB) on the achievable root-mean-squared-error (RMSE) performance of the two GP methods. Simulation studies reveal that: (i) the NaGP method indeed provides realistic 2σ error-bars on the estimated locations, (ii) operation in massive MIMO regime improves the RMSE performance, and (iii) the achieved RMSE performances are very close to the derived BCRLB.

Submitted 19 January, 2018; originally announced January 2018.

Comments: submitted to IEEE Trans. Wireless Commun., Jan 2018

arXiv:1708.02279 [pdf, ps, other]

Low-Dimensionality of Noise-Free RSS and its Application in Distributed Massive MIMO

Authors: K. N. R. Surya Vara Prasad, Ekram Hossain, Vijay K. Bhargava

Abstract: We examine the dimensionality of noise-free uplink received signal strength (RSS) data in a distributed multiuser massive multiple-input multiple-output system. Specifically, we apply principal component analysis to the noise-free uplink RSS and observe that it has a low-dimensional principal subspace. We make use of this unique property to propose RecGP, a reconstruction-based Gaussian process regression (GP) method that predicts user locations from uplink RSS data. Considering noise-free RSS for training and noisy test RSS for location prediction, RecGP reconstructs the noisy test RSS from a low-dimensional principal subspace of the noise-free training RSS. The reconstructed RSS is input to a trained GP model for location prediction. Noise reduction facilitated by the reconstruction step allows RecGP to achieve lower prediction error than standard GP methods, which directly use the test RSS for location prediction.

Submitted 7 August, 2017; originally announced August 2017.

Comments: submitted to IEEE Wireless Communication Letters, July 2017

Showing 1–7 of 7 results for author: Surya, N