One Thing to Fool them All: Generating Interpretable, Universal, and Physically-Realizable Adversarial Features

Casper, Stephen; Nadeau, Max; Kreiman, Gabriel

arXiv:2110.03605v1 (cs)

[Submitted on 7 Oct 2021 (this version), latest version 11 Sep 2023 (v7)]

Abstract: It is well understood that modern deep networks are vulnerable to adversarial attacks. However, conventional methods fail to produce adversarial perturbations that are intelligible to humans, and they pose limited threats in the physical world. To study feature-class associations in networks and better understand the real-world threats they face, we develop feature-level adversarial perturbations using deep image generators and a novel optimization objective. We term these feature-fool attacks. We show that they are versatile and use them to generate targeted feature-level attacks at the ImageNet scale that are simultaneously interpretable, universal to any source image, and physically-realizable. These attacks can also reveal spurious, semantically-describable feature/class associations, and we use them to guide the design of "copy/paste" adversaries in which one natural image is pasted into another to cause a targeted misclassification.
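
The "copy/paste" idea in the abstract can be illustrated with a minimal sketch, assuming images are NumPy arrays of shape (height, width, channels). The names `paste_patch` and `is_targeted_success`, the array layout, and the `classify` callable are hypothetical and are not taken from the paper or its code; the sketch shows only the pasting step and the success check, not how the authors choose which natural image to paste.

```python
import numpy as np


def paste_patch(source, patch, top, left):
    """Return a copy of `source` with `patch` pasted at row `top`, column `left`.

    Both arrays are assumed to be (height, width, channels); this layout is
    an assumption made for illustration only.
    """
    h, w = patch.shape[:2]
    H, W = source.shape[:2]
    if top < 0 or left < 0 or top + h > H or left + w > W:
        raise ValueError("patch does not fit inside source at this position")
    out = source.copy()  # leave the original source image untouched
    out[top:top + h, left:left + w] = patch
    return out


def is_targeted_success(classify, image, target_class):
    """Check whether a classifier assigns `image` to `target_class`.

    `classify` is a hypothetical callable mapping an image array to a label.
    """
    return classify(image) == target_class
```

With a real classifier in place of `classify`, one would compare the prediction on the source image with the prediction on `paste_patch(source, patch, top, left)` to see whether the pasted natural image alone shifts the label to the intended target.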

Comments:	Code is available at: this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2110.03605 [cs.LG]
	(or arXiv:2110.03605v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2110.03605

Submission history

From: Maxwell Nadeau
[v1] Thu, 7 Oct 2021 16:33:11 UTC (9,245 KB)
[v2] Mon, 11 Oct 2021 19:09:55 UTC (9,245 KB)
[v3] Fri, 28 Jan 2022 18:04:56 UTC (9,597 KB)
[v4] Thu, 2 Jun 2022 02:45:59 UTC (11,461 KB)
[v5] Sun, 16 Oct 2022 19:41:06 UTC (13,014 KB)
[v6] Sat, 7 Jan 2023 23:40:24 UTC (13,014 KB)
[v7] Mon, 11 Sep 2023 16:31:55 UTC (13,014 KB)
