From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks

Eger, Steffen; Benz, Yannik

arXiv:2010.05648 (cs)

[Submitted on 12 Oct 2020 (v1), last revised 28 Oct 2020 (this version, v2)]

Title: From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks

Authors: Steffen Eger, Yannik Benz

Abstract: Adversarial attacks are label-preserving modifications to inputs of machine learning classifiers designed to fool machines but not humans. Natural Language Processing (NLP) has mostly focused on high-level attack scenarios such as paraphrasing input texts. We argue that these are less realistic in typical application scenarios such as in social media, and instead focus on low-level attacks on the character-level. Guided by human cognitive abilities and human robustness, we propose the first large-scale catalogue and benchmark of low-level adversarial attacks, which we dub Zéroe, encompassing nine different attack modes including visual and phonetic adversaries. We show that RoBERTa, NLP's current workhorse, fails on our attacks. Our dataset provides a benchmark for testing robustness of future more human-like NLP models.

Comments:	Authors accidentally in wrong order; cannot be undone due to conference constraints. Accepted for publication at AACL 2020
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2010.05648 [cs.CL]
	(or arXiv:2010.05648v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.05648

Submission history

From: Steffen Eger
[v1] Mon, 12 Oct 2020 12:35:36 UTC (1,141 KB)
[v2] Wed, 28 Oct 2020 12:53:05 UTC (1,141 KB)

