Exploring the Interplay of Interpretability and Robustness in Deep Neural Networks: A Saliency-guided Approach

Guesmi, Amira; Aswani, Nishant Suresh; Shafique, Muhammad


arXiv:2405.06278v1 (cs)

[Submitted on 10 May 2024]


Abstract: Adversarial attacks pose a significant challenge to deploying deep learning models in safety-critical applications. Maintaining model robustness while ensuring interpretability is vital for fostering trust in and comprehension of these models. This study investigates how Saliency-guided Training (SGT), a technique aimed at improving the clarity of saliency maps to deepen understanding of the model's decision-making process, affects model robustness. Experiments were conducted on standard benchmark datasets using various deep learning architectures trained with and without SGT. Findings demonstrate that SGT enhances both model robustness and interpretability. Additionally, we propose a novel approach combining SGT with standard adversarial training to achieve even greater robustness while preserving saliency map quality. Our strategy is grounded in the assumption that preserving salient features crucial for correctly classifying adversarial examples enhances model robustness, while masking non-relevant features improves interpretability. Our technique yields significant gains, improving robustness against the PGD attack by 35% on MNIST (noise magnitude 0.2) and by 20% on CIFAR-10 (noise magnitude 0.02), while producing high-quality saliency maps.
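
The two ingredients named in the abstract, masking low-saliency input features and attacking with PGD under a bounded noise magnitude, can be illustrated on a toy model. The following is a minimal sketch, not the authors' implementation: it assumes a three-feature logistic classifier in place of a deep network, uses the absolute input gradient as the saliency score, and assumes an L-infinity PGD attack with inputs in [0, 1]. All function names, weights, and parameter values are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_loss(w, b, x, y):
    """Binary cross-entropy loss of a logistic model on one example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_gradient(w, b, x, y):
    """Gradient of the loss with respect to the input x (closed form for this toy model)."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def mask_low_saliency(w, b, x, y, k, fill=0.0):
    """Replace the k features with the smallest |gradient| by a fill value."""
    g = input_gradient(w, b, x, y)
    order = sorted(range(len(x)), key=lambda i: abs(g[i]))
    masked = list(x)
    for i in order[:k]:
        masked[i] = fill
    return masked

def pgd_attack(w, b, x, y, eps, step, iters):
    """L-infinity PGD: signed gradient ascent on the loss, projected into the eps-ball around x."""
    x_adv = list(x)
    for _ in range(iters):
        g = input_gradient(w, b, x_adv, y)
        for i in range(len(x)):
            sign = (g[i] > 0) - (g[i] < 0)
            v = x_adv[i] + step * sign
            v = min(max(v, x[i] - eps), x[i] + eps)  # project onto the eps-ball
            x_adv[i] = min(max(v, 0.0), 1.0)         # keep a valid input range
    return x_adv

# Hypothetical toy model and input (illustrative values only).
w, b = [2.0, -1.0, 0.05], 0.0
x, y = [0.8, 0.2, 0.5], 1

masked = mask_low_saliency(w, b, x, y, k=1)
x_adv = pgd_attack(w, b, x, y, eps=0.2, step=0.05, iters=10)
```

In this sketch, `masked` zeroes out the feature whose gradient magnitude is smallest, and `x_adv` stays within 0.2 of `x` in every coordinate while increasing the loss. A training loop that combines the two ideas would compute the loss on adversarial examples and on their saliency-masked counterparts; how the paper weights or combines these terms is not stated in the abstract.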

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
Cite as:	arXiv:2405.06278 [cs.CV]
	(or arXiv:2405.06278v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2405.06278

Submission history

From: Amira Guesmi
[v1] Fri, 10 May 2024 07:21:03 UTC (4,732 KB)
