Assessing Systematic Weaknesses of DNNs using Counterfactuals
Authors:
Sujan Sai Gannamaneni,
Michael Mock,
Maram Akila
Abstract:
With the advancement of DNNs into safety-critical applications, testing approaches for such models have gained more attention. A current direction is the search for and identification of systematic weaknesses that put safety assumptions based on average performance values at risk. Such weaknesses can take on the form of (semantically coherent) subsets or areas in the input space where a DNN perfor…
▽ More
With the advancement of DNNs into safety-critical applications, testing approaches for such models have gained more attention. A current direction is the search for and identification of systematic weaknesses that put safety assumptions based on average performance values at risk. Such weaknesses can take on the form of (semantically coherent) subsets or areas in the input space where a DNN performs systematically worse than its expected average. However, it is non-trivial to attribute the reason for such observed low performances to the specific semantic features that describe the subset. For instance, inhomogeneities within the data w.r.t. other (non-considered) attributes might distort results. However, taking into account all (available) attributes and their interaction is often computationally highly expensive. Inspired by counterfactual explanations, we propose an effective and computationally cheap algorithm to validate the semantic attribution of existing subsets, i.e., to check whether the identified attribute is likely to have caused the degraded performance. We demonstrate this approach on an example from the autonomous driving domain using highly annotated simulated data, where we show for a semantic segmentation model that (i) performance differences among the different pedestrian assets exist, but (ii) only in some cases is the asset type itself the reason for this reduction in the performance.
△ Less
Submitted 3 August, 2023;
originally announced August 2023.
Inspect, Understand, Overcome: A Survey of Practical Methods for AI Safety
Authors:
Sebastian Houben,
Stephanie Abrecht,
Maram Akila,
Andreas Bär,
Felix Brockherde,
Patrick Feifel,
Tim Fingscheidt,
Sujan Sai Gannamaneni,
Seyed Eghbal Ghobadi,
Ahmed Hammam,
Anselm Haselhoff,
Felix Hauser,
Christian Heinzemann,
Marco Hoffmann,
Nikhil Kapoor,
Falk Kappel,
Marvin Klingner,
Jan Kronenberger,
Fabian Küppers,
Jonas Löhdefink,
Michael Mlynarski,
Michael Mock,
Firas Mualla,
Svetlana Pavlitskaya,
Maximilian Poretschkin
, et al. (16 additional authors not shown)
Abstract:
The use of deep neural networks (DNNs) in safety-critical applications like mobile health and autonomous driving is challenging due to numerous model-inherent shortcomings. These shortcomings are diverse and range from a lack of generalization over insufficient interpretability to problems with malicious inputs. Cyber-physical systems employing DNNs are therefore likely to suffer from safety conce…
▽ More
The use of deep neural networks (DNNs) in safety-critical applications like mobile health and autonomous driving is challenging due to numerous model-inherent shortcomings. These shortcomings are diverse and range from a lack of generalization over insufficient interpretability to problems with malicious inputs. Cyber-physical systems employing DNNs are therefore likely to suffer from safety concerns. In recent years, a zoo of state-of-the-art techniques aiming to address these safety concerns has emerged. This work provides a structured and broad overview of them. We first identify categories of insufficiencies to then describe research activities aiming at their detection, quantification, or mitigation. Our paper addresses both machine learning experts and safety engineers: The former ones might profit from the broad range of machine learning topics covered and discussions on limitations of recent methods. The latter ones might gain insights into the specifics of modern ML methods. We moreover hope that our contribution fuels discussions on desiderata for ML systems and strategies on how to propel existing approaches accordingly.
△ Less
Submitted 29 April, 2021;
originally announced April 2021.