Search | arXiv e-print repository

Certificates of Differential Privacy and Unlearning for Gradient-Based Training

Authors: Matthew Wicker, Philip Sosnin, Adrianna Janik, Mark N. Müller, Adrian Weller, Calvin Tsay

Abstract: Proper data stewardship requires that model owners protect the privacy of individuals' data used during training. Whether through anonymization with differential privacy or the use of unlearning in non-anonymized settings, the gold-standard techniques for providing privacy guarantees can come with significant performance penalties or be too weak to provide practical assurances. In part, this is du… ▽ More Proper data stewardship requires that model owners protect the privacy of individuals' data used during training. Whether through anonymization with differential privacy or the use of unlearning in non-anonymized settings, the gold-standard techniques for providing privacy guarantees can come with significant performance penalties or be too weak to provide practical assurances. In part, this is due to the fact that the guarantee provided by differential privacy represents the worst-case privacy leakage for any individual, while the true privacy leakage of releasing the prediction for a given individual might be substantially smaller or even, as we show, non-existent. This work provides a novel framework based on convex relaxations and bounds propagation that can compute formal guarantees (certificates) that releasing specific predictions satisfies $ε=0$ privacy guarantees or do not depend on data that is subject to an unlearning request. Our framework offers a new verification-centric approach to privacy and unlearning guarantees, that can be used to further engender user trust with tighter privacy guarantees, provide formal proofs of robustness to certain membership inference attacks, identify potentially vulnerable records, and enhance current unlearning approaches. We validate the effectiveness of our approach on tasks from financial services, medical imaging, and natural language processing. △ Less

Submitted 19 June, 2024; originally announced June 2024.

Comments: 15 pages, 14 figures

arXiv:2406.05670 [pdf, other]

Certified Robustness to Data Poisoning in Gradient-Based Training

Authors: Philip Sosnin, Mark N. Müller, Maximilian Baader, Calvin Tsay, Matthew Wicker

Abstract: Modern machine learning pipelines leverage large amounts of public data, making it infeasible to guarantee data quality and leaving models open to poisoning and backdoor attacks. However, provably bounding model behavior under such attacks remains an open problem. In this work, we address this challenge and develop the first framework providing provable guarantees on the behavior of models trained… ▽ More Modern machine learning pipelines leverage large amounts of public data, making it infeasible to guarantee data quality and leaving models open to poisoning and backdoor attacks. However, provably bounding model behavior under such attacks remains an open problem. In this work, we address this challenge and develop the first framework providing provable guarantees on the behavior of models trained with potentially manipulated data. In particular, our framework certifies robustness against untargeted and targeted poisoning as well as backdoor attacks for both input and label manipulations. Our method leverages convex relaxations to over-approximate the set of all possible parameter updates for a given poisoning threat model, allowing us to bound the set of all reachable parameters for any gradient-based learning algorithm. Given this set of parameters, we provide bounds on worst-case behavior, including model performance and backdoor success rate. We demonstrate our approach on multiple real-world datasets from applications including energy consumption, medical imaging, and autonomous driving. △ Less

Submitted 9 June, 2024; originally announced June 2024.

Comments: 15 pages, 5 figures

arXiv:2311.11911 [pdf, other]

Certification of Distributional Individual Fairness

Authors: Matthew Wicker, Vihari Piratia, Adrian Weller

Abstract: Providing formal guarantees of algorithmic fairness is of paramount importance to socially responsible deployment of machine learning algorithms. In this work, we study formal guarantees, i.e., certificates, for individual fairness (IF) of neural networks. We start by introducing a novel convex approximation of IF constraints that exponentially decreases the computational cost of providing formal… ▽ More Providing formal guarantees of algorithmic fairness is of paramount importance to socially responsible deployment of machine learning algorithms. In this work, we study formal guarantees, i.e., certificates, for individual fairness (IF) of neural networks. We start by introducing a novel convex approximation of IF constraints that exponentially decreases the computational cost of providing formal guarantees of local individual fairness. We highlight that prior methods are constrained by their focus on global IF certification and can therefore only scale to models with a few dozen hidden neurons, thus limiting their practical impact. We propose to certify distributional individual fairness which ensures that for a given empirical distribution and all distributions within a $γ$-Wasserstein ball, the neural network has guaranteed individually fair predictions. Leveraging developments in quasi-convex optimization, we provide novel and efficient certified bounds on distributional individual fairness and show that our method allows us to certify and regularize neural networks that are several orders of magnitude larger than those considered by prior works. Moreover, we study real-world distribution shifts and find our bounds to be a scalable, practical, and sound source of IF guarantees. △ Less

Submitted 20 November, 2023; originally announced November 2023.

Comments: 21 Pages, Neural Information Processing Systems 2023

arXiv:2310.01951 [pdf, other]

Probabilistic Reach-Avoid for Bayesian Neural Networks

Authors: Matthew Wicker, Luca Laurenti, Andrea Patane, Nicola Paoletti, Alessandro Abate, Marta Kwiatkowska

Abstract: Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: f… ▽ More Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: first, computing reach-avoid probabilities for iterative predictions made with dynamical models, with dynamics described by Bayesian neural network (BNN); second, synthesising control policies that are optimal with respect to a given reach-avoid specification (reaching a "target" state, while avoiding a set of "unsafe" states) and a learned BNN model. Our solution leverages interval propagation and backward recursion techniques to compute lower bounds for the probability that a policy's sequence of actions leads to satisfying the reach-avoid specification. Such computed lower bounds provide safety certification for the given policy and BNN model. We then introduce control synthesis algorithms to derive policies maximizing said lower bounds on the safety probability. We demonstrate the effectiveness of our method on a series of control benchmarks characterized by learned BNN dynamics models. On our most challenging benchmark, compared to purely data-driven policies the optimal synthesis algorithm is able to provide more than a four-fold increase in the number of certifiable states and more than a three-fold increase in the average guaranteed reach-avoid probability. △ Less

Submitted 3 October, 2023; originally announced October 2023.

Comments: 47 pages, 10 figures. arXiv admin note: text overlap with arXiv:2105.10134

arXiv:2306.13614 [pdf, other]

Adversarial Robustness Certification for Bayesian Neural Networks

Authors: Matthew Wicker, Andrea Patane, Luca Laurenti, Marta Kwiatkowska

Abstract: We study the problem of certifying the robustness of Bayesian neural networks (BNNs) to adversarial input perturbations. Given a compact set of input points $T \subseteq \mathbb{R}^m$ and a set of output points $S \subseteq \mathbb{R}^n$, we define two notions of robustness for BNNs in an adversarial setting: probabilistic robustness and decision robustness. Probabilistic robustness is the probabi… ▽ More We study the problem of certifying the robustness of Bayesian neural networks (BNNs) to adversarial input perturbations. Given a compact set of input points $T \subseteq \mathbb{R}^m$ and a set of output points $S \subseteq \mathbb{R}^n$, we define two notions of robustness for BNNs in an adversarial setting: probabilistic robustness and decision robustness. Probabilistic robustness is the probability that for all points in $T$ the output of a BNN sampled from the posterior is in $S$. On the other hand, decision robustness considers the optimal decision of a BNN and checks if for all points in $T$ the optimal decision of the BNN for a given loss function lies within the output set $S$. Although exact computation of these robustness properties is challenging due to the probabilistic and non-convex nature of BNNs, we present a unified computational framework for efficiently and formally bounding them. Our approach is based on weight interval sampling, integration, and bound propagation techniques, and can be applied to BNNs with a large number of parameters, and independently of the (approximate) inference method employed to train the BNN. We evaluate the effectiveness of our methods on various regression and classification tasks, including an industrial regression benchmark, MNIST, traffic sign recognition, and airborne collision avoidance, and demonstrate that our approach enables certification of robustness and uncertainty of BNN predictions. △ Less

Submitted 23 June, 2023; originally announced June 2023.

arXiv:2304.10828 [pdf, other]

Individual Fairness in Bayesian Neural Networks

Authors: Alice Doherty, Matthew Wicker, Luca Laurenti, Andrea Patane

Abstract: We study Individual Fairness (IF) for Bayesian neural networks (BNNs). Specifically, we consider the $ε$-$δ$-individual fairness notion, which requires that, for any pair of input points that are $ε$-similar according to a given similarity metrics, the output of the BNN is within a given tolerance $δ>0.$ We leverage bounds on statistical sampling over the input space and the relationship between a… ▽ More We study Individual Fairness (IF) for Bayesian neural networks (BNNs). Specifically, we consider the $ε$-$δ$-individual fairness notion, which requires that, for any pair of input points that are $ε$-similar according to a given similarity metrics, the output of the BNN is within a given tolerance $δ>0.$ We leverage bounds on statistical sampling over the input space and the relationship between adversarial robustness and individual fairness to derive a framework for the systematic estimation of $ε$-$δ$-IF, designing Fair-FGSM and Fair-PGD as global,fairness-aware extensions to gradient-based attacks for BNNs. We empirically study IF of a variety of approximately inferred BNNs with different architectures on fairness benchmarks, and compare against deterministic models learnt using frequentist techniques. Interestingly, we find that BNNs trained by means of approximate Bayesian inference consistently tend to be markedly more individually fair than their deterministic counterparts. △ Less

Submitted 21 April, 2023; originally announced April 2023.

arXiv:2303.06419 [pdf, other]

Use Perturbations when Learning from Explanations

Authors: Juyeon Heo, Vihari Piratla, Matthew Wicker, Adrian Weller

Abstract: Machine learning from explanations (MLX) is an approach to learning that uses human-provided explanations of relevant or irrelevant features for each input to ensure that model predictions are right for the right reasons. Existing MLX approaches rely on local model interpretation methods and require strong model smoothing to align model and human explanations, leading to sub-optimal performance. W… ▽ More Machine learning from explanations (MLX) is an approach to learning that uses human-provided explanations of relevant or irrelevant features for each input to ensure that model predictions are right for the right reasons. Existing MLX approaches rely on local model interpretation methods and require strong model smoothing to align model and human explanations, leading to sub-optimal performance. We recast MLX as a robustness problem, where human explanations specify a lower dimensional manifold from which perturbations can be drawn, and show both theoretically and empirically how this approach alleviates the need for strong model smoothing. We consider various approaches to achieving robustness, leading to improved performance over prior MLX methods. Finally, we show how to combine robustness with an earlier MLX method, yielding state-of-the-art results on both synthetic and real-world benchmarks. △ Less

Submitted 1 December, 2023; v1 submitted 11 March, 2023; originally announced March 2023.

Comments: NeurIPS 2023; https://github.com/vihari/robust_mlx

arXiv:2212.08507 [pdf, other]

Robust Explanation Constraints for Neural Networks

Authors: Matthew Wicker, Juyeon Heo, Luca Costabello, Adrian Weller

Abstract: Post-hoc explanation methods are used with the intent of providing insights about neural networks and are sometimes said to help engender trust in their outputs. However, popular explanations methods have been found to be fragile to minor perturbations of input features or model parameters. Relying on constraint relaxation techniques from non-convex optimization, we develop a method that upper-bou… ▽ More Post-hoc explanation methods are used with the intent of providing insights about neural networks and are sometimes said to help engender trust in their outputs. However, popular explanations methods have been found to be fragile to minor perturbations of input features or model parameters. Relying on constraint relaxation techniques from non-convex optimization, we develop a method that upper-bounds the largest change an adversary can make to a gradient-based explanation via bounded manipulation of either the input features or model parameters. By propagating a compact input or parameter set as symbolic intervals through the forwards and backwards computations of the neural network we can formally certify the robustness of gradient-based explanations. Our bounds are differentiable, hence we can incorporate provable explanation robustness into neural network training. Empirically, our method surpasses the robustness provided by previous heuristic approaches. We find that our training method is the only method able to learn neural networks with certificates of explanation robustness across all six datasets tested. △ Less

Submitted 16 December, 2022; originally announced December 2022.

Comments: 23 pages, 12 figures

arXiv:2210.17406 [pdf, other]

Emergent Linguistic Structures in Neural Networks are Fragile

Authors: Emanuele La Malfa, Matthew Wicker, Marta Kwiatkowska

Abstract: Large Language Models (LLMs) have been reported to have strong performance on natural language processing tasks. However, performance metrics such as accuracy do not measure the quality of the model in terms of its ability to robustly represent complex linguistic structures. In this paper, focusing on the ability of language models to represent syntax, we propose a framework to assess the consiste… ▽ More Large Language Models (LLMs) have been reported to have strong performance on natural language processing tasks. However, performance metrics such as accuracy do not measure the quality of the model in terms of its ability to robustly represent complex linguistic structures. In this paper, focusing on the ability of language models to represent syntax, we propose a framework to assess the consistency and robustness of linguistic representations. To this end, we introduce measures of robustness of neural network models that leverage recent advances in extracting linguistic constructs from LLMs via probing tasks, i.e., simple tasks used to extract meaningful information about a single facet of a language model, such as syntax reconstruction and root identification. Empirically, we study the performance of four LLMs across six different corpora on the proposed robustness measures by analysing their performance and robustness with respect to syntax-preserving perturbations. We provide evidence that context-free representation (e.g., GloVe) are in some cases competitive with context-dependent representations from modern LLMs (e.g., BERT), yet equally brittle to syntax-preserving perturbations. Our key observation is that emergent syntactic representations in neural networks are brittle. We make the code, trained models and logs available to the community as a contribution to the debate about the capabilities of LLMs. △ Less

Submitted 31 May, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

arXiv:2207.06154 [pdf, other]

On the Robustness of Bayesian Neural Networks to Adversarial Attacks

Authors: Luca Bortolussi, Ginevra Carbone, Luca Laurenti, Andrea Patane, Guido Sanguinetti, Matthew Wicker

Abstract: Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian… ▽ More Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit BNN posteriors are robust to gradient-based adversarial attacks. Crucially, we prove that the expected gradient of the loss with respect to the BNN posterior distribution is vanishing, even when each neural network sampled from the posterior is vulnerable to gradient-based attacks. Experimental results on the MNIST, Fashion MNIST, and half moons datasets, representing the finite data regime, with BNNs trained with Hamiltonian Monte Carlo and Variational Inference, support this line of arguments, showing that BNNs can display both high accuracy on clean data and robustness to both gradient-based and gradient-free based adversarial attacks. △ Less

Submitted 28 February, 2024; v1 submitted 13 July, 2022; originally announced July 2022.

Comments: arXiv admin note: text overlap with arXiv:2002.04359

arXiv:2205.05763 [pdf, other]

Individual Fairness Guarantees for Neural Networks

Authors: Elias Benussi, Andrea Patane, Matthew Wicker, Luca Laurenti, Marta Kwiatkowska

Abstract: We consider the problem of certifying the individual fairness (IF) of feed-forward neural networks (NNs). In particular, we work with the $ε$-$δ$-IF formulation, which, given a NN and a similarity metric learnt from data, requires that the output difference between any pair of $ε$-similar individuals is bounded by a maximum decision tolerance $δ\geq 0$. Working with a range of metrics, including t… ▽ More We consider the problem of certifying the individual fairness (IF) of feed-forward neural networks (NNs). In particular, we work with the $ε$-$δ$-IF formulation, which, given a NN and a similarity metric learnt from data, requires that the output difference between any pair of $ε$-similar individuals is bounded by a maximum decision tolerance $δ\geq 0$. Working with a range of metrics, including the Mahalanobis distance, we propose a method to overapproximate the resulting optimisation problem using piecewise-linear functions to lower and upper bound the NN's non-linearities globally over the input space. We encode this computation as the solution of a Mixed-Integer Linear Programming problem and demonstrate that it can be used to compute IF guarantees on four datasets widely used for fairness benchmarking. We show how this formulation can be used to encourage models' fairness at training time by modifying the NN loss, and empirically confirm our approach yields NNs that are orders of magnitude fairer than state-of-the-art methods. △ Less

Submitted 11 May, 2022; originally announced May 2022.

arXiv:2204.14170 [pdf, other]

Tractable Uncertainty for Structure Learning

Authors: Benjie Wang, Matthew Wicker, Marta Kwiatkowska

Abstract: Bayesian structure learning allows one to capture uncertainty over the causal directed acyclic graph (DAG) responsible for generating given data. In this work, we present Tractable Uncertainty for STructure learning (TRUST), a framework for approximate posterior inference that relies on probabilistic circuits as the representation of our posterior belief. In contrast to sample-based posterior appr… ▽ More Bayesian structure learning allows one to capture uncertainty over the causal directed acyclic graph (DAG) responsible for generating given data. In this work, we present Tractable Uncertainty for STructure learning (TRUST), a framework for approximate posterior inference that relies on probabilistic circuits as the representation of our posterior belief. In contrast to sample-based posterior approximations, our representation can capture a much richer space of DAGs, while also being able to tractably reason about the uncertainty through a range of useful inference queries. We empirically show how probabilistic circuits can be used as an augmented representation for structure learning methods, leading to improvement in both the quality of inferred structures and posterior uncertainty. Experimental results on conditional query answering further demonstrate the practical utility of the representational capacity of TRUST. △ Less

Submitted 1 July, 2022; v1 submitted 29 April, 2022; originally announced April 2022.

Comments: ICML 2022 (long talk); 20 pages

arXiv:2105.10134 [pdf, other]

Certification of Iterative Predictions in Bayesian Neural Networks

Authors: Matthew Wicker, Luca Laurenti, Andrea Patane, Nicola Paoletti, Alessandro Abate, Marta Kwiatkowska

Abstract: We consider the problem of computing reach-avoid probabilities for iterative predictions made with Bayesian neural network (BNN) models. Specifically, we leverage bound propagation techniques and backward recursion to compute lower bounds for the probability that trajectories of the BNN model reach a given set of states while avoiding a set of unsafe states. We use the lower bounds in the context… ▽ More We consider the problem of computing reach-avoid probabilities for iterative predictions made with Bayesian neural network (BNN) models. Specifically, we leverage bound propagation techniques and backward recursion to compute lower bounds for the probability that trajectories of the BNN model reach a given set of states while avoiding a set of unsafe states. We use the lower bounds in the context of control and reinforcement learning to provide safety certification for given control policies, as well as to synthesize control policies that improve the certification bounds. On a set of benchmarks, we demonstrate that our framework can be employed to certify policies over BNNs predictions for problems of more than $10$ dimensions, and to effectively synthesize policies that significantly increase the lower bound on the satisfaction probability. △ Less

Submitted 19 June, 2021; v1 submitted 21 May, 2021; originally announced May 2021.

Comments: Accepted, UAI 2021. 17 pages

arXiv:2102.05289 [pdf, other]

Bayesian Inference with Certifiable Adversarial Robustness

Authors: Matthew Wicker, Luca Laurenti, Andrea Patane, Zhoutong Chen, Zheng Zhang, Marta Kwiatkowska

Abstract: We consider adversarial training of deep neural networks through the lens of Bayesian learning, and present a principled framework for adversarial training of Bayesian Neural Networks (BNNs) with certifiable guarantees. We rely on techniques from constraint relaxation of non-convex optimisation problems and modify the standard cross-entropy error model to enforce posterior robustness to worst-case… ▽ More We consider adversarial training of deep neural networks through the lens of Bayesian learning, and present a principled framework for adversarial training of Bayesian Neural Networks (BNNs) with certifiable guarantees. We rely on techniques from constraint relaxation of non-convex optimisation problems and modify the standard cross-entropy error model to enforce posterior robustness to worst-case perturbations in $ε$-balls around input points. We illustrate how the resulting framework can be combined with methods commonly employed for approximate inference of BNNs. In an empirical investigation, we demonstrate that the presented approach enables training of certifiably robust models on MNIST, FashionMNIST and CIFAR-10 and can also be beneficial for uncertainty calibration. Our method is the first to directly train certifiable BNNs, thus facilitating their deployment in safety-critical applications. △ Less

Submitted 22 February, 2021; v1 submitted 10 February, 2021; originally announced February 2021.

Comments: Accepted AISTATS2021

arXiv:2012.12640 [pdf, other]

Gradient-Free Adversarial Attacks for Bayesian Neural Networks

Authors: Matthew Yuan, Matthew Wicker, Luca Laurenti

Abstract: The existence of adversarial examples underscores the importance of understanding the robustness of machine learning models. Bayesian neural networks (BNNs), due to their calibrated uncertainty, have been shown to posses favorable adversarial robustness properties. However, when approximate Bayesian inference methods are employed, the adversarial robustness of BNNs is still not well understood. In… ▽ More The existence of adversarial examples underscores the importance of understanding the robustness of machine learning models. Bayesian neural networks (BNNs), due to their calibrated uncertainty, have been shown to posses favorable adversarial robustness properties. However, when approximate Bayesian inference methods are employed, the adversarial robustness of BNNs is still not well understood. In this work, we employ gradient-free optimization methods in order to find adversarial examples for BNNs. In particular, we consider genetic algorithms, surrogate models, as well as zeroth order optimization methods and adapt them to the goal of finding adversarial examples for BNNs. In an empirical evaluation on the MNIST and Fashion MNIST datasets, we show that for various approximate Bayesian inference methods the usage of gradient-free algorithms can greatly improve the rate of finding adversarial examples compared to state-of-the-art gradient-based methods. △ Less

Submitted 23 December, 2020; originally announced December 2020.

Comments: 6 Pages, 2 Figures, AABI2021

arXiv:2004.10281 [pdf, other]

Probabilistic Safety for Bayesian Neural Networks

Authors: Matthew Wicker, Luca Laurenti, Andrea Patane, Marta Kwiatkowska

Abstract: We study probabilistic safety for Bayesian Neural Networks (BNNs) under adversarial input perturbations. Given a compact set of input points, $T \subseteq \mathbb{R}^m$, we study the probability w.r.t. the BNN posterior that all the points in $T$ are mapped to the same region $S$ in the output space. In particular, this can be used to evaluate the probability that a network sampled from the BNN is… ▽ More We study probabilistic safety for Bayesian Neural Networks (BNNs) under adversarial input perturbations. Given a compact set of input points, $T \subseteq \mathbb{R}^m$, we study the probability w.r.t. the BNN posterior that all the points in $T$ are mapped to the same region $S$ in the output space. In particular, this can be used to evaluate the probability that a network sampled from the BNN is vulnerable to adversarial attacks. We rely on relaxation techniques from non-convex optimization to develop a method for computing a lower bound on probabilistic safety for BNNs, deriving explicit procedures for the case of interval and linear function propagation techniques. We apply our methods to BNNs trained on a regression task, airborne collision avoidance, and MNIST, empirically showing that our approach allows one to certify probabilistic safety of BNNs with millions of parameters. △ Less

Submitted 18 June, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

Comments: UAI 2020; 13 pages, 5 figures, 1 table

arXiv:2002.04359 [pdf, other]

Robustness of Bayesian Neural Networks to Gradient-Based Attacks

Authors: Ginevra Carbone, Matthew Wicker, Luca Laurenti, Andrea Patane, Luca Bortolussi, Guido Sanguinetti

Abstract: Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, the problem remains open. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparametrized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerabi… ▽ More Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, the problem remains open. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparametrized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in the limit BNN posteriors are robust to gradient-based adversarial attacks. Experimental results on the MNIST and Fashion MNIST datasets with BNNs trained with Hamiltonian Monte Carlo and Variational Inference support this line of argument, showing that BNNs can display both high accuracy and robustness to gradient based adversarial attacks. △ Less

Submitted 24 June, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

arXiv:1909.09884 [pdf, other]

Uncertainty Quantification with Statistical Guarantees in End-to-End Autonomous Driving Control

Authors: Rhiannon Michelmore, Matthew Wicker, Luca Laurenti, Luca Cardelli, Yarin Gal, Marta Kwiatkowska

Abstract: Deep neural network controllers for autonomous driving have recently benefited from significant performance improvements, and have begun deployment in the real world. Prior to their widespread adoption, safety guarantees are needed on the controller behaviour that properly take account of the uncertainty within the model as well as sensor noise. Bayesian neural networks, which assume a prior over… ▽ More Deep neural network controllers for autonomous driving have recently benefited from significant performance improvements, and have begun deployment in the real world. Prior to their widespread adoption, safety guarantees are needed on the controller behaviour that properly take account of the uncertainty within the model as well as sensor noise. Bayesian neural networks, which assume a prior over the weights, have been shown capable of producing such uncertainty measures, but properties surrounding their safety have not yet been quantified for use in autonomous driving scenarios. In this paper, we develop a framework based on a state-of-the-art simulator for evaluating end-to-end Bayesian controllers. In addition to computing pointwise uncertainty measures that can be computed in real time and with statistical guarantees, we also provide a method for estimating the probability that, given a scenario, the controller keeps the car safe within a finite horizon. We experimentally evaluate the quality of uncertainty computation by several Bayesian inference methods in different scenarios and show how the uncertainty measures can be combined and calibrated for use in collision avoidance. Our results suggest that uncertainty estimates can greatly aid decision making in autonomous driving. △ Less

Submitted 21 September, 2019; originally announced September 2019.

Comments: 7 pages, 3 figures, submitted to ICRA 2020

arXiv:1904.00923 [pdf, other]

Robustness of 3D Deep Learning in an Adversarial Setting

Authors: Matthew Wicker, Marta Kwiatkowska

Abstract: Understanding the spatial arrangement and nature of real-world objects is of paramount importance to many complex engineering tasks, including autonomous navigation. Deep learning has revolutionized state-of-the-art performance for tasks in 3D environments; however, relatively little is known about the robustness of these approaches in an adversarial setting. The lack of comprehensive analysis mak… ▽ More Understanding the spatial arrangement and nature of real-world objects is of paramount importance to many complex engineering tasks, including autonomous navigation. Deep learning has revolutionized state-of-the-art performance for tasks in 3D environments; however, relatively little is known about the robustness of these approaches in an adversarial setting. The lack of comprehensive analysis makes it difficult to justify deployment of 3D deep learning models in real-world, safety-critical applications. In this work, we develop an algorithm for analysis of pointwise robustness of neural networks that operate on 3D data. We show that current approaches presented for understanding the resilience of state-of-the-art models vastly overestimate their robustness. We then use our algorithm to evaluate an array of state-of-the-art models in order to demonstrate their vulnerability to occlusion attacks. We show that, in the worst case, these networks can be reduced to 0% classification accuracy after the occlusion of at most 6.5% of the occupied input space. △ Less

Submitted 1 April, 2019; originally announced April 2019.

Comments: 10 pages, 8 figures, 1 table

arXiv:1903.01980 [pdf, other]

Statistical Guarantees for the Robustness of Bayesian Neural Networks

Authors: Luca Cardelli, Marta Kwiatkowska, Luca Laurenti, Nicola Paoletti, Andrea Patane, Matthew Wicker

Abstract: We introduce a probabilistic robustness measure for Bayesian Neural Networks (BNNs), defined as the probability that, given a test point, there exists a point within a bounded set such that the BNN prediction differs between the two. Such a measure can be used, for instance, to quantify the probability of the existence of adversarial examples. Building on statistical verification techniques for pr… ▽ More We introduce a probabilistic robustness measure for Bayesian Neural Networks (BNNs), defined as the probability that, given a test point, there exists a point within a bounded set such that the BNN prediction differs between the two. Such a measure can be used, for instance, to quantify the probability of the existence of adversarial examples. Building on statistical verification techniques for probabilistic models, we develop a framework that allows us to estimate probabilistic robustness for a BNN with statistical guarantees, i.e., with a priori error and confidence bounds. We provide experimental comparison for several approximate BNN inference techniques on image classification tasks associated to MNIST and a two-class subset of the GTSRB dataset. Our results enable quantification of uncertainty of BNN predictions in adversarial settings. △ Less

Submitted 5 March, 2019; originally announced March 2019.

Comments: 9 pages, 6 figures

arXiv:1807.03571 [pdf, other]

doi 10.1016/j.tcs.2019.05.046

A Game-Based Approximate Verification of Deep Neural Networks with Provable Guarantees

Authors: Min Wu, Matthew Wicker, Wenjie Ruan, Xiaowei Huang, Marta Kwiatkowska

Abstract: Despite the improved accuracy of deep neural networks, the discovery of adversarial examples has raised serious safety concerns. In this paper, we study two variants of pointwise robustness, the maximum safe radius problem, which for a given input sample computes the minimum distance to an adversarial example, and the feature robustness problem, which aims to quantify the robustness of individual… ▽ More Despite the improved accuracy of deep neural networks, the discovery of adversarial examples has raised serious safety concerns. In this paper, we study two variants of pointwise robustness, the maximum safe radius problem, which for a given input sample computes the minimum distance to an adversarial example, and the feature robustness problem, which aims to quantify the robustness of individual features to adversarial perturbations. We demonstrate that, under the assumption of Lipschitz continuity, both problems can be approximated using finite optimisation by discretising the input space, and the approximation has provable guarantees, i.e., the error is bounded. We then show that the resulting optimisation problems can be reduced to the solution of two-player turn-based games, where the first player selects features and the second perturbs the image within the feature. While the second player aims to minimise the distance to an adversarial example, depending on the optimisation objective the first player can be cooperative or competitive. We employ an anytime approach to solve the games, in the sense of approximating the value of a game by monotonically improving its upper and lower bounds. The Monte Carlo tree search algorithm is applied to compute upper bounds for both games, and the Admissible A* and the Alpha-Beta Pruning algorithms are, respectively, used to compute lower bounds for the maximum safety radius and feature robustness games. When working on the upper bound of the maximum safe radius problem, our tool demonstrates competitive performance against existing adversarial example crafting algorithms. Furthermore, we show how our framework can be deployed to evaluate pointwise robustness of neural networks in safety-critical applications such as traffic sign recognition in self-driving cars. △ Less

Submitted 6 March, 2019; v1 submitted 10 July, 2018; originally announced July 2018.

Journal ref: Theoretical Computer Science 807 (2020) 298-329

arXiv:1801.06900 [pdf, ps, other]

Efficient Learning of Optimal Markov Network Topology with k-Tree Modeling

Authors: Liang Ding, Di Chang, Russell Malmberg, Aaron Martinez, David Robinson, Matthew Wicker, Hongfei Yan, Liming Cai

Abstract: The seminal work of Chow and Liu (1968) shows that approximation of a finite probabilistic system by Markov trees can achieve the minimum information loss with the topology of a maximum spanning tree. Our current paper generalizes the result to Markov networks of tree width $\leq k$, for every fixed $k\geq 2$. In particular, we prove that approximation of a finite probabilistic system with such Ma… ▽ More The seminal work of Chow and Liu (1968) shows that approximation of a finite probabilistic system by Markov trees can achieve the minimum information loss with the topology of a maximum spanning tree. Our current paper generalizes the result to Markov networks of tree width $\leq k$, for every fixed $k\geq 2$. In particular, we prove that approximation of a finite probabilistic system with such Markov networks has the minimum information loss when the network topology is achieved with a maximum spanning $k$-tree. While constructing a maximum spanning $k$-tree is intractable for even $k=2$, we show that polynomial algorithms can be ensured by a sufficient condition accommodated by many meaningful applications. In particular, we prove an efficient algorithm for learning the optimal topology of higher order correlations among random variables that belong to an underlying linear structure. △ Less

Submitted 21 January, 2018; originally announced January 2018.

Comments: 18 pages main text, 2 pages appendix

arXiv:1710.07859 [pdf, other]

Feature-Guided Black-Box Safety Testing of Deep Neural Networks

Authors: Matthew Wicker, Xiaowei Huang, Marta Kwiatkowska

Abstract: Despite the improved accuracy of deep neural networks, the discovery of adversarial examples has raised serious safety concerns. Most existing approaches for crafting adversarial examples necessitate some knowledge (architecture, parameters, etc.) of the network at hand. In this paper, we focus on image classifiers and propose a feature-guided black-box approach to test the safety of deep neural n… ▽ More Despite the improved accuracy of deep neural networks, the discovery of adversarial examples has raised serious safety concerns. Most existing approaches for crafting adversarial examples necessitate some knowledge (architecture, parameters, etc.) of the network at hand. In this paper, we focus on image classifiers and propose a feature-guided black-box approach to test the safety of deep neural networks that requires no such knowledge. Our algorithm employs object detection techniques such as SIFT (Scale Invariant Feature Transform) to extract features from an image. These features are converted into a mutable saliency distribution, where high probability is assigned to pixels that affect the composition of the image with respect to the human visual system. We formulate the crafting of adversarial examples as a two-player turn-based stochastic game, where the first player's objective is to minimise the distance to an adversarial example by manipulating the features, and the second player can be cooperative, adversarial, or random. We show that, theoretically, the two-player game can con- verge to the optimal strategy, and that the optimal strategy represents a globally minimal adversarial image. For Lipschitz networks, we also identify conditions that provide safety guarantees that no adversarial examples exist. Using Monte Carlo tree search we gradually explore the game state space to search for adversarial examples. Our experiments show that, despite the black-box setting, manipulations guided by a perception-based saliency distribution are competitive with state-of-the-art methods that rely on white-box saliency matrices or sophisticated optimization procedures. Finally, we show how our method can be used to evaluate robustness of neural networks in safety-critical applications such as traffic sign recognition in self-driving cars. △ Less

Submitted 20 February, 2018; v1 submitted 21 October, 2017; originally announced October 2017.

Comments: 35 pages, 5 tables, 23 figures

Showing 1–23 of 23 results for author: Wicker, M