Search | arXiv e-print repository

arXiv:2403.14641 [pdf, other]

Testing autonomous vehicles and AI: perspectives and challenges from cybersecurity, transparency, robustness and fairness

Authors: David Fernández Llorca, Ronan Hamon, Henrik Junklewitz, Kathrin Grosse, Lars Kunze, Patrick Seiniger, Robert Swaim, Nick Reed, Alexandre Alahi, Emilia Gómez, Ignacio Sánchez, Akos Kriston

Abstract: This study explores the complexities of integrating Artificial Intelligence (AI) into Autonomous Vehicles (AVs), examining the challenges introduced by AI components and the impact on testing procedures, focusing on some of the essential requirements for trustworthy AI. Topics addressed include the role of AI at various operational layers of AVs, the implications of the EU's AI Act on AVs, and the need for new testing methodologies for Advanced Driver Assistance Systems (ADAS) and Automated Driving Systems (ADS). The study also provides a detailed analysis of the importance of cybersecurity audits, the need for explainability in AI decision-making processes and protocols for assessing the robustness and ethical behaviour of predictive systems in AVs. The paper identifies significant challenges and suggests future directions for research and development of AI in AV technology, highlighting the need for multidisciplinary expertise.

Submitted 21 February, 2024; originally announced March 2024.

Comments: 44 pages, 8 figures, submitted to a peer-review journal

arXiv:2312.13863 [pdf, other]

Manipulating Trajectory Prediction with Backdoors

Authors: Kaouther Messaoud, Kathrin Grosse, Mickael Chen, Matthieu Cord, Patrick Pérez, Alexandre Alahi

Abstract: Autonomous vehicles ought to predict the surrounding agents' trajectories to allow safe maneuvers in uncertain and complex traffic situations. As companies increasingly apply trajectory prediction in the real world, security becomes a relevant concern. In this paper, we focus on backdoors - a security threat acknowledged in other fields but so far overlooked for trajectory prediction. To this end, we describe and investigate four triggers that could affect trajectory prediction. We then show that these triggers (for example, a braking vehicle), when correlated with a desired output (for example, a curve) during training, cause the desired output of a state-of-the-art trajectory prediction model. In other words, the model has good benign performance but is vulnerable to backdoors. This is the case even if the trigger maneuver is performed by a non-causal agent behind the target vehicle. As a side-effect, our analysis reveals interesting limitations within trajectory prediction models. Finally, we evaluate a range of defenses against backdoors. While some, like simple offroad checks, do not enable detection for all triggers, clustering is a promising candidate to support manual inspection to find backdoors.

Submitted 3 January, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

Comments: 9 pages, 7 figures

arXiv:2311.09994 [pdf, other]

Towards more Practical Threat Models in Artificial Intelligence Security

Authors: Kathrin Grosse, Lukas Bieringer, Tarek Richard Besold, Alexandre Alahi

Abstract: Recent works have identified a gap between research and practice in artificial intelligence security: threats studied in academia do not always reflect the practical use and security risks of AI. For example, while models are often studied in isolation, they form part of larger ML pipelines in practice. Recent works also brought forward that adversarial manipulations introduced by academic attacks are impractical. We take a first step towards describing the full extent of this disparity. To this end, we revisit the threat models of the six most studied attacks in AI security research and match them to AI usage in practice via a survey with 271 industrial practitioners. On the one hand, we find that all existing threat models are indeed applicable. On the other hand, there are significant mismatches: research is often too generous with the attacker, assuming access to information not frequently available in real-world settings. Our paper is thus a call for action to study more practical threat models in artificial intelligence security.

Submitted 26 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

Comments: 18 pages, 4 figures, 8 tables, accepted to Usenix Security, incorporated external feedback

arXiv:2302.04623 [pdf]

Employing Channel Probing to Derive End-of-Life Service Margins for Optical Spectrum Services. To appear in OPTICA Journal of Optical Communications and Networking

Authors: K. Kaeval, F. Slyne, S. Troia, E. Kenny, K. Große, H. Griesser, D. C. Kilper, M. Ruffini, J-J Pedreno-Manresa, S. K. Patri, G. Jervan

Abstract: Optical Spectrum as a Service (OSaaS) spanning multiple transparent optical network domains can significantly reduce the investment and operational costs of the end-to-end service. Based on the black-link approach, these services are empowered by reconfigurable transceivers and the emerging disaggregation trend in optical transport networks. This work investigates the accuracy aspects of the channel probing method used in Generalized Signal to Noise Ratio (GSNR)-based OSaaS characterization in terrestrial brownfield systems. OSaaS service margins to accommodate impacts from enabling neighboring channels and end-of-life channel loads are experimentally derived in a systematic lab study carried out in the Open Ireland testbed. The applicability of the lab-derived margins is then verified in the HEAnet production network using a 400 GHz wide OSaaS. Finally, the probing accuracy is tested by depleting the GSNR margin through power adjustments utilizing the same 400 GHz OSaaS in the HEAnet live network. A minimum of 0.92 dB and 1.46 dB of service margin allocation is recommended to accommodate the impacts of enabling neighboring channels and end-of-life channel loads. A further 0.6 dB of GSNR margin should be allocated to compensate for probing inaccuracies.

Submitted 9 February, 2023; originally announced February 2023.

arXiv:2212.06123 [pdf, other]

A Survey on Reinforcement Learning Security with Application to Autonomous Driving

Authors: Ambra Demontis, Maura Pintor, Luca Demetrio, Kathrin Grosse, Hsiao-Ying Lin, Chengfang Fang, Battista Biggio, Fabio Roli

Abstract: Reinforcement learning allows machines to learn from their own experience. Nowadays, it is used in safety-critical applications, such as autonomous driving, despite being vulnerable to attacks carefully crafted either to prevent the reinforcement learning algorithm from learning an effective and reliable policy, or to induce the trained agent to make a wrong decision. The literature about the security of reinforcement learning is rapidly growing, and some surveys have been proposed to shed light on this field. However, their categorizations are insufficient for choosing an appropriate defense given the kind of system at hand. In our survey, we not only overcome this limitation by considering a different perspective, but also discuss the applicability of state-of-the-art attacks and defenses when reinforcement learning algorithms are used in the context of autonomous driving.

Submitted 12 December, 2022; originally announced December 2022.

arXiv:2207.05164 [pdf, other]

doi 10.1109/TIFS.2023.3251842

Machine Learning Security in Industry: A Quantitative Survey

Authors: Kathrin Grosse, Lukas Bieringer, Tarek Richard Besold, Battista Biggio, Katharina Krombholz

Abstract: Despite the large body of academic work on machine learning security, little is known about the occurrence of attacks on machine learning systems in the wild. In this paper, we report on a quantitative study with 139 industrial practitioners. We analyze attack occurrence and concern and evaluate statistical hypotheses on factors influencing threat perception and exposure. Our results shed light on real-world attacks on deployed machine learning. On the organizational level, while we find no predictors for threat exposure in our sample, the number of implemented defenses depends on exposure to threats or expected likelihood to become a target. We also provide a detailed analysis of practitioners' replies on the relevance of individual machine learning attacks, unveiling complex concerns like unreliable decision making, business information leakage, and bias introduction into models. Finally, we find that on the individual level, prior knowledge about machine learning security influences threat perception. Our work paves the way for more research about adversarial machine learning in practice, but also yields insights for regulation and auditing.

Submitted 10 March, 2023; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: Accepted at TIFS, version with more detailed appendix containing more detailed statistical results. 17 pages, 6 tables and 4 figures

arXiv:2205.01992 [pdf, other]

doi 10.1145/3585385

Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning

Authors: Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Sebastiano Vascon, Werner Zellinger, Bernhard A. Moser, Alina Oprea, Battista Biggio, Marcello Pelillo, Fabio Roli

Abstract: The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, assuming that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model's performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field in the last 15 years. We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research in poisoning, and shed light on the current limitations and open research questions in this research field.

Submitted 9 March, 2023; v1 submitted 4 May, 2022; originally announced May 2022.

Comments: 35 pages, Accepted at ACM Computing Surveys

arXiv:2204.05986 [pdf, other]

doi 10.1109/MC.2023.3299572

Machine Learning Security against Data Poisoning: Are We There Yet?

Authors: Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Battista Biggio, Fabio Roli, Marcello Pelillo

Abstract: The recent success of machine learning (ML) has been fueled by the increasing availability of computing power and large amounts of data in many different applications. However, the trustworthiness of the resulting models can be compromised when such data is maliciously manipulated to mislead the learning process. In this article, we first review poisoning attacks that compromise the training data used to learn ML models, including attacks that aim to reduce the overall performance, manipulate the predictions on specific test samples, and even implant backdoors in the model. We then discuss how to mitigate these attacks using basic security principles, or by deploying ML-oriented defensive mechanisms. We conclude our article by formulating some relevant open challenges which are hindering the development of testing methods and benchmarks suitable for assessing and improving the trustworthiness of ML models against data poisoning attacks.

Submitted 8 March, 2024; v1 submitted 12 April, 2022; originally announced April 2022.

Comments: preprint, 10 pages, 3 figures. Paper accepted to the IEEE Computer - Special Issue on Trustworthy AI

arXiv:2106.07214 [pdf, other]

Backdoor Learning Curves: Explaining Backdoor Poisoning Beyond Influence Functions

Authors: Antonio Emanuele Cinà, Kathrin Grosse, Sebastiano Vascon, Ambra Demontis, Battista Biggio, Fabio Roli, Marcello Pelillo

Abstract: Backdoor attacks inject poisoning samples during training, with the goal of forcing a machine learning model to output an attacker-chosen class when presented with a specific trigger at test time. Although backdoor attacks have been demonstrated in a variety of settings and against different models, the factors affecting their effectiveness are still not well understood. In this work, we provide a unifying framework to study the process of backdoor learning under the lens of incremental learning and influence functions. We show that the effectiveness of backdoor attacks depends on: (i) the complexity of the learning algorithm, controlled by its hyperparameters; (ii) the fraction of backdoor samples injected into the training set; and (iii) the size and visibility of the backdoor trigger. These factors affect how fast a model learns to correlate the presence of the backdoor trigger with the target class. Our analysis unveils the intriguing existence of a region in the hyperparameter space in which the accuracy on clean test samples is still high while backdoor attacks are ineffective, thereby suggesting novel criteria to improve existing defenses.

Submitted 16 March, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

Comments: preprint; 28 pages

arXiv:2105.03726 [pdf, other]

Mental Models of Adversarial Machine Learning

Authors: Lukas Bieringer, Kathrin Grosse, Michael Backes, Battista Biggio, Katharina Krombholz

Abstract: Although machine learning is widely used in practice, little is known about practitioners' understanding of potential security challenges. In this work, we close this substantial gap and contribute a qualitative study focusing on developers' mental models of the machine learning pipeline and potentially vulnerable components. Similar studies have helped in other security fields to discover root causes or improve risk communication. Our study reveals two facets of practitioners' mental models of machine learning security. Firstly, practitioners often confuse machine learning security with threats and defences that are not directly related to machine learning. Secondly, in contrast to most academic research, our participants perceive security of machine learning as not solely related to individual models, but rather in the context of entire workflows that consist of multiple components. Jointly with our additional findings, these two facets provide a foundation to substantiate mental models for machine learning security and have implications for the integration of adversarial machine learning into corporate workflows, decreasing practitioners' reported uncertainty, and appropriate regulatory frameworks for machine learning security.

Submitted 29 June, 2022; v1 submitted 8 May, 2021; originally announced May 2021.

Comments: accepted at SOUPS 2022

arXiv:2102.02760 [pdf, other]

doi 10.1063/5.0045697

Ignition and propagation of nanosecond pulsed discharges in distilled water -- negative vs. positive polarity applied to a pin electrode

Authors: K. Grosse, M. Falke, A. von Keudell

Abstract: Nanosecond plasmas in liquids are being used for water treatment, electrolysis or biomedical applications. The exact nature of these very dynamic plasmas and, most importantly, their ignition physics are strongly debated. The ignition itself may be explained by two competing hypotheses: (i) ignition via field effects or (ii) via electron multiplication in nanovoids. Both hypotheses are supported by theory, but experimental data are very sparse due to the difficulty of monitoring the very fast processes in space and time. In this paper, we use fast camera measurements and fast emission spectroscopy of nanosecond plasmas in water, applying a positive and a negative polarity to a sharp tungsten electrode. It is shown that plasma ignition is dominated by field effects at the electrode-liquid interface, either as field ionization for positive polarity or as field emission for negative polarity. This leads to a hot tungsten surface at a temperature of 7000 K for positive polarity, whereas the surface temperature is much lower for the negative polarity. At ignition, the electron density reaches $4 \cdot 10^{25}$ m$^{-3}$ for positive and only $2 \cdot 10^{25}$ m$^{-3}$ for the negative polarity. At the same time, the emission of the H$\alpha$ light for the positive polarity is 4 times higher than that for the negative polarity. During plasma propagation, the electron densities are almost identical, on the order of 1 to $2 \cdot 10^{25}$ m$^{-3}$, and decay after the end of the pulse over 15 ns. It is concluded that plasma propagation is governed by field effects in a low density region that is created either by nanovoids or by density fluctuation in supercritical water surrounding the electrode that is created by the pressure at the moment of plasma ignition.

Submitted 4 February, 2021; originally announced February 2021.

arXiv:2007.06993 [pdf, ps, other]

Adversarial Examples and Metrics

Authors: Nico Döttling, Kathrin Grosse, Michael Backes, Ian Molloy

Abstract: Adversarial examples are a type of attack on machine learning (ML) systems which cause misclassification of inputs. Achieving robustness against adversarial examples is crucial to apply ML in the real world. While most prior work on adversarial examples is empirical, a recent line of work establishes fundamental limitations of robust classification based on cryptographic hardness. Most positive and negative results in this field however assume that there is a fixed target metric which constrains the adversary, and we argue that this is often an unrealistic assumption. In this work we study the limitations of robust classification if the target metric is uncertain. Concretely, we construct a classification problem, which admits robust classification by a small classifier if the target metric is known at the time the model is trained, but for which robust classification is impossible for small classifiers if the target metric is chosen after the fact. In the process, we explore a novel connection between hardness of robust classification and bounded storage model cryptography.

Submitted 15 July, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

Comments: 25 pages, 1 figure, under submission, fixed typos from previous version

arXiv:2006.07014 [pdf, other]

How many winning tickets are there in one DNN?

Authors: Kathrin Grosse, Michael Backes

Abstract: The recent lottery ticket hypothesis proposes that there is one sub-network that matches the accuracy of the original network when trained in isolation. We show that instead each network contains several winning tickets, even if the initial weights are fixed. The resulting winning sub-networks are not instances of the same network under weight space symmetry, and show no overlap or correlation significantly larger than expected by chance. If randomness during training is decreased, overlaps higher than chance occur, even if the networks are trained on different tasks. We conclude that there is rather a distribution over capable sub-networks, as opposed to a single winning ticket.

Submitted 12 June, 2020; originally announced June 2020.

Comments: 17 pages, 15 figures, under submission

arXiv:2006.06721 [pdf, other]

Backdoor Smoothing: Demystifying Backdoor Attacks on Deep Neural Networks

Authors: Kathrin Grosse, Taesung Lee, Battista Biggio, Youngja Park, Michael Backes, Ian Molloy

Abstract: Backdoor attacks mislead machine-learning models to output an attacker-specified class when presented with a specific trigger at test time. These attacks require poisoning the training data to compromise the learning algorithm, e.g., by injecting poisoning samples containing the trigger into the training set, along with the desired class label. Despite the increasing number of studies on backdoor attacks and defenses, the underlying factors affecting the success of backdoor attacks, along with their impact on the learning algorithm, are not yet well understood. In this work, we aim to shed light on this issue by unveiling that backdoor attacks induce a smoother decision function around the triggered samples -- a phenomenon which we refer to as backdoor smoothing. To quantify backdoor smoothing, we define a measure that evaluates the uncertainty associated with the predictions of a classifier around the input samples. Our experiments show that smoothness increases when the trigger is added to the input samples, and that this phenomenon is more pronounced for more successful attacks. We also provide preliminary evidence that backdoor triggers are not the only smoothing-inducing patterns, but that other artificial patterns can also be detected by our approach, paving the way towards understanding the limitations of current defenses and designing novel ones.

Submitted 2 November, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

Comments: 9 pages, 7 figures, under submission

arXiv:1909.08864 [pdf, other]

Adversarial Vulnerability Bounds for Gaussian Process Classification

Authors: Michael Thomas Smith, Kathrin Grosse, Michael Backes, Mauricio A Alvarez

Abstract: Machine learning (ML) classification is increasingly used in safety-critical systems. Protecting ML classifiers from adversarial examples is crucial. We propose that the main threat is that of an attacker perturbing a confidently classified input to produce a confident misclassification. To protect against this we devise an adversarial bound (AB) for a Gaussian process classifier, that holds for the entire input domain, bounding the potential for any future adversarial method to cause such misclassification. This is a formal guarantee of robustness, not just an empirically derived result. We investigate how to configure the classifier to maximise the bound, including the use of a sparse approximation, leading to the method producing a practical, useful and provably robust classifier, which we test using a variety of datasets.

Submitted 19 September, 2019; originally announced September 2019.

Comments: 10 pages + 2 pages references + 7 pages of supplementary. 12 figures. Submitted to AAAI

arXiv:1902.03020 [pdf, ps, other]

On the security relevance of weights in deep learning

Authors: Kathrin Grosse, Thomas A. Trost, Marius Mosbach, Michael Backes, Dietrich Klakow

Abstract: Recently, a weight-based attack on stochastic gradient descent inducing overfitting has been proposed. We show that the threat is broader: A task-independent permutation on the initial weights suffices to limit the achieved accuracy to, for example, 50% on the Fashion MNIST dataset from initially more than 90%. These findings are confirmed on MNIST and CIFAR. We formally confirm that the attack succeeds with high likelihood and does not depend on the data. Empirically, weight statistics and loss appear unsuspicious, making it hard to detect the attack if the user is not aware. Our paper is thus a call for action to acknowledge the importance of the initial weights in deep learning.

Submitted 29 November, 2020; v1 submitted 8 February, 2019; originally announced February 2019.

Comments: 16 pages, 18 figures, long version of paper published at ICANN 2020

arXiv:1812.02606 [pdf, other]

The Limitations of Model Uncertainty in Adversarial Settings

Authors: Kathrin Grosse, David Pfaff, Michael Thomas Smith, Michael Backes

Abstract: Machine learning models are vulnerable to adversarial examples: minor perturbations to input samples intended to deliberately cause misclassification. While an obvious security threat, adversarial examples also yield insights about the applied model itself. We investigate adversarial examples in the context of Bayesian neural networks' (BNNs') uncertainty measures. As these measures are highly non-smooth, we use a smooth Gaussian process classifier (GPC) as substitute. We show that both confidence and uncertainty can be unsuspicious even if the output is wrong. Intriguingly, we find subtle differences in the features influencing uncertainty and confidence for most tasks.

Submitted 17 November, 2019; v1 submitted 6 December, 2018; originally announced December 2018.

Comments: Accepted to the Bayesian Deep Learning Workshop 2019 at NeurIPS. For longer version with more background, refer to previous version

arXiv:1808.00590 [pdf, other]

MLCapsule: Guarded Offline Deployment of Machine Learning as a Service

Authors: Lucjan Hanzlik, Yang Zhang, Kathrin Grosse, Ahmed Salem, Max Augustin, Michael Backes, Mario Fritz

Abstract: With the widespread use of machine learning (ML) techniques, ML as a service has become increasingly popular. In this setting, an ML model resides on a server and users can query it with their data via an API. However, if the user's input is sensitive, sending it to the server is undesirable and sometimes not even legally possible. Equally, the service provider does not want to share the model by sending it to the client, in order to protect its intellectual property and pay-per-query business model. In this paper, we propose MLCapsule, a guarded offline deployment of machine learning as a service. MLCapsule executes the model locally on the user's side and therefore the data never leaves the client. Meanwhile, MLCapsule offers the service provider the same level of control and security of its model as the commonly used server-side execution. In addition, MLCapsule is applicable to offline applications that require local execution. Beyond protecting against direct model access, we couple the secure offline deployment with defenses against advanced attacks on machine learning models such as model stealing, reverse engineering, and membership inference.

Submitted 6 February, 2019; v1 submitted 1 August, 2018; originally announced August 2018.

arXiv:1806.02032 [pdf, other]

Killing four birds with one Gaussian process: the relation between different test-time attacks

Authors: Kathrin Grosse, Michael T. Smith, Michael Backes

Abstract: In machine learning (ML) security, attacks like evasion, model stealing or membership inference are generally studied individually. Previous work has also shown a relationship between some attacks and decision function curvature of the targeted model. Consequently, we study an ML model allowing direct control over the decision surface curvature: Gaussian Process classifiers (GPCs). For evasion, we find that changing GPC's curvature to be robust against one attack algorithm boils down to enabling a different norm or attack algorithm to succeed. This is backed up by our formal analysis showing that static security guarantees are opposed to learning. Concerning intellectual property, we show formally that lazy learning does not necessarily leak all information when applied. In practice, often a seemingly secure curvature can be found. For example, we are able to secure GPC against empirical membership inference by proper configuration. In this configuration, however, the GPC's hyper-parameters are leaked, e.g. model reverse engineering succeeds. We conclude that attacks on classification should not be studied in isolation, but in relation to each other.

Submitted 29 November, 2020; v1 submitted 6 June, 2018; originally announced June 2018.

Comments: 10 pages, 8 figures, long version of paper accepted at ICPR 2020

arXiv:1711.06598

How Wrong Am I? - Studying Adversarial Examples and their Impact on Uncertainty in Gaussian Process Machine Learning Models

Authors: Kathrin Grosse, David Pfaff, Michael Thomas Smith, Michael Backes

Abstract: Machine learning models are vulnerable to Adversarial Examples: minor perturbations to input samples intended to deliberately cause misclassification. Current defenses against adversarial examples, especially for Deep Neural Networks (DNN), are primarily derived from empirical developments, and their security guarantees are often only justified retroactively. Many defenses therefore rely on hidden assumptions that are subsequently subverted by increasingly elaborate attacks. This is not surprising: deep learning notoriously lacks a comprehensive mathematical framework to provide meaningful guarantees. In this paper, we leverage Gaussian Processes to investigate adversarial examples in the framework of Bayesian inference. Across different models and datasets, we find deviating levels of uncertainty reflect the perturbation introduced to benign samples by state-of-the-art attacks, including novel white-box attacks on Gaussian Processes. Our experiments demonstrate that even unoptimized uncertainty thresholds already reject adversarial examples in many scenarios. Comment: Thresholds can be broken in a modified attack, which was done in arXiv:1812.02606 (The limitations of model uncertainty in adversarial settings).

Submitted 3 January, 2019; v1 submitted 17 November, 2017; originally announced November 2017.

Comments: Reasoning incomplete. Fixed issue in arXiv:1812.02606 (The limitations of model uncertainty in adversarial settings)

arXiv:1702.06280 [pdf, other]

On the (Statistical) Detection of Adversarial Examples

Authors: Kathrin Grosse, Praveen Manoharan, Nicolas Papernot, Michael Backes, Patrick McDaniel

Abstract: Machine Learning (ML) models are applied in a variety of tasks such as network intrusion detection or malware classification. Yet, these models are vulnerable to a class of malicious inputs known as adversarial examples. These are slightly perturbed inputs that are classified incorrectly by the ML model. The mitigation of these adversarial inputs remains an open problem. As a step towards understanding adversarial examples, we show that they are not drawn from the same distribution as the original data, and can thus be detected using statistical tests. Using this knowledge, we introduce a complementary approach to identify specific inputs that are adversarial. Specifically, we augment our ML model with an additional output, in which the model is trained to classify all adversarial inputs. We evaluate our approach on multiple adversarial example crafting methods (including the fast gradient sign and saliency map methods) with several datasets. The statistical test flags sample sets containing adversarial inputs confidently at sample sizes between 10 and 100 data points. Furthermore, our augmented model either detects adversarial examples as outliers with high accuracy (> 80%) or increases the adversary's cost - the perturbation added - by more than 150%. In this way, we show that statistical properties of adversarial examples are essential to their detection.

Submitted 17 October, 2017; v1 submitted 21 February, 2017; originally announced February 2017.

Comments: 13 pages, 4 figures, 5 tables. New version: improved writing, incorporating external feedback

arXiv:1606.04435 [pdf, other]

Adversarial Perturbations Against Deep Neural Networks for Malware Classification

Authors: Kathrin Grosse, Nicolas Papernot, Praveen Manoharan, Michael Backes, Patrick McDaniel

Abstract: Deep neural networks, like many other machine learning models, have recently been shown to lack robustness against adversarially crafted inputs. These inputs are derived from regular inputs by minor yet carefully selected perturbations that deceive machine learning models into desired misclassifications. Existing work in this emerging field was largely specific to the domain of image classification, since the high entropy of images can be conveniently manipulated without changing the images' overall visual appearance. Yet, it remains unclear how such attacks translate to more security-sensitive applications such as malware detection - which may pose significant challenges in sample generation and arguably grave consequences for failure. In this paper, we show how to construct highly effective adversarial sample crafting attacks for neural networks used as malware classifiers. The application domain of malware classification introduces additional constraints in the adversarial sample crafting problem when compared to the computer vision domain: (i) continuous, differentiable input domains are replaced by discrete, often binary inputs; and (ii) the loose condition of leaving visual appearance unchanged is replaced by requiring equivalent functional behavior. We demonstrate the feasibility of these attacks on many different instances of malware classifiers that we trained using the DREBIN Android malware data set.
We furthermore evaluate to what extent potential defensive mechanisms against adversarial crafting can be leveraged in the setting of malware classification. While feature reduction did not prove to have a positive impact, distillation and re-training on adversarially crafted samples show promising results.

Submitted 16 June, 2016; v1 submitted 14 June, 2016; originally announced June 2016.

Comments: version update: correcting typos, incorporating external feedback

arXiv:1305.4946 [pdf]

doi 10.1063/1.4803172

Direct observation of nanometer-scale Joule and Peltier effects in phase change memory devices

Authors: Kyle L. Grosse, Feng Xiong, Sungduk Hong, William P. King, Eric Pop

Abstract: We measure power dissipation in phase change memory (PCM) devices by scanning Joule expansion microscopy (SJEM) with ~50 nm spatial and 0.2 K temperature resolution. The temperature rise in the Ge2Sb2Te5 (GST) is dominated by Joule heating, but at the GST-TiW contacts it is a combination of Peltier and current crowding effects. Comparison of SJEM and electrical characterization with simulations of the PCM devices uncovers a thermopower ~350 µV/K for 25 nm thick films of face-centered cubic crystallized GST, and contact resistance ~2.0 x 10^-8 Ohm-m^2. Knowledge of such nanoscale Joule, Peltier, and current crowding effects is essential for energy-efficient design of future PCM technology.

Submitted 21 May, 2013; originally announced May 2013.

Comments: includes supplement

Journal ref: Applied Physics Letters, vol. 102, p. 193503 (2013)

Showing 1–23 of 23 results for author: Grosse, K