-
Rethinking How to Evaluate Language Model Jailbreak
Authors:
Hongyu Cai,
Arjun Arunasalam,
Leo Y. Lin,
Antonio Bianchi,
Z. Berkay Celik
Abstract:
Large language models (LLMs) have become increasingly integrated with various applications. To ensure that LLMs do not generate unsafe responses, they are aligned with safeguards that specify what content is restricted. However, such alignment can be bypassed to produce prohibited content using a technique commonly referred to as jailbreak. Different systems have been proposed to perform the jailbreak automatically. These systems rely on evaluation methods to determine whether a jailbreak attempt is successful. However, our analysis reveals that current jailbreak evaluation methods have two limitations. (1) Their objectives lack clarity and do not align with the goal of identifying unsafe responses. (2) They oversimplify the jailbreak result as a binary outcome, successful or not. In this paper, we propose three metrics, safeguard violation, informativeness, and relative truthfulness, to evaluate language model jailbreak. Additionally, we demonstrate how these metrics correlate with the goal of different malicious actors. To compute these metrics, we introduce a multifaceted approach that extends the natural language generation evaluation method after preprocessing the response. We evaluate our metrics on a benchmark dataset produced from three malicious intent datasets and three jailbreak systems. The benchmark dataset is labeled by three annotators. We compare our multifaceted approach with three existing jailbreak evaluation methods. Experiments demonstrate that our multifaceted evaluation outperforms existing methods, with F1 scores improving on average by 17% compared to existing baselines. Our findings motivate the need to move away from the binary view of the jailbreak problem and incorporate a more comprehensive evaluation to ensure the safety of the language model.
Submitted 7 May, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Software Engineering for Robotics: Future Research Directions; Report from the 2023 Workshop on Software Engineering for Robotics
Authors:
Claire Le Goues,
Sebastian Elbaum,
David Anthony,
Z. Berkay Celik,
Mauricio Castillo-Effen,
Nikolaus Correll,
Pooyan Jamshidi,
Morgan Quigley,
Trenton Tabor,
Qi Zhu
Abstract:
Robots are experiencing a revolution as they permeate many aspects of our daily lives, from performing house maintenance to infrastructure inspection, from efficiently warehousing goods to autonomous vehicles, and more. This technical progress and its impact are astounding. This revolution, however, is outstripping the capabilities of existing software development processes, techniques, and tools, which largely have remained unchanged for decades. These capabilities are ill-suited to handling the challenges unique to robotics software such as dealing with a wide diversity of domains, heterogeneous hardware, programmed and learned components, complex physical environments captured and modeled with uncertainty, emergent behaviors that include human interactions, and scalability demands that span across multiple dimensions.
The need to develop software for robots that are ever more ubiquitous, autonomous, and reliant on complex adaptive components, hardware, and data motivated an NSF-sponsored community workshop on the subject of Software Engineering for Robotics, held in Detroit, Michigan in October 2023. The goal of the workshop was to bring together thought leaders across robotics and software engineering to coalesce a community, and identify key problems in the area of SE for robotics that that community should aim to solve over the next 5 years. This report serves to summarize the motivation, activities, and findings of that workshop, in particular by articulating the challenges unique to robot software, and identifying a vision for fruitful near-term research directions to tackle them.
Submitted 22 January, 2024;
originally announced January 2024.
-
Can Large Language Models Provide Security & Privacy Advice? Measuring the Ability of LLMs to Refute Misconceptions
Authors:
Yufan Chen,
Arjun Arunasalam,
Z. Berkay Celik
Abstract:
Users seek security & privacy (S&P) advice from online resources, including trusted websites and content-sharing platforms. These resources help users understand S&P technologies and tools and suggest actionable strategies. Large Language Models (LLMs) have recently emerged as trusted information sources. However, their accuracy and correctness have been called into question. Prior research has outlined the shortcomings of LLMs in answering multiple-choice questions and user ability to inadvertently circumvent model restrictions (e.g., to produce toxic content). Yet, the ability of LLMs to provide reliable S&P advice is not well-explored. In this paper, we measure their ability to refute popular S&P misconceptions that the general public holds. We first study recent academic literature to curate a dataset of over a hundred S&P-related misconceptions across six different topics. We then query two popular LLMs (Bard and ChatGPT) and develop a labeling guide to evaluate their responses to these misconceptions. To comprehensively evaluate their responses, we further apply three strategies: query each misconception multiple times, generate and query their paraphrases, and solicit source URLs of the responses. Both models demonstrate a non-negligible error rate of 21.3% on average, incorrectly supporting popular S&P misconceptions. The error rate increases to 32.6% when we repeatedly query LLMs with the same or paraphrased misconceptions. We also show that models may partially support a misconception or remain noncommittal, refusing to take a firm stance on misconceptions. Our exploration of information sources for responses revealed that LLMs are susceptible to providing invalid URLs (21.2% for Bard and 67.7% for ChatGPT) or pointing to unrelated sources (44.2% returned by Bard and 18.3% by ChatGPT).
Submitted 3 October, 2023;
originally announced October 2023.
-
User Training with Error Augmentation for Electromyogram-based Gesture Classification
Authors:
Yunus Bicer,
Niklas Smedemark-Margulies,
Basak Celik,
Elifnur Sunger,
Ryan Orendorff,
Stephanie Naufel,
Tales Imbiriba,
Deniz Erdoğmuş,
Eugene Tunik,
Mathew Yarossi
Abstract:
We designed and tested a system for real-time control of a user interface by extracting surface electromyographic (sEMG) activity from eight electrodes in a wrist-band configuration. sEMG data were streamed into a machine-learning algorithm that classified hand gestures in real-time. After an initial model calibration, participants were presented with one of three types of feedback during a human-learning stage: veridical feedback, in which predicted probabilities from the gesture classification algorithm were displayed without alteration, modified feedback, in which we applied a hidden augmentation of error to these probabilities, and no feedback. User performance was then evaluated in a series of minigames, in which subjects were required to use eight gestures to manipulate their game avatar to complete a task. Experimental results indicated that, relative to baseline, the modified feedback condition led to significantly improved accuracy and improved gesture class separation. These findings suggest that real-time feedback in a gamified user interface with manipulation of feedback may enable intuitive, rapid, and accurate task acquisition for sEMG-based gesture recognition applications.
Submitted 22 March, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Developmental Scaffolding with Large Language Models
Authors:
Batuhan Celik,
Alper Ahmetoglu,
Emre Ugur,
Erhan Oztop
Abstract:
Exploration and self-observation are key mechanisms of infant sensorimotor development. These processes are further guided by parental scaffolding, which accelerates skill and knowledge acquisition. In developmental robotics, this approach has often been adopted by having a human act as the source of scaffolding. In this study, we investigate whether Large Language Models (LLMs) can act as a scaffolding agent for a robotic system that aims to learn to predict the effects of its actions. To this end, an object manipulation setup is considered where one object can be picked and placed on top of or in the vicinity of another object. The adopted LLM is asked to guide the action selection process through algorithmically generated state descriptions and action selection alternatives in natural language. The simulation experiments that include cubes in this setup show that LLM-guided (GPT3.5-guided) learning yields significantly faster discovery of novel structures compared to random exploration. However, we observed that GPT3.5 fails to effectively guide the robot in generating structures with different affordances such as cubes and spheres. Overall, we conclude that even without fine-tuning, LLMs may serve as a moderate scaffolding agent for improving robot learning; however, they still lack affordance understanding, which limits the applicability of the current LLMs in robotic scaffolding tasks.
Submitted 22 November, 2023; v1 submitted 2 September, 2023;
originally announced September 2023.
-
Discovering Predictive Relational Object Symbols with Symbolic Attentive Layers
Authors:
Alper Ahmetoglu,
Batuhan Celik,
Erhan Oztop,
Emre Ugur
Abstract:
In this paper, we propose and realize a new deep learning architecture for discovering symbolic representations for objects and their relations based on the self-supervised continuous interaction of a manipulator robot with multiple objects in a tabletop environment. The key feature of the model is that it can handle a changing number of objects naturally and map the object-object relations into the symbolic domain explicitly. In the model, we employ a self-attention layer that computes discrete attention weights from object features, which are treated as relational symbols between objects. These relational symbols are then used to aggregate the learned object symbols and predict the effects of executed actions on each object. The result is a pipeline that allows the formation of object symbols and relational symbols from a dataset of object features, actions, and effects in an end-to-end manner. We compare the performance of our proposed architecture with state-of-the-art symbol discovery methods in a simulated tabletop environment where the robot needs to discover symbols related to the relative positions of objects to predict the observed effect successfully. Our experiments show that the proposed architecture performs better than other baselines in effect prediction while forming not only object symbols but also relational symbols. Furthermore, we analyze the learned symbols and relational patterns between objects to learn about how the model interprets the environment. Our analysis shows that the learned symbols relate to the relative positions of objects, object types, and their horizontal alignment on the table, which reflect the regularities in the environment.
Submitted 2 September, 2023;
originally announced September 2023.
-
Recursive Estimation of User Intent from Noninvasive Electroencephalography using Discriminative Models
Authors:
Niklas Smedemark-Margulies,
Basak Celik,
Tales Imbiriba,
Aziz Kocanaogullari,
Deniz Erdogmus
Abstract:
We study the problem of inferring user intent from noninvasive electroencephalography (EEG) to restore communication for people with severe speech and physical impairments (SSPI). The focus of this work is improving the estimation of posterior symbol probabilities in a typing task. At each iteration of the typing procedure, a subset of symbols is chosen for the next query based on the current probability estimate. Evidence about the user's response is collected from event-related potentials (ERP) in order to update symbol probabilities, until one symbol exceeds a predefined confidence threshold. We provide a graphical model describing this task, and derive a recursive Bayesian update rule based on a discriminative probability over label vectors for each query, which we approximate using a neural network classifier. We evaluate the proposed method in a simulated typing task and show that it outperforms previous approaches based on generative modeling.
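The query-and-update loop described in this abstract can be illustrated with a heavily simplified sketch. It is not the paper's update rule: it assumes a single hypothetical scalar `p_target` per query (a classifier's probability that the intended symbol was in the queried subset) instead of a probability over label vectors. The function names and the 0.9 threshold are illustrative assumptions.

```python
def bayes_update(prior, queried, p_target):
    """One recursive Bayesian step: weight each symbol by a likelihood, then renormalize.

    Assumption: symbols inside the queried subset get likelihood p_target,
    all other symbols get 1 - p_target.
    """
    unnormalized = {
        symbol: prob * (p_target if symbol in queried else 1.0 - p_target)
        for symbol, prob in prior.items()
    }
    total = sum(unnormalized.values())
    return {symbol: value / total for symbol, value in unnormalized.items()}


def type_symbol(prior, queries_with_evidence, threshold=0.9):
    """Apply updates until one symbol reaches the confidence threshold."""
    posterior = dict(prior)
    for queried, p_target in queries_with_evidence:
        posterior = bayes_update(posterior, queried, p_target)
        best = max(posterior, key=posterior.get)
        if posterior[best] >= threshold:
            return best, posterior
    return None, posterior  # no symbol became confident enough


# Hypothetical run: uniform prior over four symbols, three queries.
uniform = {s: 0.25 for s in "ABCD"}
evidence = [({"A", "B"}, 0.8), ({"A"}, 0.9), ({"A", "C"}, 0.9)]
decision, final_posterior = type_symbol(uniform, evidence)
```

Each posterior becomes the prior for the next query, which is what makes the estimate recursive.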
Submitted 29 October, 2022;
originally announced November 2022.
-
Online AutoML: An adaptive AutoML framework for online learning
Authors:
Bilge Celik,
Prabhant Singh,
Joaquin Vanschoren
Abstract:
Automated Machine Learning (AutoML) has been used successfully in settings where the learning task is assumed to be static. In many real-world scenarios, however, the data distribution will evolve over time, and it is yet to be shown whether AutoML techniques can effectively design online pipelines in dynamic environments. This study aims to automate pipeline design for online learning while continuously adapting to data drift. For this purpose, we design an adaptive Online Automated Machine Learning (OAML) system, searching the complete pipeline configuration space of online learners, including preprocessing algorithms and ensembling techniques. This system combines the inherent adaptation capabilities of online learners with the fast automated pipeline (re)optimization capabilities of AutoML. Focusing on optimization techniques that can adapt to evolving objectives, we evaluate asynchronous genetic programming and asynchronous successive halving to optimize these pipelines continually. We experiment on real and artificial data streams with varying types of concept drift to test the performance and adaptation capabilities of the proposed system. The results confirm the utility of OAML over popular online learning algorithms and underscore the benefits of continuous pipeline redesign in the presence of data drift.
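For readers unfamiliar with successive halving, the following is a minimal sketch of the basic synchronous variant, not the asynchronous version evaluated in the paper and not the OAML implementation. The `evaluate` callback, `min_budget`, and `eta` are illustrative assumptions.

```python
def successive_halving(candidates, evaluate, min_budget=1, eta=2):
    """Keep the top 1/eta of candidates each round while growing the budget.

    evaluate(candidate, budget) is a hypothetical callback that returns a
    score (higher is better) after training with the given budget.
    """
    survivors = list(candidates)
    budget = min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = ranked[: max(1, len(survivors) // eta)]
        budget *= eta  # survivors earn a larger budget in the next round
    return survivors[0]


# Toy usage: each "pipeline" is just a number and its score is its own value.
best = successive_halving([0.1, 0.5, 0.9, 0.3], lambda c, budget: c)
```

Weak configurations are discarded cheaply at small budgets, so most of the compute goes to the few candidates that remain.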
Submitted 7 December, 2022; v1 submitted 24 January, 2022;
originally announced January 2022.
-
A Stochastic Programming Approach to Surgery Scheduling under Parallel Processing Principle
Authors:
Batuhan Celik,
Serhat Gul,
Melih Celik
Abstract:
Parallel processing is a principle which enables simultaneous implementation of anesthesia induction and operating room (OR) turnover with the aim of improving OR utilization. In this article, we study the problem of scheduling surgeries for multiple ORs and induction rooms (IR) that function based on the parallel processing principle under uncertainty. We propose a two-stage stochastic mixed-integer programming model considering the uncertainty in induction, surgery and turnover durations. We sequence patients and set appointment times for surgeries in the first stage and assign patients to IRs at the second stage of the model. We show that an optimal myopic policy can be used for IR assignment decisions due to the special structure of the model. We minimize the expected total cost of patient waiting time, OR idle time and IR idle time in the objective function. We enhance the model formulation using bounds on variables and symmetry-breaking constraints. We implement a novel progressive hedging algorithm by proposing a penalty update method and a variable fixing mechanism. Based on real data of a large academic hospital, we compare our solution approach with several scheduling heuristics from the literature. We assess the additional benefits and costs associated with the implementation of parallel processing using near-optimal schedules. We examine how the benefits are inflated by increasing the number of IRs. Finally, we estimate the value of stochastic solution to underline the importance of considering uncertainty in durations.
Submitted 30 December, 2021;
originally announced December 2021.
-
New Metrics to Evaluate the Performance and Fairness of Personalized Federated Learning
Authors:
Siddharth Divi,
Yi-Shan Lin,
Habiba Farrukh,
Z. Berkay Celik
Abstract:
In Federated Learning (FL), the clients learn a single global model (FedAvg) through a central aggregator. In this setting, the non-IID distribution of the data across clients restricts the global FL model from delivering good performance on the local data of each client. Personalized FL aims to address this problem by finding a personalized model for each client. Recent works widely report the average personalized model accuracy on a particular data split of a dataset to evaluate the effectiveness of their methods. However, considering the multitude of personalization approaches proposed, it is critical to study the per-user personalized accuracy and the accuracy improvements among users with an equitable notion of fairness. To address these issues, we present a set of performance and fairness metrics intending to assess the quality of personalized FL methods. We apply these metrics to four recently proposed personalized FL methods, PersFL, FedPer, pFedMe, and Per-FedAvg, on three different data splits of the CIFAR-10 dataset. Our evaluations show that the personalized model with the highest average accuracy across users may not necessarily be the fairest. Our code is available at https://tinyurl.com/1hp9ywfa for public use.
Submitted 28 July, 2021;
originally announced July 2021.
-
Unifying Distillation with Personalization in Federated Learning
Authors:
Siddharth Divi,
Habiba Farrukh,
Berkay Celik
Abstract:
Federated learning (FL) is a decentralized privacy-preserving learning technique in which clients learn a joint collaborative model through a central aggregator without sharing their data. In this setting, all clients learn a single common predictor (FedAvg), which does not generalize well on each client's local data due to the statistical data heterogeneity among clients. In this paper, we address this problem with PersFL, a discrete two-stage personalized learning algorithm. In the first stage, PersFL finds the optimal teacher model of each client during the FL training phase. In the second stage, PersFL distills the useful knowledge from optimal teachers into each user's local model. The teacher model provides each client with some rich, high-level representation that a client can easily adapt to its local model, which overcomes the statistical heterogeneity present at different clients. We evaluate PersFL on CIFAR-10 and MNIST datasets using three data-splitting strategies to control the diversity between clients' data distributions. We empirically show that PersFL outperforms FedAvg and three state-of-the-art personalization methods, pFedMe, Per-FedAvg, and FedPer on majority data-splits with minimal communication cost. Further, we study the performance of PersFL on different distillation objectives, how this performance is affected by the equitable notion of fairness among clients, and the number of required communication rounds. PersFL code is available at https://tinyurl.com/hdh5zhxs for public use and validation.
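As background for the FedAvg baseline mentioned in this abstract, here is a minimal sketch of the aggregation step: the server averages client parameters weighted by local dataset size. It covers only plain FedAvg, not PersFL's teacher selection or distillation stages. Parameters are flat lists of floats for simplicity, and the function name is an illustrative assumption.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by number of local samples."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Toy usage: two clients, the second holds three times as much data.
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Because every client receives the same averaged parameters, heterogeneous local data can leave this single model poorly matched to individual clients, which is the gap personalization methods target.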
Submitted 31 May, 2021;
originally announced May 2021.
-
On the Safety Implications of Misordered Events and Commands in IoT Systems
Authors:
Furkan Goksel,
Muslum Ozgur Ozmen,
Michael Reeves,
Basavesh Shivakumar,
Z. Berkay Celik
Abstract:
IoT devices, equipped with embedded actuators and sensors, provide custom automation in the form of IoT apps. IoT apps subscribe to events and upon receipt, transmit actuation commands which trigger a set of actuators. Events and actuation commands follow paths in the IoT ecosystem such as sensor-to-edge, edge-to-cloud, and cloud-to-actuator, with different network and processing delays between these connections. Significant delays may occur especially when an IoT system cloud interacts with other clouds. Due to this variation in delays, the cloud may receive events in an incorrect order, and in turn, devices may receive and actuate misordered commands. In this paper, we first study eight major IoT platforms and show that they do not make strong guarantees on event orderings to address these issues. We then analyze the end-to-end interactions among IoT components, from the creation of an event to the invocation of a command. From this, we identify and formalize the root causes of misorderings in events and commands leading to undesired states. We deploy 23 apps in a simulated smart home containing 35 IoT devices to evaluate the misordering problem. Our experiments demonstrate a high number of misordered events and commands that occur through different interaction paths. Through this effort, we reveal the root and extent of the misordering problem and guide future work to ensure correct ordering in IoT systems.
Submitted 3 May, 2021;
originally announced May 2021.
-
S3: Side-Channel Attack on Stylus Pencil through Sensors
Authors:
Habiba Farrukh,
Tinghan Yang,
Hanwen Xu,
Yuxuan Yin,
He Wang,
Z. Berkay Celik
Abstract:
With smart devices being an essential part of our everyday lives, unsupervised access to the mobile sensors' data can result in a multitude of side-channel attacks. In this paper, we study potential data leaks from Apple Pencil (2nd generation) supported by the Apple iPad Pro, the latest stylus pen which attaches to the iPad body magnetically for charging. We observe that the Pencil's body affects the magnetic readings sensed by the iPad's magnetometer when a user is using the Pencil. Therefore, we ask: Can we infer what a user is writing on the iPad screen with the Apple Pencil, given access to only the iPad's motion sensors' data? To answer this question, we present Side-channel attack on Stylus pencil through Sensors (S3), a system that identifies what a user is writing from motion sensor readings. We first use the sharp fluctuations in the motion sensors' data to determine when a user is writing on the iPad. We then introduce a high-dimensional particle filter to track the location and orientation of the Pencil during usage. Lastly, to guide particles, we build the Pencil's magnetic map serving as a bridge between the measured magnetic data and the Pencil location and orientation. We evaluate S3 with 10 subjects and demonstrate that we correctly identify 93.9%, 96%, 97.9%, and 93.33% of the letters, numbers, shapes, and words by only having access to the motion sensors' data.
Submitted 9 March, 2021;
originally announced March 2021.
-
Discovering IoT Physical Channel Vulnerabilities
Authors:
Muslum Ozgur Ozmen,
Xuansong Li,
Andrew Chu,
Z. Berkay Celik,
Bardh Hoxha,
Xiangyu Zhang
Abstract:
Smart homes contain diverse sensors and actuators controlled by IoT apps that provide custom automation. Prior works showed that an adversary could exploit physical interaction vulnerabilities among apps and put the users and environment at risk, e.g., to break into a house, an adversary turns on the heater to trigger an app that opens windows when the temperature exceeds a threshold. Currently, the safe behavior of physical interactions relies on either app code analysis or dynamic analysis of device states with manually derived policies by developers. However, existing works fail to achieve sufficient breadth and fidelity to translate the app code into their physical behavior or provide incomplete security policies, causing poor accuracy and false alarms. In this paper, we introduce a new approach, IoTSeer, which efficiently combines app code analysis and dynamic analysis with new security policies to discover physical interaction vulnerabilities. IoTSeer works by first translating sensor events and actuator commands of each app into a physical execution model (PeM) and unifying PeMs to express composite physical execution of apps (CPeM). CPeM allows us to deploy IoTSeer in different smart homes by defining its execution parameters with minimal data collection. IoTSeer supports new security policies with intended/unintended physical channel labels. It then efficiently checks them on the CPeM via falsification, which addresses the undecidability of verification due to the continuous and discrete behavior of IoT devices. We evaluate IoTSeer in an actual house with 14 actuators, six sensors, and 39 apps. IoTSeer discovers 16 unique policy violations, whereas prior works identify only 2 out of 16 with 18 falsely flagged violations. IoTSeer only requires 30 mins of data collection for each actuator to set the CPeM parameters and is adaptive to newly added, removed, and relocated devices.
Submitted 7 September, 2022; v1 submitted 2 February, 2021;
originally announced February 2021.
-
What Do You See? Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors
Authors:
Yi-Shan Lin,
Wen-Chuan Lee,
Z. Berkay Celik
Abstract:
EXplainable AI (XAI) methods have been proposed to interpret how a deep neural network predicts inputs through model saliency explanations that highlight the parts of the inputs deemed important to arrive at a decision for a specific target. However, it remains challenging to quantify the correctness of their interpretability as current evaluation approaches either require subjective input from humans or incur high computation cost with automated evaluation. In this paper, we propose backdoor trigger patterns--hidden malicious functionalities that cause misclassification--to automate the evaluation of saliency explanations. Our key observation is that triggers provide ground truth for inputs to evaluate whether the regions identified by an XAI method are truly relevant to its output. Since backdoor triggers are the most important features that cause deliberate misclassification, a robust XAI method should reveal their presence at inference time. We introduce three complementary metrics for systematic evaluation of explanations that an XAI method generates and evaluate seven state-of-the-art model-free and model-specific posthoc methods through 36 models trojaned with specifically crafted triggers using color, shape, texture, location, and size. We find that six methods that use local explanation and feature relevance fail to completely highlight trigger regions, and only a model-free approach can uncover the entire trigger region.
Submitted 22 September, 2020;
originally announced September 2020.
-
On the Feasibility of Exploiting Traffic Collision Avoidance System Vulnerabilities
Authors:
Paul M. Berges,
Basavesh Ammanaghatta Shivakumar,
Timothy Graziano,
Ryan Gerdes,
Z. Berkay Celik
Abstract:
Traffic Collision Avoidance Systems (TCAS) are safety-critical systems required on most commercial aircraft in service today. However, TCAS was not designed to account for malicious actors. While in the past it may have been infeasible for an attacker to craft radio signals to mimic TCAS signals, attackers today have access to open-source digital signal processing software, like GNU Radio, and inexpensive software defined radios (SDR) that enable the transmission of spurious TCAS messages. In this paper, methods, both qualitative and quantitative, for analyzing TCAS from an adversarial perspective are presented. To demonstrate the feasibility of inducing near mid-air collisions between current-day TCAS-equipped aircraft, an experimental Phantom Aircraft generator is developed using GNU Radio and an SDR against a realistic threat model.
Submitted 25 June, 2020;
originally announced June 2020.
-
Adaptation Strategies for Automated Machine Learning on Evolving Data
Authors:
Bilge Celik,
Joaquin Vanschoren
Abstract:
Automated Machine Learning (AutoML) systems have been shown to efficiently build good models for new datasets. However, it is often not clear how well they can adapt when the data evolves over time. The main goal of this study is to understand the effect of data stream challenges such as concept drift on the performance of AutoML methods, and which adaptation strategies can be employed to make them more robust. To that end, we propose 6 concept drift adaptation strategies and evaluate their effectiveness on different AutoML approaches. We do this for a variety of AutoML approaches for building machine learning pipelines, including those that leverage Bayesian optimization, genetic programming, and random search with automated stacking. These are evaluated empirically on real-world and synthetic data streams with different types of concept drift. Based on this analysis, we propose ways to develop more sophisticated and robust AutoML techniques.
Submitted 10 May, 2022; v1 submitted 9 June, 2020;
originally announced June 2020.
-
IoTRepair: Systematically Addressing Device Faults in Commodity IoT (Extended Paper)
Authors:
Michael Norris,
Berkay Celik,
Patrick McDaniel,
Gang Tan,
Prasanna Venkatesh,
Shulin Zhao,
Anand Sivasubramaniam
Abstract:
IoT devices are decentralized and deployed in unstable environments, which causes them to be prone to various kinds of faults, such as device failure and network disruption. Yet, current IoT platforms require programmers to handle faults manually, a complex and error-prone task. In this paper, we present IoTRepair, a fault-handling system for IoT that (1) integrates a fault identification module to track faulty devices, (2) provides a library of fault-handling functions for effectively handling different fault types, and (3) provides a fault handler on top of the library for autonomous IoT fault handling, with user and developer configuration as input. Through an evaluation in a simulated lab environment and with various fault injection methods, IoTRepair is compared with current fault-handling solutions. The fault handler reduces the incorrect states by 50.01% on average, which corresponds to fewer unsafe and insecure device states. Overall, through a systematic design of an IoT fault handler, we provide users flexibility and convenience in handling complex IoT faults, allowing safer IoT environments.
Submitted 17 February, 2020;
originally announced February 2020.
-
Real-time Analysis of Privacy-(un)aware IoT Applications
Authors:
Leonardo Babun,
Z. Berkay Celik,
Patrick McDaniel,
A. Selcuk Uluagac
Abstract:
Users trust IoT apps to control and automate their smart devices. These apps necessarily have access to sensitive data to implement their functionality. However, users lack visibility into how their sensitive data is used (or leaked), and they often blindly trust the app developers. In this paper, we present IoTWatcH, a novel dynamic analysis tool that uncovers the privacy risks of IoT apps in real-time. We designed and built IoTWatcH based on an IoT privacy survey that considers the privacy needs of IoT users. IoTWatcH provides users with a simple interface to specify their privacy preferences with an IoT app. Then, at runtime, it analyzes both the data that is sent out of the IoT app and its recipients using Natural Language Processing (NLP) techniques. Moreover, IoTWatcH informs users of its findings to make them aware of the privacy risks with the IoT app. We implemented IoTWatcH on real IoT applications. Specifically, we analyzed 540 IoT apps to train the NLP model and evaluate its effectiveness. IoTWatcH successfully classifies IoT app data sent to external parties to correct privacy labels with an average accuracy of 94.25%, and flags IoT apps that leak privacy data to unauthorized parties. Finally, IoTWatcH yields minimal overhead to an IoT app's execution, on average 105 ms additional latency.
Submitted 24 November, 2019;
originally announced November 2019.
-
KRATOS: Multi-User Multi-Device-Aware Access Control System for the Smart Home
Authors:
Amit Kumar Sikder,
Leonardo Babun,
Z. Berkay Celik,
Abbas Acar,
Hidayet Aksu,
Patrick McDaniel,
Engin Kirda,
A. Selcuk Uluagac
Abstract:
In a smart home system, multiple users have access to multiple devices, typically through a dedicated app installed on a mobile device. Traditional access control mechanisms consider one unique trusted user that controls the access to the devices. However, multi-user multi-device smart home settings pose fundamentally different challenges to traditional single-user systems. For instance, in a multi-user environment, users have conflicting, complex, and dynamically changing demands on multiple devices, which cannot be handled by traditional access control techniques. To address these challenges, in this paper, we introduce Kratos, a novel multi-user and multi-device-aware access control mechanism that allows smart home users to flexibly specify their access control demands. Kratos has three main components: user interaction module, backend server, and policy manager. Users can specify their desired access control settings using the interaction module, which are translated into access control policies in the backend server. The policy manager analyzes these policies and initiates negotiation between users to resolve conflicting demands and generates final policies. We implemented Kratos and evaluated its performance on real smart home deployments featuring multi-user scenarios with a rich set of configurations (309 different policies including 213 demand conflicts and 24 restriction policies). These configurations included five different threats associated with access control mechanisms. Our extensive evaluations show that Kratos is very effective in resolving conflicting access control demands with minimal overhead and is robust against different attacks.
Submitted 2 June, 2020; v1 submitted 22 November, 2019;
originally announced November 2019.
-
Program Analysis of Commodity IoT Applications for Security and Privacy: Challenges and Opportunities
Authors:
Z. Berkay Celik,
Earlence Fernandes,
Eric Pauley,
Gang Tan,
Patrick McDaniel
Abstract:
Recent advances in Internet of Things (IoT) have enabled myriad domains such as smart homes, personal monitoring devices, and enhanced manufacturing. IoT is now pervasive---new applications are being used in nearly every conceivable environment, which leads to the adoption of device-based interaction and automation. However, IoT has also raised issues about the security and privacy of these digitally augmented spaces. Program analysis is crucial in identifying those issues, yet the application and scope of program analysis in IoT remains largely unexplored by the technical community. In this paper, we study privacy and security issues in IoT that require program-analysis techniques with an emphasis on identified attacks against these systems and defenses implemented so far. Based on a study of five IoT programming platforms, we identify the key insights that result from research efforts in both the program analysis and security communities and relate the efficacy of program-analysis techniques to security and privacy issues. We conclude by studying recent IoT analysis systems and exploring their implementations. Through these explorations, we highlight key challenges and opportunities in calibrating for the environments in which IoT systems will be used.
Submitted 24 December, 2018; v1 submitted 18 September, 2018;
originally announced September 2018.
-
Soteria: Automated IoT Safety and Security Analysis
Authors:
Z. Berkay Celik,
Patrick McDaniel,
Gang Tan
Abstract:
Broadly defined as the Internet of Things (IoT), the growth of commodity devices that integrate physical processes with digital systems has changed the way we live, play and work. Yet existing IoT platforms cannot evaluate whether an IoT app or environment is safe, secure, and operates correctly. In this paper, we present Soteria, a static analysis system for validating whether an IoT app or IoT environment (collection of apps working in concert) adheres to identified safety, security, and functional properties. Soteria operates in three phases: (a) translation of platform-specific IoT source code into an intermediate representation (IR), (b) extracting a state model from the IR, (c) applying model checking to verify desired properties. We evaluate Soteria on 65 SmartThings market apps through 35 properties and find that nine (14%) individual apps violate ten (29%) properties. Further, our study of combined app environments uncovered eleven property violations not exhibited in the isolated apps. Lastly, we demonstrate Soteria on MalIoT, a novel open-source test suite containing 17 apps with 20 unique violations.
Submitted 22 May, 2018;
originally announced May 2018.
-
Sensitive Information Tracking in Commodity IoT
Authors:
Z. Berkay Celik,
Leonardo Babun,
Amit K. Sikder,
Hidayet Aksu,
Gang Tan,
Patrick McDaniel,
A. Selcuk Uluagac
Abstract:
Broadly defined as the Internet of Things (IoT), the growth of commodity devices that integrate physical processes with digital connectivity has had profound effects on society--smart homes, personal monitoring devices, enhanced manufacturing and other IoT apps have changed the way we live, play, and work. Yet extant IoT platforms provide few means of evaluating the use (and potential avenues for misuse) of sensitive information. Thus, consumers and organizations have little information to assess the security and privacy risks these devices present. In this paper, we present SainT, a static taint analysis tool for IoT applications. SainT operates in three phases: (a) translation of platform-specific IoT source code into an intermediate representation (IR), (b) identifying sensitive sources and sinks, and (c) performing static analysis to identify sensitive data flows. We evaluate SainT on 230 SmartThings market apps and find that 138 (60%) include sensitive data flows. In addition, we demonstrate SainT on IoTBench, a novel open-source test suite containing 19 apps with 27 unique data leaks. Through this effort, we introduce a rigorously grounded framework for evaluating the use of sensitive information in IoT apps---and therein provide developers, markets, and consumers a means of identifying potential threats to security and privacy.
Submitted 22 February, 2018;
originally announced February 2018.
-
Determining Positive Cancer Rescue Mutations in p53 Based Cancers by using Artificial Intelligence
Authors:
Kaan Aygen,
Berkay Celik,
Umut Eser
Abstract:
A mutation in a protein-coding gene in DNA can alter the protein structure coded by the same gene. Structurally altered proteins usually lose their functions and sometimes gain an undesirable function instead. These types of mutations and their effects can result in genetic diseases or antibiotic-resistant bacteria, among other health issues. Important curing methods have been developed for detecting mutations against AIDS as well as genetic diseases. Another example is the influenza virus. The reasons why a vaccination developed to fight against influenza does not work the following year are (a) the mutation of its DNA and (b) the outbreak of the virus after it has mutated, especially if it is a virus that escaped the vaccination's target. For these reasons, it is highly important to know in advance the location of a potential mutation in a protein as well as the problems it might cause the medical sciences. In this study, we have used artificial neural networks, which are one of the latest artificial intelligence technologies, to determine the effects of cancer mutations. The model we developed has given more successful results compared to other methods. We foresee that our model will bring a new dimension to medical research and the medical industry.
Submitted 27 August, 2017;
originally announced August 2017.
-
Achieving Secure and Differentially Private Computations in Multiparty Settings
Authors:
Abbas Acar,
Z. Berkay Celik,
Hidayet Aksu,
A. Selcuk Uluagac,
Patrick McDaniel
Abstract:
Sharing and working on sensitive data in distributed settings from healthcare to finance is a major challenge due to security and privacy concerns. Secure multiparty computation (SMC) is a viable panacea for this, allowing distributed parties to perform computations while learning nothing about each other's data except the final result. Although SMC is instrumental in such distributed settings, it provides no guarantee against leaking information about individuals to adversaries. Differential privacy (DP) can be utilized to address this; however, achieving SMC with DP is not a trivial task, either. In this paper, we propose a novel Secure Multiparty Distributed Differentially Private (SM-DDP) protocol to achieve secure and private computations in a multiparty environment. Specifically, with our protocol, we simultaneously achieve SMC and DP in distributed settings focusing on linear regression on horizontally distributed data. That is, parties do not see each other's data and, further, cannot infer information about individuals from the final constructed statistical model. Any statistical model function that allows independent calculation of local statistics can be computed through our protocol. The protocol implements homomorphic encryption for SMC and functional mechanism for DP to achieve the desired security and privacy guarantees. In this work, we first introduce the theoretical foundation for the SM-DDP protocol and then evaluate its efficacy and performance on two different datasets. Our results show that one can achieve individual-level privacy through the proposed protocol with distributed DP, which is independently applied by each party in a distributed fashion. Moreover, our results also show that the SM-DDP protocol incurs minimal computational overhead, is scalable, and provides security and privacy guarantees.
Submitted 6 July, 2017;
originally announced July 2017.
-
Curie: Policy-based Secure Data Exchange
Authors:
Z. Berkay Celik,
Hidayet Aksu,
Abbas Acar,
Ryan Sheatsley,
A. Selcuk Uluagac,
Patrick McDaniel
Abstract:
Data sharing among partners---users, organizations, companies---is crucial for the advancement of data analytics in many domains. Sharing through secure computation and differential privacy allows these partners to perform private computations on their sensitive data in controlled ways. However, in reality, there exist complex relationships among members. Politics, regulations, interest, trust, and data demands and needs are among the many reasons. Thus, there is a need for a mechanism to meet these conflicting relationships on data sharing. This paper presents Curie, an approach to exchange data among members whose membership has complex relationships. The CPL policy language that allows members to define the specifications of data exchange requirements is introduced. Members (partners) assert who and what to exchange through their local policies and negotiate a global sharing agreement. The agreement is implemented in a multi-party computation that guarantees sharing among members will comply with the policy as negotiated. The use of Curie is validated through an example of a health care application built on recently introduced secure multi-party computation and differential privacy frameworks, and policy and performance trade-offs are explored.
Submitted 9 February, 2019; v1 submitted 27 February, 2017;
originally announced February 2017.
-
Patient-Driven Privacy Control through Generalized Distillation
Authors:
Z. Berkay Celik,
David Lopez-Paz,
Patrick McDaniel
Abstract:
The introduction of data analytics into medicine has changed the nature of patient treatment. In this, patients are asked to disclose personal information such as genetic markers, lifestyle habits, and clinical history. This data is then used by statistical models to predict personalized treatments. However, due to privacy concerns, patients often desire to withhold sensitive information. This self-censorship can impede proper diagnosis and treatment, which may lead to serious health complications and even death over time. In this paper, we present privacy distillation, a mechanism which allows patients to control the type and amount of information they wish to disclose to the healthcare providers for use in statistical models. Meanwhile, it retains the accuracy of models that have access to all patient data under a sufficient but not full set of privacy-relevant information. We validate privacy distillation using a corpus of patients prescribed warfarin for a personalized dosage. We use a deep neural network to implement privacy distillation for training and making dose predictions. We find that privacy distillation with sufficient privacy-relevant information i) retains accuracy almost as good as having all patient data (only 3% worse), and ii) is effective at preventing errors that introduce health-related risks (only 3.9% worse under- or over-prescriptions).
Submitted 13 October, 2017; v1 submitted 25 November, 2016;
originally announced November 2016.
-
Detection under Privileged Information
Authors:
Z. Berkay Celik,
Patrick McDaniel,
Rauf Izmailov,
Nicolas Papernot,
Ryan Sheatsley,
Raquel Alvarez,
Ananthram Swami
Abstract:
For well over a quarter century, detection systems have been driven by models learned from input features collected from real or simulated environments. An artifact (e.g., network event, potential malware sample, suspicious email) is deemed malicious or non-malicious based on its similarity to the learned model at runtime. However, the training of the models has been historically limited to only those features available at runtime. In this paper, we consider an alternate learning approach that trains models using "privileged" information--features available at training time but not at runtime--to improve the accuracy and resilience of detection systems. In particular, we adapt and extend recent advances in knowledge transfer, model influence, and distillation to enable the use of forensic or other data unavailable at runtime in a range of security domains. An empirical evaluation shows that privileged information increases precision and recall over a system with no privileged information: we observe up to 7.7% relative decrease in detection error for fast-flux bot detection, 8.6% for malware traffic detection, 7.3% for malware classification, and 16.9% for face recognition. We explore the limitations and applications of different privileged information techniques in detection systems. Such techniques provide a new means for detection systems to learn from data that would otherwise not be available at runtime.
Submitted 30 March, 2018; v1 submitted 31 March, 2016;
originally announced March 2016.
-
Practical Black-Box Attacks against Machine Learning
Authors:
Nicolas Papernot,
Patrick McDaniel,
Ian Goodfellow,
Somesh Jha,
Z. Berkay Celik,
Ananthram Swami
Abstract:
Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs, while appearing unmodified to human observers. Potential attacks include having malicious content like malware identified as legitimate or controlling vehicle behavior. Yet, all existing adversarial example attacks require knowledge of either the model internals or its training data. We introduce the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge. Indeed, the only capability of our black-box adversary is to observe labels given by the DNN to chosen inputs. Our attack strategy consists in training a local model to substitute for the target DNN, using inputs synthetically generated by an adversary and labeled by the target DNN. We use the local substitute to craft adversarial examples, and find that they are misclassified by the targeted DNN. To perform a real-world and properly-blinded evaluation, we attack a DNN hosted by MetaMind, an online deep learning API. We find that their DNN misclassifies 84.24% of the adversarial examples crafted with our substitute. We demonstrate the general applicability of our strategy to many ML techniques by conducting the same attack against models hosted by Amazon and Google, using logistic regression substitutes. They yield adversarial examples misclassified by Amazon and Google at rates of 96.19% and 88.94%. We also find that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
Submitted 19 March, 2017; v1 submitted 8 February, 2016;
originally announced February 2016.
-
The Limitations of Deep Learning in Adversarial Settings
Authors:
Nicolas Papernot,
Patrick McDaniel,
Somesh Jha,
Matt Fredrikson,
Z. Berkay Celik,
Ananthram Swami
Abstract:
Deep learning takes advantage of large datasets and computationally efficient training algorithms to outperform other approaches at various machine learning tasks. However, imperfections in the training phase of deep neural networks make them vulnerable to adversarial samples: inputs crafted by adversaries with the intent of causing deep neural networks to misclassify. In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs. In an application to computer vision, we show that our algorithms can reliably produce samples correctly classified by human subjects but misclassified in specific targets by a DNN with a 97% adversarial success rate while only modifying on average 4.02% of the input features per sample. We then evaluate the vulnerability of different sample classes to adversarial perturbations by defining a hardness measure. Finally, we describe preliminary work outlining defenses against adversarial samples by defining a predictive measure of distance between a benign input and a target classification.
Submitted 23 November, 2015;
originally announced November 2015.