-
A One-Layer Decoder-Only Transformer is a Two-Layer RNN: With an Application to Certified Robustness
Authors:
Yuhao Zhang,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
This paper reveals a key insight that a one-layer decoder-only Transformer is equivalent to a two-layer Recurrent Neural Network (RNN). Building on this insight, we propose ARC-Tran, a novel approach for verifying the robustness of decoder-only Transformers against arbitrary perturbation spaces. Compared to ARC-Tran, current robustness verification techniques are limited either to specific and len…
▽ More
This paper reveals a key insight that a one-layer decoder-only Transformer is equivalent to a two-layer Recurrent Neural Network (RNN). Building on this insight, we propose ARC-Tran, a novel approach for verifying the robustness of decoder-only Transformers against arbitrary perturbation spaces. Compared to ARC-Tran, current robustness verification techniques are limited either to specific and length-preserving perturbations like word substitutions or to recursive models like LSTMs. ARC-Tran addresses these limitations by meticulously managing position encoding to prevent mismatches and by utilizing our key insight to achieve precise and scalable verification. Our evaluation shows that ARC-Tran (1) trains models more robust to arbitrary perturbation spaces than those produced by existing techniques and (2) shows high certification accuracy of the resulting models.
△ Less
Submitted 27 May, 2024;
originally announced May 2024.
-
Crowdsourcing Task Traces for Service Robotics
Authors:
David Porfirio,
Allison Sauppé,
Maya Cakmak,
Aws Albarghouthi,
Bilge Mutlu
Abstract:
Demonstration is an effective end-user development paradigm for teaching robots how to perform new tasks. In this paper, we posit that demonstration is useful not only as a teaching tool, but also as a way to understand and assist end-user developers in thinking about a task at hand. As a first step toward gaining this understanding, we constructed a lightweight web interface to crowdsource step-b…
▽ More
Demonstration is an effective end-user development paradigm for teaching robots how to perform new tasks. In this paper, we posit that demonstration is useful not only as a teaching tool, but also as a way to understand and assist end-user developers in thinking about a task at hand. As a first step toward gaining this understanding, we constructed a lightweight web interface to crowdsource step-by-step instructions of common household tasks, leveraging the imaginations and past experiences of potential end-user developers. As evidence of the utility of our interface, we deployed the interface on Amazon Mechanical Turk and collected 207 task traces that span 18 different task categories. We describe our vision for how these task traces can be operationalized as task models within end-user development tools and provide a roadmap for future work.
△ Less
Submitted 20 March, 2024;
originally announced March 2024.
-
Verified Training for Counterfactual Explanation Robustness under Data Shift
Authors:
Anna P. Meyer,
Yuhao Zhang,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
Counterfactual explanations (CEs) enhance the interpretability of machine learning models by describing what changes to an input are necessary to change its prediction to a desired class. These explanations are commonly used to guide users' actions, e.g., by describing how a user whose loan application was denied can be approved for a loan in the future. Existing approaches generate CEs by focusin…
▽ More
Counterfactual explanations (CEs) enhance the interpretability of machine learning models by describing what changes to an input are necessary to change its prediction to a desired class. These explanations are commonly used to guide users' actions, e.g., by describing how a user whose loan application was denied can be approved for a loan in the future. Existing approaches generate CEs by focusing on a single, fixed model, and do not provide any formal guarantees on the CEs' future validity. When models are updated periodically to account for data shift, if the generated CEs are not robust to the shifts, users' actions may no longer have the desired impacts on their predictions. This paper introduces VeriTraCER, an approach that jointly trains a classifier and an explainer to explicitly consider the robustness of the generated CEs to small model shifts. VeriTraCER optimizes over a carefully designed loss function that ensures the verifiable robustness of CEs to local model updates, thus providing deterministic guarantees to CE validity. Our empirical evaluation demonstrates that VeriTraCER generates CEs that (1) are verifiably robust to small model updates and (2) display competitive robustness to state-of-the-art approaches in handling empirical model updates including random initialization, leave-one-out, and distribution shifts.
△ Less
Submitted 6 March, 2024;
originally announced March 2024.
-
Compilation for Surface Code Quantum Computers
Authors:
Abtin Molavi,
Amanda Xu,
Swamit Tannu,
Aws Albarghouthi
Abstract:
Practical applications of quantum computing depend on fault-tolerant devices with error correction. Today, the most promising approach is a class of error-correcting codes called surface codes. In this paper, we study the problem of compiling quantum circuits for quantum computers implementing surface codes. The problem involves (1) map** circuit qubits to the device qubits and (2) routing execu…
▽ More
Practical applications of quantum computing depend on fault-tolerant devices with error correction. Today, the most promising approach is a class of error-correcting codes called surface codes. In this paper, we study the problem of compiling quantum circuits for quantum computers implementing surface codes. The problem involves (1) map** circuit qubits to the device qubits and (2) routing execution paths between pairs of interacting qubits. We call this the surface code map** and routing problem (SCMR).
Solving SCMR near-optimally is critical for both efficiency and correctness. An optimal solution limits the cost of a computation in terms of precious quantum resources and also minimizes the probability of incurring an undetected logical error, which increases with each additional time step.
We study SCMR from a theoretical and practical perspective. First, we prove that SCMR, as well as a constrained version of the problem, is NP-complete. Second, we present a optimal algorithm for solving SCMR that is based on a SAT encoding. Third, we present a spectrum of efficient relaxations of SCMR, for example, by exploiting greedy algorithms for solving the problem of node-disjoint paths. Finally, we implement and evaluate our algorithms on a large suite of real and synthetic circuits. Our results suggest that our relaxations are a powerful tool for compiling realistic workloads. The relaxation-based algorithms are orders of magnitude faster than the optimal algorithm (solving instances with tens of thousands of gates in minutes), while still finding high-quality solutions, achieving the theoretical lower bound on up to 55 out of 168 circuits from a diverse benchmark suite.
△ Less
Submitted 29 November, 2023;
originally announced November 2023.
-
The Dataset Multiplicity Problem: How Unreliable Data Impacts Predictions
Authors:
Anna P. Meyer,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
We introduce dataset multiplicity, a way to study how inaccuracies, uncertainty, and social bias in training datasets impact test-time predictions. The dataset multiplicity framework asks a counterfactual question of what the set of resultant models (and associated test-time predictions) would be if we could somehow access all hypothetical, unbiased versions of the dataset. We discuss how to use t…
▽ More
We introduce dataset multiplicity, a way to study how inaccuracies, uncertainty, and social bias in training datasets impact test-time predictions. The dataset multiplicity framework asks a counterfactual question of what the set of resultant models (and associated test-time predictions) would be if we could somehow access all hypothetical, unbiased versions of the dataset. We discuss how to use this framework to encapsulate various sources of uncertainty in datasets' factualness, including systemic social bias, data collection practices, and noisy labels or features. We show how to exactly analyze the impacts of dataset multiplicity for a specific model architecture and type of uncertainty: linear models with label errors. Our empirical analysis shows that real-world datasets, under reasonable assumptions, contain many test samples whose predictions are affected by dataset multiplicity. Furthermore, the choice of domain-specific dataset multiplicity definition determines what samples are affected, and whether different demographic groups are disparately impacted. Finally, we discuss implications of dataset multiplicity for machine learning practice and research, including considerations for when model outcomes should not be trusted.
△ Less
Submitted 20 April, 2023;
originally announced April 2023.
-
Sketching Robot Programs On the Fly
Authors:
David Porfirio,
Laura Stegner,
Maya Cakmak,
Allison Sauppé,
Aws Albarghouthi,
Bilge Mutlu
Abstract:
Service robots for personal use in the home and the workplace require end-user development solutions for swiftly scripting robot tasks as the need arises. Many existing solutions preserve ease, efficiency, and convenience through simple programming interfaces or by restricting task complexity. Others facilitate meticulous task design but often do so at the expense of simplicity and efficiency. The…
▽ More
Service robots for personal use in the home and the workplace require end-user development solutions for swiftly scripting robot tasks as the need arises. Many existing solutions preserve ease, efficiency, and convenience through simple programming interfaces or by restricting task complexity. Others facilitate meticulous task design but often do so at the expense of simplicity and efficiency. There is a need for robot programming solutions that reconcile the complexity of robotics with the on-the-fly goals of end-user development. In response to this need, we present a novel, multimodal, and on-the-fly development system, Tabula. Inspired by a formative design study with a prototype, Tabula leverages a combination of spoken language for specifying the core of a robot task and sketching for contextualizing the core. The result is that developers can script partial, sloppy versions of robot programs to be completed and refined by a program synthesizer. Lastly, we demonstrate our anticipated use cases of Tabula via a set of application scenarios.
△ Less
Submitted 6 February, 2023;
originally announced February 2023.
-
PECAN: A Deterministic Certified Defense Against Backdoor Attacks
Authors:
Yuhao Zhang,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
Neural networks are vulnerable to backdoor poisoning attacks, where the attackers maliciously poison the training set and insert triggers into the test input to change the prediction of the victim model. Existing defenses for backdoor attacks either provide no formal guarantees or come with expensive-to-compute and ineffective probabilistic guarantees. We present PECAN, an efficient and certified…
▽ More
Neural networks are vulnerable to backdoor poisoning attacks, where the attackers maliciously poison the training set and insert triggers into the test input to change the prediction of the victim model. Existing defenses for backdoor attacks either provide no formal guarantees or come with expensive-to-compute and ineffective probabilistic guarantees. We present PECAN, an efficient and certified approach for defending against backdoor attacks. The key insight powering PECAN is to apply off-the-shelf test-time evasion certification techniques on a set of neural networks trained on disjoint partitions of the data. We evaluate PECAN on image classification and malware detection datasets. Our results demonstrate that PECAN can (1) significantly outperform the state-of-the-art certified backdoor defense, both in defense strength and efficiency, and (2) on real back-door attacks, PECAN can reduce attack success rate by order of magnitude when compared to a range of baselines from the literature.
△ Less
Submitted 20 May, 2024; v1 submitted 27 January, 2023;
originally announced January 2023.
-
Synthesizing Quantum-Circuit Optimizers
Authors:
Amanda Xu,
Abtin Molavi,
Lauren Pick,
Swamit Tannu,
Aws Albarghouthi
Abstract:
Near-term quantum computers are expected to work in an environment where each operation is noisy, with no error correction. Therefore, quantum-circuit optimizers are applied to minimize the number of noisy operations. Today, physicists are constantly experimenting with novel devices and architectures. For every new physical substrate and for every modification of a quantum computer, we need to mod…
▽ More
Near-term quantum computers are expected to work in an environment where each operation is noisy, with no error correction. Therefore, quantum-circuit optimizers are applied to minimize the number of noisy operations. Today, physicists are constantly experimenting with novel devices and architectures. For every new physical substrate and for every modification of a quantum computer, we need to modify or rewrite major pieces of the optimizer to run successful experiments. In this paper, we present QUESO, an efficient approach for automatically synthesizing a quantum-circuit optimizer for a given quantum device. For instance, in 1.2 minutes, QUESO can synthesize an optimizer with high-probability correctness guarantees for IBM computers that significantly outperforms leading compilers, such as IBM's Qiskit and TKET, on the majority (85%) of the circuits in a diverse benchmark suite.
A number of theoretical and algorithmic insights underlie QUESO: (1) An algebraic approach for representing rewrite rules and their semantics. This facilitates reasoning about complex symbolic rewrite rules that are beyond the scope of existing techniques. (2) A fast approach for probabilistically verifying equivalence of quantum circuits by reducing the problem to a special form of polynomial identity testing. (3) A novel probabilistic data structure, called a polynomial identity filter (PIF), for efficiently synthesizing rewrite rules. (4) A beam-search-based algorithm that efficiently applies the synthesized symbolic rewrite rules to optimize quantum circuits.
△ Less
Submitted 10 May, 2023; v1 submitted 17 November, 2022;
originally announced November 2022.
-
AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels
Authors:
Nicholas Roberts,
Xintong Li,
Tzu-Heng Huang,
Dyah Adila,
Spencer Schoenberg,
Cheng-Yu Liu,
Lauren Pick,
Haotian Ma,
Aws Albarghouthi,
Frederic Sala
Abstract:
Weak supervision (WS) is a powerful method to build labeled datasets for training supervised models in the face of little-to-no labeled data. It replaces hand-labeling data with aggregating multiple noisy-but-cheap label estimates expressed by labeling functions (LFs). While it has been used successfully in many domains, weak supervision's application scope is limited by the difficulty of construc…
▽ More
Weak supervision (WS) is a powerful method to build labeled datasets for training supervised models in the face of little-to-no labeled data. It replaces hand-labeling data with aggregating multiple noisy-but-cheap label estimates expressed by labeling functions (LFs). While it has been used successfully in many domains, weak supervision's application scope is limited by the difficulty of constructing labeling functions for domains with complex or high-dimensional features. To address this, a handful of methods have proposed automating the LF design process using a small set of ground truth labels. In this work, we introduce AutoWS-Bench-101: a framework for evaluating automated WS (AutoWS) techniques in challenging WS settings -- a set of diverse application domains on which it has been previously difficult or impossible to apply traditional WS techniques. While AutoWS is a promising direction toward expanding the application-scope of WS, the emergence of powerful methods such as zero-shot foundation models reveals the need to understand how AutoWS techniques compare or cooperate with modern zero-shot or few-shot learners. This informs the central question of AutoWS-Bench-101: given an initial set of 100 labels for each task, we ask whether a practitioner should use an AutoWS method to generate additional labels or use some simpler baseline, such as zero-shot predictions from a foundation model or supervised learning. We observe that in many settings, it is necessary for AutoWS methods to incorporate signal from foundation models if they are to outperform simple few-shot baselines, and AutoWS-Bench-101 promotes future research in this direction. We conclude with a thorough ablation study of AutoWS methods.
△ Less
Submitted 24 November, 2023; v1 submitted 30 August, 2022;
originally announced August 2022.
-
Qubit Map** and Routing via MaxSAT
Authors:
Abtin Molavi,
Amanda Xu,
Martin Diges,
Lauren Pick,
Swamit Tannu,
Aws Albarghouthi
Abstract:
Near-term quantum computers will operate in a noisy environment, without error correction. A critical problem for near-term quantum computing is laying out a logical circuit onto a physical device with limited connectivity between qubits. This is known as the qubit map** and routing (QMR) problem, an intractable combinatorial problem. It is important to solve QMR as optimally as possible to redu…
▽ More
Near-term quantum computers will operate in a noisy environment, without error correction. A critical problem for near-term quantum computing is laying out a logical circuit onto a physical device with limited connectivity between qubits. This is known as the qubit map** and routing (QMR) problem, an intractable combinatorial problem. It is important to solve QMR as optimally as possible to reduce the amount of added noise, which may render a quantum computation useless. In this paper, we present a novel approach for optimally solving the QMR problem via a reduction to maximum satisfiability (MAXSAT). Additionally, we present two novel relaxation ideas that shrink the size of the MAXSAT constraints by exploiting the structure of a quantum circuit. Our thorough empirical evaluation demonstrates (1) the scalability of our approach compared to state-of-the-art optimal QMR techniques (solves more than 3x benchmarks with 40x speedup), (2) the significant cost reduction compared to state-of-the-art heuristic approaches (an average of ~5x swap reduction), and (3) the power of our proposed constraint relaxations.
△ Less
Submitted 29 August, 2022;
originally announced August 2022.
-
Certifying Data-Bias Robustness in Linear Regression
Authors:
Anna P. Meyer,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
Datasets typically contain inaccuracies due to human error and societal biases, and these inaccuracies can affect the outcomes of models trained on such datasets. We present a technique for certifying whether linear regression models are pointwise-robust to label bias in the training dataset, i.e., whether bounded perturbations to the labels of a training dataset result in models that change the p…
▽ More
Datasets typically contain inaccuracies due to human error and societal biases, and these inaccuracies can affect the outcomes of models trained on such datasets. We present a technique for certifying whether linear regression models are pointwise-robust to label bias in the training dataset, i.e., whether bounded perturbations to the labels of a training dataset result in models that change the prediction of test points. We show how to solve this problem exactly for individual test points, and provide an approximate but more scalable method that does not require advance knowledge of the test point. We extensively evaluate both techniques and find that linear models -- both regression- and classification-based -- often display high levels of bias-robustness. However, we also unearth gaps in bias-robustness, such as high levels of non-robustness for certain bias assumptions on some datasets. Overall, our approach can serve as a guide for when to trust, or question, a model's output.
△ Less
Submitted 7 June, 2022;
originally announced June 2022.
-
BagFlip: A Certified Defense against Data Poisoning
Authors:
Yuhao Zhang,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
Machine learning models are vulnerable to data-poisoning attacks, in which an attacker maliciously modifies the training set to change the prediction of a learned model. In a trigger-less attack, the attacker can modify the training set but not the test inputs, while in a backdoor attack the attacker can also modify test inputs. Existing model-agnostic defense approaches either cannot handle backd…
▽ More
Machine learning models are vulnerable to data-poisoning attacks, in which an attacker maliciously modifies the training set to change the prediction of a learned model. In a trigger-less attack, the attacker can modify the training set but not the test inputs, while in a backdoor attack the attacker can also modify test inputs. Existing model-agnostic defense approaches either cannot handle backdoor attacks or do not provide effective certificates (i.e., a proof of a defense). We present BagFlip, a model-agnostic certified approach that can effectively defend against both trigger-less and backdoor attacks. We evaluate BagFlip on image classification and malware detection datasets. BagFlip is equal to or more effective than the state-of-the-art approaches for trigger-less attacks and more effective than the state-of-the-art approaches for backdoor attacks.
△ Less
Submitted 16 October, 2022; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Certifying Robustness to Programmable Data Bias in Decision Trees
Authors:
Anna P. Meyer,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
Datasets can be biased due to societal inequities, human biases, under-representation of minorities, etc. Our goal is to certify that models produced by a learning algorithm are pointwise-robust to potential dataset biases. This is a challenging problem: it entails learning models for a large, or even infinite, number of datasets, ensuring that they all produce the same prediction. We focus on dec…
▽ More
Datasets can be biased due to societal inequities, human biases, under-representation of minorities, etc. Our goal is to certify that models produced by a learning algorithm are pointwise-robust to potential dataset biases. This is a challenging problem: it entails learning models for a large, or even infinite, number of datasets, ensuring that they all produce the same prediction. We focus on decision-tree learning due to the interpretable nature of the models. Our approach allows programmatically specifying bias models across a variety of dimensions (e.g., missing data for minorities), composing types of bias, and targeting bias towards a specific group. To certify robustness, we use a novel symbolic technique to evaluate a decision-tree learner on a large, or infinite, number of datasets, certifying that each and every dataset produces the same prediction for a specific test point. We evaluate our approach on datasets that are commonly used in the fairness literature, and demonstrate our approach's viability on a range of bias models.
△ Less
Submitted 8 October, 2021;
originally announced October 2021.
-
Introduction to Neural Network Verification
Authors:
Aws Albarghouthi
Abstract:
Deep learning has transformed the way we think of software and what it can do. But deep neural networks are fragile and their behaviors are often surprising. In many settings, we need to provide formal guarantees on the safety, security, correctness, or robustness of neural networks. This book covers foundational ideas from formal verification and their adaptation to reasoning about neural network…
▽ More
Deep learning has transformed the way we think of software and what it can do. But deep neural networks are fragile and their behaviors are often surprising. In many settings, we need to provide formal guarantees on the safety, security, correctness, or robustness of neural networks. This book covers foundational ideas from formal verification and their adaptation to reasoning about neural networks and deep learning.
△ Less
Submitted 4 October, 2021; v1 submitted 21 September, 2021;
originally announced September 2021.
-
Certified Robustness to Programmable Transformations in LSTMs
Authors:
Yuhao Zhang,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
Deep neural networks for natural language processing are fragile in the face of adversarial examples -- small input perturbations, like synonym substitution or word duplication, which cause a neural network to change its prediction. We present an approach to certifying the robustness of LSTMs (and extensions of LSTMs) and training models that can be efficiently certified. Our approach can certify…
▽ More
Deep neural networks for natural language processing are fragile in the face of adversarial examples -- small input perturbations, like synonym substitution or word duplication, which cause a neural network to change its prediction. We present an approach to certifying the robustness of LSTMs (and extensions of LSTMs) and training models that can be efficiently certified. Our approach can certify robustness to intractably large perturbation spaces defined programmatically in a language of string transformations. Our evaluation shows that (1) our approach can train models that are more robust to combinations of string transformations than those produced using existing techniques; (2) our approach can show high certification accuracy of the resulting models.
△ Less
Submitted 7 September, 2021; v1 submitted 15 February, 2021;
originally announced February 2021.
-
Learning Differentially Private Mechanisms
Authors:
Subhajit Roy,
Justin Hsu,
Aws Albarghouthi
Abstract:
Differential privacy is a formal, mathematical definition of data privacy that has gained traction in academia, industry, and government. The task of correctly constructing differentially private algorithms is non-trivial, and mistakes have been made in foundational algorithms. Currently, there is no automated support for converting an existing, non-private program into a differentially private ve…
▽ More
Differential privacy is a formal, mathematical definition of data privacy that has gained traction in academia, industry, and government. The task of correctly constructing differentially private algorithms is non-trivial, and mistakes have been made in foundational algorithms. Currently, there is no automated support for converting an existing, non-private program into a differentially private version. In this paper, we propose a technique for automatically learning an accurate and differentially private version of a given non-private program. We show how to solve this difficult program synthesis problem via a combination of techniques: carefully picking representative example inputs, reducing the problem to continuous optimization, and map** the results back to symbolic expressions. We demonstrate that our approach is able to learn foundational algorithms from the differential privacy literature and significantly outperforms natural program synthesis baselines.
△ Less
Submitted 4 January, 2021;
originally announced January 2021.
-
Interval Universal Approximation for Neural Networks
Authors:
Zi Wang,
Aws Albarghouthi,
Gautam Prakriya,
Somesh Jha
Abstract:
To verify safety and robustness of neural networks, researchers have successfully applied abstract interpretation, primarily using the interval abstract domain. In this paper, we study the theoretical power and limits of the interval domain for neural-network verification.
First, we introduce the interval universal approximation (IUA) theorem. IUA shows that neural networks not only can approxim…
▽ More
To verify safety and robustness of neural networks, researchers have successfully applied abstract interpretation, primarily using the interval abstract domain. In this paper, we study the theoretical power and limits of the interval domain for neural-network verification.
First, we introduce the interval universal approximation (IUA) theorem. IUA shows that neural networks not only can approximate any continuous function $f$ (universal approximation) as we have known for decades, but we can find a neural network, using any well-behaved activation function, whose interval bounds are an arbitrarily close approximation of the set semantics of $f$ (the result of applying $f$ to a set of inputs). We call this notion of approximation interval approximation. Our theorem generalizes the recent result of Baader et al. (2020) from ReLUs to a rich class of activation functions that we call squashable functions. Additionally, the IUA theorem implies that we can always construct provably robust neural networks under $\ell_\infty$-norm using almost any practical activation function.
Second, we study the computational complexity of constructing neural networks that are amenable to precise interval analysis. This is a crucial question, as our constructive proof of IUA is exponential in the size of the approximation domain. We boil this question down to the problem of approximating the range of a neural network with squashable activation functions. We show that the range approximation problem (RA) is a $Δ_2$-intermediate problem, which is strictly harder than $\mathsf{NP}$-complete problems, assuming $\mathsf{coNP}\not\subset \mathsf{NP}$. As a result, IUA is an inherently hard problem: No matter what abstract domain or computational tools we consider to achieve interval approximation, there is no efficient construction of such a universal approximator.
△ Less
Submitted 14 July, 2021; v1 submitted 12 July, 2020;
originally announced July 2020.
-
Backdoors in Neural Models of Source Code
Authors:
Goutham Ramakrishnan,
Aws Albarghouthi
Abstract:
Deep neural networks are vulnerable to a range of adversaries. A particularly pernicious class of vulnerabilities are backdoors, where model predictions diverge in the presence of subtle triggers in inputs. An attacker can implant a backdoor by poisoning the training data to yield a desired target prediction on triggered inputs. We study backdoors in the context of deep-learning for source code. (…
▽ More
Deep neural networks are vulnerable to a range of adversaries. A particularly pernicious class of vulnerabilities are backdoors, where model predictions diverge in the presence of subtle triggers in inputs. An attacker can implant a backdoor by poisoning the training data to yield a desired target prediction on triggered inputs. We study backdoors in the context of deep-learning for source code. (1) We define a range of backdoor classes for source-code tasks and show how to poison a dataset to install such backdoors. (2) We adapt and improve recent algorithms from robust statistics for our setting, showing that backdoors leave a spectral signature in the learned representation of source code, thus enabling detection of poisoned data. (3) We conduct a thorough evaluation on different architectures and languages, showing the ease of injecting backdoors and our ability to eliminate them.
△ Less
Submitted 11 June, 2020;
originally announced June 2020.
-
A Comparative Exploration of ML Techniques for Tuning Query Degree of Parallelism
Authors:
Zhiwei Fan,
Rathijit Sen,
Paraschos Koutris,
Aws Albarghouthi
Abstract:
There is a large body of recent work applying machine learning (ML) techniques to query optimization and query performance prediction in relational database management systems (RDBMSs). However, these works typically ignore the effect of \textit{intra-parallelism} -- a key component used to boost the performance of OLAP queries in practice -- on query performance prediction. In this paper, we take…
▽ More
There is a large body of recent work applying machine learning (ML) techniques to query optimization and query performance prediction in relational database management systems (RDBMSs). However, these works typically ignore the effect of \textit{intra-parallelism} -- a key component used to boost the performance of OLAP queries in practice -- on query performance prediction. In this paper, we take a first step towards filling this gap by studying the problem of \textit{tuning the degree of parallelism (DOP) via ML techniques} in Microsoft SQL Server, a popular commercial RDBMS that allows an individual query to execute using multiple cores.
In our study, we cast the problem of DOP tuning as a {\em regression} task, and examine how several popular ML models can help with query performance prediction in a multi-core setting. We explore the design space and perform an extensive experimental study comparing different models against a list of performance metrics, testing how well they generalize in different settings: $(i)$ to queries from the same template, $(ii)$ to queries from a new template, $(iii)$ to instances of different scale, and $(iv)$ to different instances and queries. Our experimental results show that a simple featurization of the input query plan that ignores cost model estimations can accurately predict query performance, capture the speedup trend with respect to the available parallelism, as well as help with automatically choosing an optimal per-query DOP.
△ Less
Submitted 21 May, 2020; v1 submitted 17 May, 2020;
originally announced May 2020.
-
Robustness to Programmable String Transformations via Augmented Abstract Training
Authors:
Yuhao Zhang,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
Deep neural networks for natural language processing tasks are vulnerable to adversarial input perturbations. In this paper, we present a versatile language for programmatically specifying string transformations -- e.g., insertions, deletions, substitutions, swaps, etc. -- that are relevant to the task at hand. We then present an approach to adversarially training models that are robust to such us…
▽ More
Deep neural networks for natural language processing tasks are vulnerable to adversarial input perturbations. In this paper, we present a versatile language for programmatically specifying string transformations -- e.g., insertions, deletions, substitutions, swaps, etc. -- that are relevant to the task at hand. We then present an approach to adversarially training models that are robust to such user-defined string transformations. Our approach combines the advantages of search-based techniques for adversarial training with abstraction-based techniques. Specifically, we show how to decompose a set of user-defined string transformations into two component specifications, one that benefits from search and another from abstraction. We use our technique to train models on the AG and SST2 datasets and show that the resulting models are robust to combinations of user-defined transformations mimicking spelling mistakes and other meaning-preserving transformations.
△ Less
Submitted 2 September, 2020; v1 submitted 21 February, 2020;
originally announced February 2020.
-
Semantic Robustness of Models of Source Code
Authors:
Goutham Ramakrishnan,
Jordan Henkel,
Zi Wang,
Aws Albarghouthi,
Somesh Jha,
Thomas Reps
Abstract:
Deep neural networks are vulnerable to adversarial examples - small input perturbations that result in incorrect predictions. We study this problem for models of source code, where we want the network to be robust to source-code modifications that preserve code functionality. (1) We define a powerful adversary that can employ sequences of parametric, semantics-preserving program transformations; (…
▽ More
Deep neural networks are vulnerable to adversarial examples - small input perturbations that result in incorrect predictions. We study this problem for models of source code, where we want the network to be robust to source-code modifications that preserve code functionality. (1) We define a powerful adversary that can employ sequences of parametric, semantics-preserving program transformations; (2) we show how to perform adversarial training to learn models robust to such adversaries; (3) we conduct an evaluation on different languages and architectures, demonstrating significant quantitative gains in robustness.
△ Less
Submitted 11 June, 2020; v1 submitted 7 February, 2020;
originally announced February 2020.
-
Proving Data-Poisoning Robustness in Decision Trees
Authors:
Samuel Drews,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
Machine learning models are brittle, and small changes in the training data can result in different predictions. We study the problem of proving that a prediction is robust to data poisoning, where an attacker can inject a number of malicious elements into the training set to influence the learned model. We target decision-tree models, a popular and simple class of machine learning models that und…
▽ More
Machine learning models are brittle, and small changes in the training data can result in different predictions. We study the problem of proving that a prediction is robust to data poisoning, where an attacker can inject a number of malicious elements into the training set to influence the learned model. We target decision-tree models, a popular and simple class of machine learning models that underlies many complex learning techniques. We present a sound verification technique based on abstract interpretation and implement it in a tool called Antidote. Antidote abstractly trains decision trees for an intractably large space of possible poisoned datasets. Due to the soundness of our abstraction, Antidote can produce proofs that, for a given input, the corresponding prediction would not have changed had the training set been tampered with or not. We demonstrate the effectiveness of Antidote on a number of popular datasets.
△ Less
Submitted 23 June, 2020; v1 submitted 2 December, 2019;
originally announced December 2019.
-
Synthesizing Action Sequences for Modifying Model Decisions
Authors:
Goutham Ramakrishnan,
Yun Chan Lee,
Aws Albarghouthi
Abstract:
When a model makes a consequential decision, e.g., denying someone a loan, it needs to additionally generate actionable, realistic feedback on what the person can do to favorably change the decision. We cast this problem through the lens of program synthesis, in which our goal is to synthesize an optimal (realistically cheapest or simplest) sequence of actions that if a person executes successfull…
▽ More
When a model makes a consequential decision, e.g., denying someone a loan, it needs to additionally generate actionable, realistic feedback on what the person can do to favorably change the decision. We cast this problem through the lens of program synthesis, in which our goal is to synthesize an optimal (realistically cheapest or simplest) sequence of actions that if a person executes successfully can change their classification. We present a novel and general approach that combines search-based program synthesis and test-time adversarial attacks to construct action sequences over a domain-specific set of actions. We demonstrate the effectiveness of our approach on a number of deep neural networks.
△ Less
Submitted 9 October, 2019; v1 submitted 30 September, 2019;
originally announced October 2019.
-
A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning
Authors:
Newsha Ardalani,
Urmish Thakker,
Aws Albarghouthi,
Karu Sankaralingam
Abstract:
Porting code from CPU to GPU is costly and time-consuming; Unless much time is invested in development and optimization, it is not obvious, a priori, how much speed-up is achievable or how much room is left for improvement. Knowing the potential speed-up a priori can be very useful: It can save hundreds of engineering hours, help programmers with prioritization and algorithm selection. We aim to a…
▽ More
Porting code from CPU to GPU is costly and time-consuming; Unless much time is invested in development and optimization, it is not obvious, a priori, how much speed-up is achievable or how much room is left for improvement. Knowing the potential speed-up a priori can be very useful: It can save hundreds of engineering hours, help programmers with prioritization and algorithm selection. We aim to address this problem using machine learning in a supervised setting, using solely the single-threaded source code of the program, without having to run or profile the code. We propose a static analysis-based cross-architecture performance prediction framework (Static XAPP) which relies solely on program properties collected using static analysis of the CPU source code and predicts whether the potential speed-up is above or below a given threshold. We offer preliminary results that show we can achieve 94% accuracy in binary classification, in average, across different thresholds
△ Less
Submitted 18 June, 2019;
originally announced June 2019.
-
Efficient Synthesis with Probabilistic Constraints
Authors:
Samuel Drews,
Aws Albarghouthi,
Loris D'Antoni
Abstract:
We consider the problem of synthesizing a program given a probabilistic specification of its desired behavior. Specifically, we study the recent paradigm of distribution-guided inductive synthesis (DIGITS), which iteratively calls a synthesizer on finite sample sets from a given distribution. We make theoretical and algorithmic contributions: (i) We prove the surprising result that DIGITS only req…
▽ More
We consider the problem of synthesizing a program given a probabilistic specification of its desired behavior. Specifically, we study the recent paradigm of distribution-guided inductive synthesis (DIGITS), which iteratively calls a synthesizer on finite sample sets from a given distribution. We make theoretical and algorithmic contributions: (i) We prove the surprising result that DIGITS only requires a polynomial number of synthesizer calls in the size of the sample set, despite its ostensibly exponential behavior. (ii) We present a property-directed version of DIGITS that further reduces the number of synthesizer calls, drastically improving synthesis performance on a range of benchmarks.
△ Less
Submitted 20 May, 2019;
originally announced May 2019.
-
Scaling-Up In-Memory Datalog Processing: Observations and Techniques
Authors:
Zhiwei Fan,
Jianqiao Zhu,
Zuyu Zhang,
Aws Albarghouthi,
Paraschos Koutris,
Jignesh Patel
Abstract:
Recursive query processing has experienced a recent resurgence, as a result of its use in many modern application domains, including data integration, graph analytics, security, program analysis, networking and decision making. Due to the large volumes of data being processed, several research efforts, across multiple communities, have explored how to scale up recursive queries, typically expresse…
▽ More
Recursive query processing has experienced a recent resurgence, as a result of its use in many modern application domains, including data integration, graph analytics, security, program analysis, networking and decision making. Due to the large volumes of data being processed, several research efforts, across multiple communities, have explored how to scale up recursive queries, typically expressed in Datalog. Our experience with these tools indicated that their performance does not translate across domains (e.g., a tool design for large-scale graph analytics does not exhibit the same performance on program-analysis tasks, and vice versa). As a result, we designed and implemented a general-purpose Datalog engine, called RecStep, on top of a parallel single-node relational system. In this paper, we outline the different techniques we use in RecStep, and the contribution of each technique to overall performance. We also present results from a detailed set of experiments comparing RecStep with a number of other Datalog systems using both graph analytics and program-analysis tasks, summarizing pros and cons of existing techniques based on the analysis of our observations. We show that RecStep generally outperforms the state-of-the-art parallel Datalog engines on complex and large-scale Datalog program evaluation, by a 4-6X margin. An additional insight from our work is that we show that it is possible to build a high-performance Datalog system on top of a relational engine, an idea that has been dismissed in past work in this area.
△ Less
Submitted 10 December, 2018;
originally announced December 2018.
-
Trace Abstraction Modulo Probability
Authors:
Calvin Smith,
Justin Hsu,
Aws Albarghouthi
Abstract:
We propose trace abstraction modulo probability, a proof technique for verifying high-probability accuracy guarantees of probabilistic programs. Our proofs overapproximate the set of program traces using failure automata, finite-state automata that upper bound the probability of failing to satisfy a target specification. We automate proof construction by reducing probabilistic reasoning to logical…
▽ More
We propose trace abstraction modulo probability, a proof technique for verifying high-probability accuracy guarantees of probabilistic programs. Our proofs overapproximate the set of program traces using failure automata, finite-state automata that upper bound the probability of failing to satisfy a target specification. We automate proof construction by reducing probabilistic reasoning to logical reasoning: we use program synthesis methods to select axioms for sampling instructions, and then apply Craig interpolation to prove that traces fail the target specification with only a small probability. Our method handles programs with unknown inputs, parameterized distributions, infinite state spaces, and parameterized specifications. We evaluate our technique on a range of randomized algorithms drawn from the differential privacy literature and beyond. To our knowledge, our approach is the first to automatically establish accuracy properties of these algorithms.
△ Less
Submitted 29 October, 2018;
originally announced October 2018.
-
Neural-Augmented Static Analysis of Android Communication
Authors:
**man Zhao,
Aws Albarghouthi,
Vaibhav Rastogi,
Somesh Jha,
Damien Octeau
Abstract:
We address the problem of discovering communication links between applications in the popular Android mobile operating system, an important problem for security and privacy in Android. Any scalable static analysis in this complex setting is bound to produce an excessive amount of false-positives, rendering it impractical. To improve precision, we propose to augment static analysis with a trained n…
▽ More
We address the problem of discovering communication links between applications in the popular Android mobile operating system, an important problem for security and privacy in Android. Any scalable static analysis in this complex setting is bound to produce an excessive amount of false-positives, rendering it impractical. To improve precision, we propose to augment static analysis with a trained neural-network model that estimates the probability that a communication link truly exists. We describe a neural-network architecture that encodes abstractions of communicating objects in two applications and estimates the probability with which a link indeed exists. At the heart of our architecture are type-directed encoders (TDE), a general framework for elegantly constructing encoders of a compound data type by recursively composing encoders for its constituent types. We evaluate our approach on a large corpus of Android applications, and demonstrate that it achieves very high accuracy. Further, we conduct thorough interpretability studies to understand the internals of the learned neural networks.
△ Less
Submitted 11 September, 2018;
originally announced September 2018.
-
Constraint-Based Synthesis of Coupling Proofs
Authors:
Aws Albarghouthi,
Justin Hsu
Abstract:
Proof by coupling is a classical technique for proving properties about pairs of randomized algorithms by carefully relating (or coupling) two probabilistic executions. In this paper, we show how to automatically construct such proofs for probabilistic programs. First, we present f-coupled postconditions, an abstraction describing two correlated program executions. Second, we show how properties o…
▽ More
Proof by coupling is a classical technique for proving properties about pairs of randomized algorithms by carefully relating (or coupling) two probabilistic executions. In this paper, we show how to automatically construct such proofs for probabilistic programs. First, we present f-coupled postconditions, an abstraction describing two correlated program executions. Second, we show how properties of f-coupled postconditions can imply various probabilistic properties of the original programs. Third, we demonstrate how to reduce the proof-search problem to a purely logical synthesis problem of the form $\exists f\ldotp \forall X\ldotp φ$, making probabilistic reasoning unnecessary. We develop a prototype implementation to automatically build coupling proofs for probabilistic properties, including uniformity and independence of program expressions.
△ Less
Submitted 11 April, 2018;
originally announced April 2018.
-
Synthesizing Coupling Proofs of Differential Privacy
Authors:
Aws Albarghouthi,
Justin Hsu
Abstract:
Differential privacy has emerged as a promising probabilistic formulation of privacy, generating intense interest within academia and industry. We present a push-button, automated technique for verifying $\varepsilon$-differential privacy of sophisticated randomized algorithms. We make several conceptual, algorithmic, and practical contributions: (i) Inspired by the recent advances on approximate…
▽ More
Differential privacy has emerged as a promising probabilistic formulation of privacy, generating intense interest within academia and industry. We present a push-button, automated technique for verifying $\varepsilon$-differential privacy of sophisticated randomized algorithms. We make several conceptual, algorithmic, and practical contributions: (i) Inspired by the recent advances on approximate couplings and randomness alignment, we present a new proof technique called coupling strategies, which casts differential privacy proofs as a winning strategy in a game where we have finite privacy resources to expend. (ii) To discover a winning strategy, we present a constraint-based formulation of the problem as a set of Horn modulo couplings (HMC) constraints, a novel combination of first-order Horn clauses and probabilistic constraints. (iii) We present a technique for solving HMC constraints by transforming probabilistic constraints into logical constraints with uninterpreted functions. (iv) Finally, we implement our technique in the FairSquare verifier and provide the first automated privacy proofs for a number of challenging algorithms from the differential privacy literature, including Report Noisy Max, the Exponential Mechanism, and the Sparse Vector Mechanism.
△ Less
Submitted 8 November, 2017; v1 submitted 15 September, 2017;
originally announced September 2017.
-
Quantifying Program Bias
Authors:
Aws Albarghouthi,
Loris D'Antoni,
Samuel Drews,
Aditya Nori
Abstract:
With the range and sensitivity of algorithmic decisions expanding at a break-neck speed, it is imperative that we aggressively investigate whether programs are biased. We propose a novel probabilistic program analysis technique and apply it to quantifying bias in decision-making programs. Specifically, we (i) present a sound and complete automated verification technique for proving quantitative pr…
▽ More
With the range and sensitivity of algorithmic decisions expanding at a break-neck speed, it is imperative that we aggressively investigate whether programs are biased. We propose a novel probabilistic program analysis technique and apply it to quantifying bias in decision-making programs. Specifically, we (i) present a sound and complete automated verification technique for proving quantitative properties of probabilistic programs; (ii) show that certain notions of bias, recently proposed in the fairness literature, can be phrased as quantitative correctness properties; and (iii) present FairSquare, the first verification tool for quantifying program bias, and evaluate it on a range of decision-making programs.
△ Less
Submitted 6 March, 2017; v1 submitted 17 February, 2017;
originally announced February 2017.
-
Fairness as a Program Property
Authors:
Aws Albarghouthi,
Loris D'Antoni,
Samuel Drews,
Aditya Nori
Abstract:
We explore the following question: Is a decision-making program fair, for some useful definition of fairness? First, we describe how several algorithmic fairness questions can be phrased as program verification problems. Second, we discuss an automated verification technique for proving or disproving fairness of decision-making programs with respect to a probabilistic model of the population.
We explore the following question: Is a decision-making program fair, for some useful definition of fairness? First, we describe how several algorithmic fairness questions can be phrased as program verification problems. Second, we discuss an automated verification technique for proving or disproving fairness of decision-making programs with respect to a probabilistic model of the population.
△ Less
Submitted 19 October, 2016;
originally announced October 2016.
-
Spatial Interpolants
Authors:
Aws Albarghouthi,
Josh Berdine,
Byron Cook,
Zachary Kincaid
Abstract:
We propose Splinter, a new technique for proving properties of heap-manipulating programs that marries (1) a new separation logic-based analysis for heap reasoning with (2) an interpolation-based technique for refining heap-shape invariants with data invariants. Splinter is property directed, precise, and produces counterexample traces when a property does not hold. Using the novel notion of spati…
▽ More
We propose Splinter, a new technique for proving properties of heap-manipulating programs that marries (1) a new separation logic-based analysis for heap reasoning with (2) an interpolation-based technique for refining heap-shape invariants with data invariants. Splinter is property directed, precise, and produces counterexample traces when a property does not hold. Using the novel notion of spatial interpolants modulo theories, Splinter can infer complex invariants over general recursive predicates, e.g., of the form all elements in a linked list are even or a binary tree is sorted. Furthermore, we treat interpolation as a black box, which gives us the freedom to encode data manipulation in any suitable theory for a given program (e.g., bit vectors, arrays, or linear arithmetic), so that our technique immediately benefits from any future advances in SMT solving and interpolation.
△ Less
Submitted 16 January, 2015;
originally announced January 2015.