Search | arXiv e-print repository

Is Watermarking LLM-Generated Code Robust?

Authors: Tarun Suresh, Shubham Ugare, Gagandeep Singh, Sasa Misailovic

Abstract: We present the first study of the robustness of existing watermarking techniques on Python code generated by large language models. Although existing works showed that watermarking can be robust for natural language, we show that it is easy to remove these watermarks on code by semantic-preserving transformations. We present the first study of the robustness of existing watermarking techniques on Python code generated by large language models. Although existing works showed that watermarking can be robust for natural language, we show that it is easy to remove these watermarks on code by semantic-preserving transformations. △ Less

Submitted 28 June, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

arXiv:2403.13989 [pdf, other]

FastFlip: Compositional Error Injection Analysis

Authors: Keyur Joshi, Rahul Singh, Tommaso Bassetto, Sarita Adve, Darko Marinov, Sasa Misailovic

Abstract: Instruction-level error injection analyses aim to find instructions where errors often lead to unacceptable outcomes like Silent Data Corruptions (SDCs). These analyses require significant time, which is especially problematic if developers wish to regularly analyze software that evolves over time. We present FastFlip, a combination of empirical error injection and symbolic SDC propagation analy… ▽ More Instruction-level error injection analyses aim to find instructions where errors often lead to unacceptable outcomes like Silent Data Corruptions (SDCs). These analyses require significant time, which is especially problematic if developers wish to regularly analyze software that evolves over time. We present FastFlip, a combination of empirical error injection and symbolic SDC propagation analyses that enables fast, compositional error injection analysis of evolving programs. FastFlip calculates how SDCs propagate across program sections and correctly accounts for unexpected side effects that can occur due to errors. Using FastFlip, we analyze five benchmarks, plus two modified versions of each benchmark. FastFlip speeds up the analysis of incrementally modified programs by $3.2\times$ (geomean). FastFlip selects a set of instructions to protect against SDCs that minimizes the runtime cost of protection while protecting against a developer-specified target fraction of all SDC-causing errors. △ Less

Submitted 26 March, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

arXiv:2403.01632 [pdf, other]

SynCode: LLM Generation with Grammar Augmentation

Authors: Shubham Ugare, Tarun Suresh, Hangoo Kang, Sasa Misailovic, Gagandeep Singh

Abstract: LLMs are widely used in complex AI applications. These applications underscore the need for LLM outputs to adhere to a specific format, for their integration with other components in the systems. Typically the format rules e.g., for data serialization formats such as JSON, YAML, or Code in Programming Language are expressed as context-free grammar (CFG). Due to the hallucinations and unreliability… ▽ More LLMs are widely used in complex AI applications. These applications underscore the need for LLM outputs to adhere to a specific format, for their integration with other components in the systems. Typically the format rules e.g., for data serialization formats such as JSON, YAML, or Code in Programming Language are expressed as context-free grammar (CFG). Due to the hallucinations and unreliability of LLMs, instructing LLMs to adhere to specified syntax becomes an increasingly important challenge. We present SynCode, a novel framework for efficient and general syntactical decoding with LLMs, to address this challenge. SynCode leverages the CFG of a formal language, utilizing an offline-constructed efficient lookup table called DFA mask store based on the discrete finite automaton (DFA) of the language grammar terminals. We demonstrate SynCode's soundness and completeness given the CFG of the formal language, presenting its ability to retain syntactically valid tokens while rejecting invalid ones. SynCode seamlessly integrates with any language defined by CFG, as evidenced by experiments focusing on generating JSON, Python, and Go outputs. Our experiments evaluating the effectiveness of SynCode for JSON generation demonstrate that SynCode eliminates all syntax errors and significantly outperforms state-of-the-art baselines. Furthermore, our results underscore how SynCode significantly reduces 96.07% of syntax errors in generated Python and Go code, showcasing its substantial impact on enhancing syntactical precision in LLM generation. Our code is available at https://github.com/uiuc-focal-lab/syncode △ Less

Submitted 29 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

arXiv:2305.19521 [pdf, other]

Incremental Randomized Smoothing Certification

Authors: Shubham Ugare, Tarun Suresh, Debangshu Banerjee, Gagandeep Singh, Sasa Misailovic

Abstract: Randomized smoothing-based certification is an effective approach for obtaining robustness certificates of deep neural networks (DNNs) against adversarial attacks. This method constructs a smoothed DNN model and certifies its robustness through statistical sampling, but it is computationally expensive, especially when certifying with a large number of samples. Furthermore, when the smoothed model… ▽ More Randomized smoothing-based certification is an effective approach for obtaining robustness certificates of deep neural networks (DNNs) against adversarial attacks. This method constructs a smoothed DNN model and certifies its robustness through statistical sampling, but it is computationally expensive, especially when certifying with a large number of samples. Furthermore, when the smoothed model is modified (e.g., quantized or pruned), certification guarantees may not hold for the modified DNN, and recertifying from scratch can be prohibitively expensive. We present the first approach for incremental robustness certification for randomized smoothing, IRS. We show how to reuse the certification guarantees for the original smoothed model to certify an approximated model with very few samples. IRS significantly reduces the computational cost of certifying modified DNNs while maintaining strong robustness guarantees. We experimentally demonstrate the effectiveness of our approach, showing up to 3x certification speedup over the certification that applies randomized smoothing of the approximate model from scratch. △ Less

Submitted 10 April, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

Comments: ICLR 2024

arXiv:2304.01874 [pdf, other]

doi 10.1145/3591299

Incremental Verification of Neural Networks

Authors: Shubham Ugare, Debangshu Banerjee, Sasa Misailovic, Gagandeep Singh

Abstract: Complete verification of deep neural networks (DNNs) can exactly determine whether the DNN satisfies a desired trustworthy property (e.g., robustness, fairness) on an infinite set of inputs or not. Despite the tremendous progress to improve the scalability of complete verifiers over the years on individual DNNs, they are inherently inefficient when a deployed DNN is updated to improve its inferenc… ▽ More Complete verification of deep neural networks (DNNs) can exactly determine whether the DNN satisfies a desired trustworthy property (e.g., robustness, fairness) on an infinite set of inputs or not. Despite the tremendous progress to improve the scalability of complete verifiers over the years on individual DNNs, they are inherently inefficient when a deployed DNN is updated to improve its inference speed or accuracy. The inefficiency is because the expensive verifier needs to be run from scratch on the updated DNN. To improve efficiency, we propose a new, general framework for incremental and complete DNN verification based on the design of novel theory, data structure, and algorithms. Our contributions implemented in a tool named IVAN yield an overall geometric mean speedup of 2.4x for verifying challenging MNIST and CIFAR10 classifiers and a geometric mean speedup of 3.8x for the ACAS-XU classifiers over the state-of-the-art baselines. △ Less

Submitted 11 June, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

Comments: PLDI 2023

arXiv:2303.11291 [pdf, other]

doi 10.1109/JIOT.2024.3365957

Mobiprox: Supporting Dynamic Approximate Computing on Mobiles

Authors: Matevž Fabjančič, Octavian Machidon, Hashim Sharif, Yifan Zhao, Saša Misailović, Veljko Pejović

Abstract: Runtime-tunable context-dependent network compression would make mobile deep learning (DL) adaptable to often varying resource availability, input "difficulty", or user needs. The existing compression techniques significantly reduce the memory, processing, and energy tax of DL, yet, the resulting models tend to be permanently impaired, sacrificing the inference power for reduced resource usage. Th… ▽ More Runtime-tunable context-dependent network compression would make mobile deep learning (DL) adaptable to often varying resource availability, input "difficulty", or user needs. The existing compression techniques significantly reduce the memory, processing, and energy tax of DL, yet, the resulting models tend to be permanently impaired, sacrificing the inference power for reduced resource usage. The existing tunable compression approaches, on the other hand, require expensive re-training, do not support arbitrary strategies for adapting the compression and do not provide mobile-ready implementations. In this paper we present Mobiprox, a framework enabling mobile DL with flexible precision. Mobiprox implements tunable approximations of tensor operations and enables runtime-adaptable approximation of individual network layers. A profiler and a tuner included with Mobiprox identify the most promising neural network approximation configurations leading to the desired inference quality with the minimal use of resources. Furthermore, we develop control strategies that depending on contextual factors, such as the input data difficulty, dynamically adjust the approximation levels across a mobile DL model's layers. We implement Mobiprox in Android OS and through experiments in diverse mobile domains, including human activity recognition and spoken keyword detection, demonstrate that it can save up to 15% system-wide energy with a minimal impact on the inference accuracy. △ Less

Submitted 22 February, 2024; v1 submitted 16 March, 2023; originally announced March 2023.

Comments: 14 pages, 8 figures. IEEE Internet of Things Journal (2024)

arXiv:2208.02232 [pdf, other]

GAS: Generating Fast and Accurate Surrogate Models for Autonomous Vehicle Systems

Authors: Keyur Joshi, Chiao Hsieh, Sayan Mitra, Sasa Misailovic

Abstract: Modern autonomous vehicle systems use complex perception and control components. These components can rapidly change during development of such systems, requiring constant re-testing. Unfortunately, high-fidelity simulations of these complex systems for evaluating vehicle safety are costly. The complexity also hinders the creation of less computationally intensive surrogate models. We present GA… ▽ More Modern autonomous vehicle systems use complex perception and control components. These components can rapidly change during development of such systems, requiring constant re-testing. Unfortunately, high-fidelity simulations of these complex systems for evaluating vehicle safety are costly. The complexity also hinders the creation of less computationally intensive surrogate models. We present GAS, the first approach for creating surrogate models of complete (perception, control, and dynamics) autonomous vehicle systems containing complex perception and/or control components. GAS's two-stage approach first replaces complex perception components with a perception model. Then, GAS constructs a polynomial surrogate model of the complete vehicle system using Generalized Polynomial Chaos (GPC). We demonstrate the use of these surrogate models in two applications. First, we estimate the probability that the vehicle will enter an unsafe state over time. Second, we perform global sensitivity analysis of the vehicle system with respect to its state in a previous time step. GAS's approach also allows for reuse of the perception model when vehicle control and dynamics characteristics are altered during vehicle development, saving significant time. We consider five scenarios concerning crop management vehicles that must not crash into adjacent crops, self driving cars that must stay within their lane, and unmanned aircraft that must avoid collision. Each of the systems in these scenarios contain a complex perception or control component. Using GAS, we generate surrogate models for these systems, and evaluate the generated models in the applications described above. GAS's surrogate models provide an average speedup of $3.7\times$ for safe state probability estimation (minimum $2.1\times$) and $1.4\times$ for sensitivity analysis (minimum $1.3\times$), while still maintaining high accuracy. △ Less

Submitted 13 July, 2023; v1 submitted 3 August, 2022; originally announced August 2022.

arXiv:2207.11177 [pdf, other]

Provable Defense Against Geometric Transformations

Authors: Rem Yang, Jacob Laurel, Sasa Misailovic, Gagandeep Singh

Abstract: Geometric image transformations that arise in the real world, such as scaling and rotation, have been shown to easily deceive deep neural networks (DNNs). Hence, training DNNs to be certifiably robust to these perturbations is critical. However, no prior work has been able to incorporate the objective of deterministic certified robustness against geometric transformations into the training procedu… ▽ More Geometric image transformations that arise in the real world, such as scaling and rotation, have been shown to easily deceive deep neural networks (DNNs). Hence, training DNNs to be certifiably robust to these perturbations is critical. However, no prior work has been able to incorporate the objective of deterministic certified robustness against geometric transformations into the training procedure, as existing verifiers are exceedingly slow. To address these challenges, we propose the first provable defense for deterministic certified geometric robustness. Our framework leverages a novel GPU-optimized verifier that can certify images between 60$\times$ to 42,600$\times$ faster than existing geometric robustness verifiers, and thus unlike existing works, is fast enough for use in training. Across multiple datasets, our results show that networks trained via our framework consistently achieve state-of-the-art deterministic certified geometric robustness and clean accuracy. Furthermore, for the first time, we verify the geometric robustness of a neural network for the challenging, real-world setting of autonomous driving. △ Less

Submitted 6 May, 2023; v1 submitted 22 July, 2022; originally announced July 2022.

Comments: ICLR 2023 notable-top-25%

arXiv:2111.05534 [pdf, other]

doi 10.1109/TCAD.2022.3197508

Verifying Controllers with Convolutional Neural Network-based Perception: A Case for Intelligible, Safe, and Precise Abstractions

Authors: Chiao Hsieh, Keyur Joshi, Sasa Misailovic, Sayan Mitra

Abstract: Convolutional Neural Networks (CNN) for object detection, lane detection, and segmentation now sit at the head of most autonomy pipelines, and yet, their safety analysis remains an important challenge. Formal analysis of perception models is fundamentally difficult because their correctness is hard if not impossible to specify. We present a technique for inferring intelligible and safe abstraction… ▽ More Convolutional Neural Networks (CNN) for object detection, lane detection, and segmentation now sit at the head of most autonomy pipelines, and yet, their safety analysis remains an important challenge. Formal analysis of perception models is fundamentally difficult because their correctness is hard if not impossible to specify. We present a technique for inferring intelligible and safe abstractions for perception models from system-level safety requirements, data, and program analysis of the modules that are downstream from perception. The technique can help tradeoff safety, size, and precision, in creating abstractions and the subsequent verification. We apply the method to two significant case studies based on high-fidelity simulations (a) a vision-based lane kee** controller for an autonomous vehicle and (b) a controller for an agricultural robot. We show how the generated abstractions can be composed with the downstream modules and then the resulting abstract system can be verified using program analysis tools like CBMC. Detailed evaluations of the impacts of size, safety requirements, and the environmental parameters (e.g., lighting, road surface, plant type) on the precision of the generated abstractions suggest that the approach can help guide the search for corner cases and safe operating envelops. △ Less

Submitted 10 November, 2021; originally announced November 2021.

Comments: 12 pages, 9 figures, submitted to HSCC 2022

arXiv:1809.05859 [pdf, other]

Exploiting Errors for Efficiency: A Survey from Circuits to Algorithms

Authors: Phillip Stanley-Marbell, Armin Alaghi, Michael Carbin, Eva Darulova, Lara Dolecek, Andreas Gerstlauer, Ghayoor Gillani, Djordje Jevdjic, Thierry Moreau, Mattia Cacciotti, Alexandros Daglis, Natalie Enright Jerger, Babak Falsafi, Sasa Misailovic, Adrian Sampson, Damien Zufferey

Abstract: When a computational task tolerates a relaxation of its specification or when an algorithm tolerates the effects of noise in its execution, hardware, programming languages, and system software can trade deviations from correct behavior for lower resource usage. We present, for the first time, a synthesis of research results on computing systems that only make as many errors as their users can tole… ▽ More When a computational task tolerates a relaxation of its specification or when an algorithm tolerates the effects of noise in its execution, hardware, programming languages, and system software can trade deviations from correct behavior for lower resource usage. We present, for the first time, a synthesis of research results on computing systems that only make as many errors as their users can tolerate, from across the disciplines of computer aided design of circuits, digital system design, computer architecture, programming languages, operating systems, and information theory. Rather than over-provisioning resources at each layer to avoid errors, it can be more efficient to exploit the masking of errors occurring at one layer which can prevent them from propagating to a higher layer. We survey tradeoffs for individual layers of computing systems from the circuit level to the operating system level and illustrate the potential benefits of end-to-end approaches using two illustrative examples. To tie together the survey, we present a consistent formalization of terminology, across the layers, which does not significantly deviate from the terminology traditionally used by research communities in their layer of focus. △ Less

Submitted 16 September, 2018; originally announced September 2018.

Comments: 35 pages

Showing 1–10 of 10 results for author: Misailovic, S