Search | arXiv e-print repository

Constrained Decoding for Code Language Models via Efficient Left and Right Quotienting of Context-Sensitive Grammars

Authors: Daniel Melcer, Nathan Fulton, Sanjay Krishna Gouda, Haifeng Qian

Abstract: Large Language Models are powerful tools for program synthesis and advanced auto-completion, but come with no guarantee that their output code is syntactically correct. This paper contributes an incremental parser that allows early rejection of syntactically incorrect code, as well as efficient detection of complete programs for fill-in-the-middle (FItM) tasks. We develop Earley-style parsers that… ▽ More Large Language Models are powerful tools for program synthesis and advanced auto-completion, but come with no guarantee that their output code is syntactically correct. This paper contributes an incremental parser that allows early rejection of syntactically incorrect code, as well as efficient detection of complete programs for fill-in-the-middle (FItM) tasks. We develop Earley-style parsers that operate over left and right quotients of arbitrary context-free grammars, and we extend our incremental parsing and quotient operations to several context-sensitive features present in the grammars of many common programming languages. The result of these contributions is an efficient, general, and well-grounded method for left and right quotient parsing. To validate our theoretical contributions -- and the practical effectiveness of certain design decisions -- we evaluate our method on the particularly difficult case of FItM completion for Python 3. Our results demonstrate that constrained generation can significantly reduce the incidence of syntax errors in recommended code. △ Less

Submitted 27 February, 2024; originally announced February 2024.

Comments: 20 pages, Code available at https://github.com/amazon-science/incremental-parsing

arXiv:2210.14868 [pdf, other]

Multi-lingual Evaluation of Code Generation Models

Authors: Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang

Abstract: We present new benchmarks on evaluation code generation models: MBXP and Multilingual HumanEval, and MathQA-X. These datasets cover over 10 programming languages and are generated using a scalable conversion framework that transpiles prompts and test cases from the original Python datasets into the corresponding data in the target language. Using these benchmarks, we are able to assess the perform… ▽ More We present new benchmarks on evaluation code generation models: MBXP and Multilingual HumanEval, and MathQA-X. These datasets cover over 10 programming languages and are generated using a scalable conversion framework that transpiles prompts and test cases from the original Python datasets into the corresponding data in the target language. Using these benchmarks, we are able to assess the performance of code generation models in a multi-lingual fashion, and discovered generalization ability of language models on out-of-domain languages, advantages of multi-lingual models over mono-lingual, the ability of few-shot prompting to teach the model new languages, and zero-shot translation abilities even on mono-lingual settings. Furthermore, we use our code generation model to perform large-scale bootstrap** to obtain synthetic canonical solutions in several languages, which can be used for other code-related evaluations such as code insertion, robustness, or summarization tasks. Overall, our benchmarks represents a significant step towards a deeper understanding of language models' code generation abilities. We publicly release our code and datasets at https://github.com/amazon-research/mxeval. △ Less

Submitted 28 March, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

Comments: Code and data release: https://github.com/amazon-research/mxeval

arXiv:2106.01850 [pdf, ps, other]

Relational Analysis of Sensor Attacks on Cyber-Physical Systems

Authors: Jian Xiang, Nathan Fulton, Stephen Chong

Abstract: Cyber-physical systems, such as self-driving cars or autonomous aircraft, must defend against attacks that target sensor hardware. Analyzing system design can help engineers understand how a compromised sensor could impact the system's behavior; however, designing security analyses for cyber-physical systems is difficult due to their combination of discrete dynamics, continuous dynamics, and nonde… ▽ More Cyber-physical systems, such as self-driving cars or autonomous aircraft, must defend against attacks that target sensor hardware. Analyzing system design can help engineers understand how a compromised sensor could impact the system's behavior; however, designing security analyses for cyber-physical systems is difficult due to their combination of discrete dynamics, continuous dynamics, and nondeterminism. This paper contributes a framework for modeling and analyzing sensor attacks on cyber-physical systems, using the formalism of hybrid programs. We formalize and analyze two relational properties of a system's robustness. These relational properties respectively express (1) whether a system's safety property can be influenced by sensor attacks, and (2) whether a system's high-integrity state can be affected by sensor attacks. We characterize these relational properties by defining an equivalence relation between a system under attack and the original unattacked system. That is, the system satisfies the robustness properties if executions of the attacked system are appropriately related to executions of the unattacked system. We present two techniques for reasoning about the equivalence relation and thus proving the relational properties for a system. One proof technique decomposes large proof obligations to smaller proof obligations. The other proof technique adapts the self-composition technique from the literature on secure information-flow, allowing us to reduce reasoning about the equivalence of two systems to reasoning about properties of a single system. This technique allows us to reuse existing tools for reasoning about properties of hybrid programs, but is challenging due to the combination of discrete dynamics, continuous dynamics, and nondeterminism. To evaluate, we present three case studies motivated by real design flaws in existing cyber-physical systems. △ Less

Submitted 3 June, 2021; originally announced June 2021.

Comments: This is an extended version of the paper with the same title that appeared in the 2021 Computer Security Foundations Symposium

arXiv:2009.11403 [pdf, other]

CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq

Authors: Koundinya Vajjha, Avraham Shinnar, Vasily Pestun, Barry Trager, Nathan Fulton

Abstract: Reinforcement learning algorithms solve sequential decision-making problems in probabilistic environments by optimizing for long-term reward. The desire to use reinforcement learning in safety-critical settings inspires a recent line of work on formally constrained reinforcement learning; however, these methods place the implementation of the learning algorithm in their Trusted Computing Base. The… ▽ More Reinforcement learning algorithms solve sequential decision-making problems in probabilistic environments by optimizing for long-term reward. The desire to use reinforcement learning in safety-critical settings inspires a recent line of work on formally constrained reinforcement learning; however, these methods place the implementation of the learning algorithm in their Trusted Computing Base. The crucial correctness property of these implementations is a guarantee that the learning algorithm converges to an optimal policy. This paper begins the work of closing this gap by develo** a Coq formalization of two canonical reinforcement learning algorithms: value and policy iteration for finite state Markov decision processes. The central results are a formalization of Bellman's optimality principle and its proof, which uses a contraction property of Bellman optimality operator to establish that a sequence converges in the infinite horizon limit. The CertRL development exemplifies how the Giry monad and mechanized metric coinduction streamline optimality proofs for reinforcement learning algorithms. The CertRL library provides a general framework for proving properties about Markov decision processes and reinforcement learning algorithms, paving the way for further work on formalization of reinforcement learning algorithms. △ Less

Submitted 15 December, 2020; v1 submitted 23 September, 2020; originally announced September 2020.

ACM Class: D.2.4; I.2.8

arXiv:2007.01223 [pdf, other]

Verifiably Safe Exploration for End-to-End Reinforcement Learning

Authors: Nathan Hunt, Nathan Fulton, Sara Magliacane, Nghia Hoang, Subhro Das, Armando Solar-Lezama

Abstract: Deploying deep reinforcement learning in safety-critical settings requires develo** algorithms that obey hard constraints during exploration. This paper contributes a first approach toward enforcing formal safety constraints on end-to-end policies with visual inputs. Our approach draws on recent advances in object detection and automated reasoning for hybrid dynamical systems. The approach is ev… ▽ More Deploying deep reinforcement learning in safety-critical settings requires develo** algorithms that obey hard constraints during exploration. This paper contributes a first approach toward enforcing formal safety constraints on end-to-end policies with visual inputs. Our approach draws on recent advances in object detection and automated reasoning for hybrid dynamical systems. The approach is evaluated on a novel benchmark that emphasizes the challenge of safely exploring in the presence of hard constraints. Our benchmark draws from several proposed problem sets for safe learning and includes problems that emphasize challenges such as reward signals that are not aligned with safety constraints. On each of these benchmark problems, our algorithm completely avoids unsafe behavior while remaining competitive at optimizing for as much reward as is safe. We also prove that our method of enforcing the safety constraints preserves all safe policies from the original environment. △ Less

Submitted 2 July, 2020; originally announced July 2020.

ACM Class: F.3.1; I.2.8

arXiv:2006.09181 [pdf, other]

Formal Verification of End-to-End Learning in Cyber-Physical Systems: Progress and Challenges

Authors: Nathan Fulton, Nathan Hunt, Nghia Hoang, Subhro Das

Abstract: Autonomous systems -- such as self-driving cars, autonomous drones, and automated trains -- must come with strong safety guarantees. Over the past decade, techniques based on formal methods have enjoyed some success in providing strong correctness guarantees for large software systems including operating system kernels, cryptographic protocols, and control software for drones. These successes sugg… ▽ More Autonomous systems -- such as self-driving cars, autonomous drones, and automated trains -- must come with strong safety guarantees. Over the past decade, techniques based on formal methods have enjoyed some success in providing strong correctness guarantees for large software systems including operating system kernels, cryptographic protocols, and control software for drones. These successes suggest it might be possible to ensure the safety of autonomous systems by constructing formal, computer-checked correctness proofs. This paper identifies three assumptions underlying existing formal verification techniques, explains how each of these assumptions limits the applicability of verification in autonomous systems, and summarizes preliminary work toward improving the strength of evidence provided by formal verification. △ Less

Submitted 15 June, 2020; originally announced June 2020.

Comments: 7 pages, 4 figures. NeurIPS Workshop on Safety and Robustness in Decision Making, 2019

arXiv:1902.05632 [pdf, other]

doi 10.1007/978-3-030-17462-0_28

Verifiably Safe Off-Model Reinforcement Learning

Authors: Nathan Fulton, Andre Platzer

Abstract: The desire to use reinforcement learning in safety-critical settings has inspired a recent interest in formal methods for learning algorithms. Existing formal methods for learning and optimization primarily consider the problem of constrained learning or constrained optimization. Given a single correct model and associated safety constraint, these approaches guarantee efficient learning while prov… ▽ More The desire to use reinforcement learning in safety-critical settings has inspired a recent interest in formal methods for learning algorithms. Existing formal methods for learning and optimization primarily consider the problem of constrained learning or constrained optimization. Given a single correct model and associated safety constraint, these approaches guarantee efficient learning while provably avoiding behaviors outside the safety constraint. Acting well given an accurate environmental model is an important pre-requisite for safe learning, but is ultimately insufficient for systems that operate in complex heterogeneous environments. This paper introduces verification-preserving model updates, the first approach toward obtaining formal safety guarantees for reinforcement learning in settings where multiple environmental models must be taken into account. Through a combination of design-time model updates and runtime model falsification, we provide a first approach toward obtaining formal safety proofs for autonomous systems acting in heterogeneous environments. △ Less

Submitted 14 February, 2019; originally announced February 2019.

Comments: TACAS 2019

Showing 1–7 of 7 results for author: Fulton, N