-
Mechanised Hypersafety Proofs about Structured Data: Extended Version
Authors:
Vladimir Gladshtein,
Qiyuan Zhao,
Willow Ahrens,
Saman Amarasinghe,
Ilya Sergey
Abstract:
Arrays are a fundamental abstraction to represent collections of data. It is often possible to exploit structural properties of the data stored in an array (e.g., repetition or sparsity) to develop a specialised representation optimised for space efficiency. Formally reasoning about correctness of manipulations with such structured data is challenging, as they are often composed of multiple loops…
▽ More
Arrays are a fundamental abstraction to represent collections of data. It is often possible to exploit structural properties of the data stored in an array (e.g., repetition or sparsity) to develop a specialised representation optimised for space efficiency. Formally reasoning about correctness of manipulations with such structured data is challenging, as they are often composed of multiple loops with non-trivial invariants.
In this work, we observe that specifications for structured data manipulations can be phrased as hypersafety properties, i.e., predicates that relate traces of $k$ programs. To turn this observation into an effective verification methodology, we developed the Logic for Graceful Tensor Manipulation (LGTM), a new Hoare-style relational separation logic for specifying and verifying computations over structured data. The key enabling idea of LGTM is that of parametrised hypersafety specifications that allow the number $k$ of the program components to depend on the program variables. We implemented LGTM as a foundational embedding into Coq, mechanising its rules, meta-theory, and the proof of soundness. Furthermore, we developed a library of domain-specific tactics that automate computer-aided hypersafety reasoning, resulting in pleasantly short proof scripts that enjoy a high degree of reuse. We argue for the effectiveness of relational reasoning about structured data in LGTM by specifying and mechanically proving correctness of 13 case studies including computations on compressed arrays and efficient operations over multiple kinds of sparse tensors.
△ Less
Submitted 9 April, 2024;
originally announced April 2024.
-
Small Scale Reflection for the Working Lean User
Authors:
Vladimir Gladshtein,
George Pîrlea,
Ilya Sergey
Abstract:
We present the design and implementation of the Small Scale Reflection proof methodology and tactic language (a.k.a. SSR) for the Lean 4 proof assistant. Like its Coq predecessor SSReflect, our Lean 4 implementation, dubbed LeanSSR, provides powerful rewriting principles and means for effective management of hypotheses in the proof context. Unlike SSReflect for Coq, LeanSSR does not require explic…
▽ More
We present the design and implementation of the Small Scale Reflection proof methodology and tactic language (a.k.a. SSR) for the Lean 4 proof assistant. Like its Coq predecessor SSReflect, our Lean 4 implementation, dubbed LeanSSR, provides powerful rewriting principles and means for effective management of hypotheses in the proof context. Unlike SSReflect for Coq, LeanSSR does not require explicit switching between the logical and symbolic representation of a goal, allowing for even more concise proof scripts that seamlessly combine deduction steps with proofs by computation.
In this paper, we first provide a gentle introduction to the principles of structuring mechanised proofs using LeanSSR. Next, we show how the native support for metaprogramming in Lean 4 makes it possible to develop LeanSSR entirely within the proof assistant, greatly improving the overall experience of both tactic implementers and proof engineers. Finally, we demonstrate the utility of LeanSSR by conducting two case studies: (a) porting a collection of Coq lemmas about sequences from the widely used Mathematical Components library and (b) reimplementing proofs in the finite set library of Lean's mathlib4. Both case studies show significant reduction in proof sizes.
△ Less
Submitted 19 March, 2024;
originally announced March 2024.
-
Greybox Fuzzing of Distributed Systems
Authors:
Ruijie Meng,
George Pîrlea,
Abhik Roychoudhury,
Ilya Sergey
Abstract:
Grey-box fuzzing is the lightweight approach of choice for finding bugs in sequential programs. It provides a balance between efficiency and effectiveness by conducting a biased random search over the domain of program inputs using a feedback function from observed test executions. For distributed system testing, however, the state-of-practice is represented today by only black-box tools that do n…
▽ More
Grey-box fuzzing is the lightweight approach of choice for finding bugs in sequential programs. It provides a balance between efficiency and effectiveness by conducting a biased random search over the domain of program inputs using a feedback function from observed test executions. For distributed system testing, however, the state-of-practice is represented today by only black-box tools that do not attempt to infer and exploit any knowledge of the system's past behaviours to guide the search for bugs.
In this work, we present Mallory: the first framework for grey-box fuzz-testing of distributed systems. Unlike popular black-box distributed system fuzzers, such as Jepsen, that search for bugs by randomly injecting network partitions and node faults or by following human-defined schedules, Mallory is adaptive. It exercises a novel metric to learn how to maximize the number of observed system behaviors by choosing different sequences of faults, thus increasing the likelihood of finding new bugs. Our approach relies on timeline-driven testing. Mallory dynamically constructs Lamport timelines of the system behaviour and further abstracts these timelines into happens-before summaries, which serve as a feedback function guiding the fuzz campaign. Subsequently, Mallory reactively learns a policy using Q-learning, enabling it to introduce faults guided by its real-time observation of the summaries.
We have evaluated Mallory on a diverse set of widely-used industrial distributed systems. Compared to the start-of-the-art black-box fuzzer Jepsen, Mallory explores more behaviours and takes less time to find bugs. Mallory discovered 22 zero-day bugs (of which 18 were confirmed by developers), including 10 new vulnerabilities, in rigorously-tested distributed systems such as Braft, Dqlite, and Redis. 6 new CVEs have been assigned.
△ Less
Submitted 12 August, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
HIPPODROME: Data Race Repair using Static Analysis Summaries
Authors:
Andreea Costea,
Abhishek Tiwari,
Sigmund Chianasta,
Kishore R,
Abhik Roychoudhury,
Ilya Sergey
Abstract:
Implementing bug-free concurrent programs is a challenging task in modern software development. State-of-the-art static analyses find hundreds of concurrency bugs in production code, scaling to large codebases. Yet, fixing these bugs in constantly changing codebases represents a daunting effort for programmers, particularly because a fix in the concurrent code can introduce other bugs in a subtle…
▽ More
Implementing bug-free concurrent programs is a challenging task in modern software development. State-of-the-art static analyses find hundreds of concurrency bugs in production code, scaling to large codebases. Yet, fixing these bugs in constantly changing codebases represents a daunting effort for programmers, particularly because a fix in the concurrent code can introduce other bugs in a subtle way.
In this work, we show how to harness compositional static analysis for concurrency bug detection, to enable a new Automated Program Repair (APR) technique for data races in large concurrent Java codebases. The key innovation of our work is an algorithm that translates procedure summaries inferred by the analysis tool for the purpose of bug reporting, into small local patches that fix concurrency bugs (without introducing new ones). This synergy makes it possible to extend the virtues of compositional static concurrency analysis to APR, making our approach effective (it can detect and fix many more bugs than existing tools for data race repair), scalable (it takes seconds to analyse and suggest fixes for sizeable codebases), and usable (generally, it does not require annotations from the users and can perform continuous automated repair). Our study conducted on popular open-source projects has confirmed that our tool automatically produces concurrency fixes similar to those proposed by the developers in the past.
△ Less
Submitted 6 August, 2021; v1 submitted 5 August, 2021;
originally announced August 2021.
-
Compiling a Higher-Order Smart Contract Language to LLVM
Authors:
Vaivaswatha Nagaraj,
Jacob Johannsen,
Anton Trunov,
George Pîrlea,
Amrit Kumar,
Ilya Sergey
Abstract:
Scilla is a higher-order polymorphic typed intermediate level language for implementing smart contracts. In this talk, we describe a Scilla compiler targeting LLVM, with a focus on map** Scilla types, values, and its functional language constructs to LLVM-IR.
The compiled LLVM-IR, when executed with LLVM's JIT framework, achieves a speedup of about 10x over the reference interpreter on a typic…
▽ More
Scilla is a higher-order polymorphic typed intermediate level language for implementing smart contracts. In this talk, we describe a Scilla compiler targeting LLVM, with a focus on map** Scilla types, values, and its functional language constructs to LLVM-IR.
The compiled LLVM-IR, when executed with LLVM's JIT framework, achieves a speedup of about 10x over the reference interpreter on a typical Scilla contract. This reduced latency is crucial in the setting of blockchains, where smart contracts are executed as parts of transactions, to achieve peak transactions processed per second. Experiments on the Ackermann function achieved a speedup of more than 45x. This talk abstract is aimed at both programming language researchers looking to implement an LLVM based compiler for their functional language, as well as at LLVM practitioners.
△ Less
Submitted 12 August, 2020;
originally announced August 2020.
-
Certifying Certainty and Uncertainty in Approximate Membership Query Structures -- Extended Version
Authors:
Kiran Gopinathan,
Ilya Sergey
Abstract:
Approximate Membership Query structures (AMQs) rely on randomisation for time- and space-efficiency, while introducing a possibility of false positive and false negative answers. Correctness proofs of such structures involve subtle reasoning about bounds on probabilities of getting certain outcomes. Because of these subtleties, a number of unsound arguments in such proofs have been made over the y…
▽ More
Approximate Membership Query structures (AMQs) rely on randomisation for time- and space-efficiency, while introducing a possibility of false positive and false negative answers. Correctness proofs of such structures involve subtle reasoning about bounds on probabilities of getting certain outcomes. Because of these subtleties, a number of unsound arguments in such proofs have been made over the years. In this work, we address the challenge of building rigorous and reusable computer-assisted proofs about probabilistic specifications of AMQs. We describe the framework for systematic decomposition of AMQs and their properties into a series of interfaces and reusable components. We implement our framework as a library in the Coq proof assistant and showcase it by encoding in it a number of non-trivial AMQs, such as Bloom filters, counting filters, quotient filters and blocked constructions, and mechanising the proofs of their probabilistic specifications. We demonstrate how AMQs encoded in our framework guarantee the absence of false negatives by construction. We also show how the proofs about probabilities of false positives for complex AMQs can be obtained by means of verified reduction to the implementations of their simpler counterparts. Finally, we provide a library of domain-specific theorems and tactics that allow a high degree of automation in probabilistic proofs.
△ Less
Submitted 28 April, 2020;
originally announced April 2020.
-
QED at Large: A Survey of Engineering of Formally Verified Software
Authors:
Talia Ringer,
Karl Palmskog,
Ilya Sergey,
Milos Gligoric,
Zachary Tatlock
Abstract:
Development of formal proofs of correctness of programs can increase actual and perceived reliability and facilitate better understanding of program specifications and their underlying assumptions. Tools supporting such development have been available for over 40 years, but have only recently seen wide practical use. Projects based on construction of machine-checked formal proofs are now reaching…
▽ More
Development of formal proofs of correctness of programs can increase actual and perceived reliability and facilitate better understanding of program specifications and their underlying assumptions. Tools supporting such development have been available for over 40 years, but have only recently seen wide practical use. Projects based on construction of machine-checked formal proofs are now reaching an unprecedented scale, comparable to large software projects, which leads to new challenges in proof development and maintenance. Despite its increasing importance, the field of proof engineering is seldom considered in its own right; related theories, techniques, and tools span many fields and venues. This survey of the literature presents a holistic understanding of proof engineering for program correctness, covering impact in practice, foundations, proof automation, proof organization, and practical proof development.
△ Less
Submitted 13 March, 2020;
originally announced March 2020.
-
Concise Read-Only Specifications for Better Synthesis of Programs with Pointers -- Extended Version
Authors:
Andreea Costea,
Amy Zhu,
Nadia Polikarpova,
Ilya Sergey
Abstract:
In program synthesis there is a well-known trade-off between concise and strong specifications: if a specification is too verbose, it might be harder to write than the program; if it is too weak, the synthesised program might not match the user's intent. In this work we explore the use of annotations for restricting memory access permissions in program synthesis, and show that they can make specif…
▽ More
In program synthesis there is a well-known trade-off between concise and strong specifications: if a specification is too verbose, it might be harder to write than the program; if it is too weak, the synthesised program might not match the user's intent. In this work we explore the use of annotations for restricting memory access permissions in program synthesis, and show that they can make specifications much stronger while remaining surprisingly concise. Specifically, we enhance Synthetic Separation Logic (SSL), a framework for synthesis of heap-manipulating programs, with the logical mechanism of read-only borrows. We observe that this minimalistic and conservative SSL extension benefits the synthesis in several ways, making it more (a) expressive (stronger correctness guarantees are achieved with a modest annotation overhead), (b) effective (it produces more concise and easier-to-read programs), (c) efficient (faster synthesis), and (d) robust (synthesis efficiency is less affected by the choice of the search heuristic). We explain the intuition and provide formal treatment for read-only borrows. We substantiate the claims (a)--(d) by describing our quantitative evaluation of the borrowing-aware synthesis implementation on a series of standard benchmark specifications for various heap-manipulating programs.
△ Less
Submitted 29 January, 2020;
originally announced January 2020.
-
Running on Fumes--Preventing Out-of-Gas Vulnerabilities in Ethereum Smart Contracts using Static Resource Analysis
Authors:
Elvira Albert,
Pablo Gordillo,
Albert Rubio,
Ilya Sergey
Abstract:
Gas is a measurement unit of the computational effort that it will take to execute every single operation that takes part in the Ethereum blockchain platform. Each instruction executed by the Ethereum Virtual Machine (EVM) has an associated gas consumption specified by Ethereum. If a transaction exceeds the amount of gas allotted by the user (known as gas limit), an out-of-gas exception is raised.…
▽ More
Gas is a measurement unit of the computational effort that it will take to execute every single operation that takes part in the Ethereum blockchain platform. Each instruction executed by the Ethereum Virtual Machine (EVM) has an associated gas consumption specified by Ethereum. If a transaction exceeds the amount of gas allotted by the user (known as gas limit), an out-of-gas exception is raised. There is a wide family of contract vulnerabilities due to out-of-gas behaviours. We report on the design and implementation of GASTAP, a Gas-Aware Smart contracT Analysis Platform, which takes as input a smart contract (either in EVM, disassembled EVM, or in Solidity source code) and automatically infers sound gas upper bounds for all its public functions. Our bounds ensure that if the gas limit paid by the user is higher than our inferred gas bounds, the contract is free of out-of-gas vulnerabilities.
△ Less
Submitted 7 August, 2019; v1 submitted 22 November, 2018;
originally announced November 2018.
-
A True Positives Theorem for a Static Race Detector - Extended Version
Authors:
Nikos Gorogiannis,
Peter W. O'Hearn,
Ilya Sergey
Abstract:
RacerD is a static race detector that has been proven to be effective in engineering practice: it has seen thousands of data races fixed by developers before reaching production, and has supported the migration of Facebook's Android app rendering infrastructure from a single-threaded to a multi-threaded architecture. We prove a True Positives Theorem stating that, under certain assumptions, an ide…
▽ More
RacerD is a static race detector that has been proven to be effective in engineering practice: it has seen thousands of data races fixed by developers before reaching production, and has supported the migration of Facebook's Android app rendering infrastructure from a single-threaded to a multi-threaded architecture. We prove a True Positives Theorem stating that, under certain assumptions, an idealized theoretical version of the analysis never reports a false positive. We also provide an empirical evaluation of an implementation of this analysis, versus the original RacerD.
The theorem was motivated in the first case by the desire to understand the observation from production that RacerD was providing remarkably accurate signal to developers, and then the theorem guided further analyzer design decisions. Technically, our result can be seen as saying that the analysis computes an under-approximation of an over-approximation, which is the reverse of the more usual (over of under) situation in static analysis. Until now, static analyzers that are effective in practice but unsound have often been regarded as ad hoc; in contrast, we suggest that, in the future, theorems of this variety might be generally useful in understanding, justifying and designing effective static analyses for bug catching.
△ Less
Submitted 8 November, 2018;
originally announced November 2018.
-
Exploiting The Laws of Order in Smart Contracts
Authors:
Aashish Kolluri,
Ivica Nikolic,
Ilya Sergey,
Aquinas Hobor,
Prateek Saxena
Abstract:
We investigate a family of bugs in blockchain-based smart contracts, which we call event-ordering (or EO) bugs. These bugs are intimately related to the dynamic ordering of contract events, i.e., calls of its functions on the blockchain, and enable potential exploits of millions of USD worth of Ether. Known examples of such bugs and prior techniques to detect them have been restricted to a small n…
▽ More
We investigate a family of bugs in blockchain-based smart contracts, which we call event-ordering (or EO) bugs. These bugs are intimately related to the dynamic ordering of contract events, i.e., calls of its functions on the blockchain, and enable potential exploits of millions of USD worth of Ether. Known examples of such bugs and prior techniques to detect them have been restricted to a small number of event orderings, typicall 1 or 2. Our work provides a new formulation of this general class of EO bugs as finding concurrency properties arising in long permutations of such events. The technical challenge in detecting our formulation of EO bugs is the inherent combinatorial blowup in path and state space analysis, even for simple contracts. We propose the first use of partial-order reduction techniques, using happen-before relations extracted automatically for contracts, along with several other optimizations built on a dynamic symbolic execution technique. We build an automatic tool called ETHRACER that requires no hints from users and runs directly on Ethereum bytecode. It flags 7-11% of over ten thousand contracts analyzed in roughly 18.5 minutes per contract, providing compact event traces that human analysts can run as witnesses. These witnesses are so compact that confirmations require only a few minutes of human effort. Half of the flagged contracts have subtle EO bugs, including in ERC-20 contracts that carry hundreds of millions of dollars worth of Ether. Thus, ETHRACER is effective at detecting a subtle yet dangerous class of bugs which existing tools miss.
△ Less
Submitted 27 October, 2018;
originally announced October 2018.
-
Structuring the Synthesis of Heap-Manipulating Programs - Extended Version
Authors:
Nadia Polikarpova,
Ilya Sergey
Abstract:
This paper describes a deductive approach to synthesizing imperative programs with pointers from declarative specifications expressed in Separation Logic. Our synthesis algorithm takes as input a pair of assertions---a pre- and a postcondition---which describe two states of the symbolic heap, and derives a program that transforms one state into the other, guided by the shape of the heap. The progr…
▽ More
This paper describes a deductive approach to synthesizing imperative programs with pointers from declarative specifications expressed in Separation Logic. Our synthesis algorithm takes as input a pair of assertions---a pre- and a postcondition---which describe two states of the symbolic heap, and derives a program that transforms one state into the other, guided by the shape of the heap. The program synthesis algorithm rests on the novel framework of Synthetic Separation Logic (SSL), which generalises the classical notion of heap entailment $\mathcal{P} \vdash \mathcal{Q}$ to incorporate a possibility of transforming a heap satisfying an assertion $\mathcal{P}$ into a heap satisfying an assertion $\mathcal{Q}$. A synthesized program represents a proof term for a transforming entailment statement $\mathcal{P} \leadsto \mathcal{Q}$, and the synthesis procedure corresponds to a proof search. The derived programs are, thus, correct by construction, in the sense that they satisfy the ascribed pre/postconditions, and are accompanied by complete proof derivations, which can be checked independently.
We have implemented a proof search engine for SSL in a form the program synthesizer called SuSLik. For efficiency, the engine exploits properties of SSL rules, such as invertibility and commutativity of rule applications on separate heaps, to prune the space of derivations it has to consider. We explain and showcase the use of SSL on characteristic examples, describe the design of SuSLik, and report on our experience of using it to synthesize a series of benchmark programs manipulating heap-based linked data structures.
△ Less
Submitted 8 November, 2018; v1 submitted 18 July, 2018;
originally announced July 2018.
-
EthIR: A Framework for High-Level Analysis of Ethereum Bytecode
Authors:
Elvira Albert,
Pablo Gordillo,
Benjamin Livshits,
Albert Rubio,
Ilya Sergey
Abstract:
Analyzing Ethereum bytecode, rather than the source code from which it was generated, is a necessity when: (1) the source code is not available (e.g., the blockchain only stores the bytecode), (2) the information to be gathered in the analysis is only visible at the level of bytecode (e.g., gas consumption is specified at the level of EVM instructions), (3) the analysis results may be affected by…
▽ More
Analyzing Ethereum bytecode, rather than the source code from which it was generated, is a necessity when: (1) the source code is not available (e.g., the blockchain only stores the bytecode), (2) the information to be gathered in the analysis is only visible at the level of bytecode (e.g., gas consumption is specified at the level of EVM instructions), (3) the analysis results may be affected by optimizations performed by the compiler (thus the analysis should be done ideally after compilation). This paper presents EthIR, a framework for analyzing Ethereum bytecode, which relies on (an extension of) OYENTE, a tool that generates CFGs; EthIR produces from the CFGs, a rule-based representation (RBR) of the bytecode that enables the application of (existing) high-level analyses to infer properties of EVM code.
△ Less
Submitted 10 May, 2018;
originally announced May 2018.
-
Finding The Greedy, Prodigal, and Suicidal Contracts at Scale
Authors:
Ivica Nikolic,
Aashish Kolluri,
Ilya Sergey,
Prateek Saxena,
Aquinas Hobor
Abstract:
Smart contracts---stateful executable objects hosted on blockchains like Ethereum---carry billions of dollars worth of coins and cannot be updated once deployed. We present a new systematic characterization of a class of trace vulnerabilities, which result from analyzing multiple invocations of a contract over its lifetime. We focus attention on three example properties of such trace vulnerabiliti…
▽ More
Smart contracts---stateful executable objects hosted on blockchains like Ethereum---carry billions of dollars worth of coins and cannot be updated once deployed. We present a new systematic characterization of a class of trace vulnerabilities, which result from analyzing multiple invocations of a contract over its lifetime. We focus attention on three example properties of such trace vulnerabilities: finding contracts that either lock funds indefinitely, leak them carelessly to arbitrary users, or can be killed by anyone. We implemented MAIAN, the first tool for precisely specifying and reasoning about trace properties, which employs inter-procedural symbolic analysis and concrete validator for exhibiting real exploits. Our analysis of nearly one million contracts flags 34,200 (2,365 distinct) contracts vulnerable, in 10 seconds per contract. On a subset of3,759 contracts which we sampled for concrete validation and manual analysis, we reproduce real exploits at a true positive rate of 89%, yielding exploits for3,686 contracts. Our tool finds exploits for the infamous Parity bug that indirectly locked 200 million dollars worth in Ether, which previous analyses failed to capture.
△ Less
Submitted 14 March, 2018; v1 submitted 16 February, 2018;
originally announced February 2018.
-
Paxos Consensus, Deconstructed and Abstracted (Extended Version)
Authors:
Álvaro García-Pérez,
Alexey Gotsman,
Yuri Meshman,
Ilya Sergey
Abstract:
Lamport's Paxos algorithm is a classic consensus protocol for state machine replication in environments that admit crash failures. Many versions of Paxos exploit the protocol's intrinsic properties for the sake of gaining better run-time performance, thus widening the gap between the original description of the algorithm, which was proven correct, and its real-world implementations. In this work,…
▽ More
Lamport's Paxos algorithm is a classic consensus protocol for state machine replication in environments that admit crash failures. Many versions of Paxos exploit the protocol's intrinsic properties for the sake of gaining better run-time performance, thus widening the gap between the original description of the algorithm, which was proven correct, and its real-world implementations. In this work, we address the challenge of specifying and verifying complex Paxos-based systems by (a) devising composable specifications for implementations of Paxos's single-decree version, and (b) engineering disciplines to reason about protocol-aware, semantics-preserving optimisations to single-decree Paxos. In a nutshell, our approach elaborates on the deconstruction of single-decree Paxos by Boichat et al. We provide novel non-deterministic specifications for each module in the deconstruction and prove that the implementations refine the corresponding specifications, such that the proofs of the modules that remain unchanged can be reused across different implementations. We further reuse this result and show how to obtain a verified implementation of Multi-Paxos from a verified implementation of single-decree Paxos, by a series of novel protocol-aware transformations of the network semantics, which we prove to be behaviour-preserving.
△ Less
Submitted 16 February, 2018;
originally announced February 2018.
-
Scilla: a Smart Contract Intermediate-Level LAnguage
Authors:
Ilya Sergey,
Amrit Kumar,
Aquinas Hobor
Abstract:
This paper outlines key design principles of Scilla---an intermediate-level language for verified smart contracts.
Scilla provides a clean separation between the communication aspect of smart contracts on a blockchain, allowing for the rich interaction patterns, and a programming component, which enjoys principled semantics and is amenable to formal verification. Scilla is not meant to be a high…
▽ More
This paper outlines key design principles of Scilla---an intermediate-level language for verified smart contracts.
Scilla provides a clean separation between the communication aspect of smart contracts on a blockchain, allowing for the rich interaction patterns, and a programming component, which enjoys principled semantics and is amenable to formal verification. Scilla is not meant to be a high-level programming language, and we are going to use it as a translation target for high-level languages, such as Solidity, for performing program analysis and verification, before further compilation to an executable bytecode.
We describe the automata-based model of Scilla, present its programming component and show how contract definitions in terms of automata streamline the process of mechanised verification of their safety and temporal properties.
△ Less
Submitted 2 January, 2018;
originally announced January 2018.
-
A Concurrent Perspective on Smart Contracts
Authors:
Ilya Sergey,
Aquinas Hobor
Abstract:
In this paper, we explore remarkable similarities between multi-transactional behaviors of smart contracts in cryptocurrencies such as Ethereum and classical problems of shared-memory concurrency. We examine two real-world examples from the Ethereum blockchain and analyzing how they are vulnerable to bugs that are closely reminiscent to those that often occur in traditional concurrent programs. We…
▽ More
In this paper, we explore remarkable similarities between multi-transactional behaviors of smart contracts in cryptocurrencies such as Ethereum and classical problems of shared-memory concurrency. We examine two real-world examples from the Ethereum blockchain and analyzing how they are vulnerable to bugs that are closely reminiscent to those that often occur in traditional concurrent programs. We then elaborate on the relation between observable contract behaviors and well-studied concurrency topics, such as atomicity, interference, synchronization, and resource ownership. The described contracts-as-concurrent-objects analogy provides deeper understanding of potential threats for smart contracts, indicate better engineering practices, and enable applications of existing state-of-the-art formal verification techniques.
△ Less
Submitted 17 February, 2017;
originally announced February 2017.
-
Operational Aspects of C/C++ Concurrency
Authors:
Anton Podkopaev,
Ilya Sergey,
Aleksandar Nanevski
Abstract:
In this work, we present a family of operational semantics that gradually approximates the realistic program behaviors in the C/C++11 memory model. Each semantics in our framework is built by elaborating and combining two simple ingredients: viewfronts and operation buffers. Viewfronts allow us to express the spatial aspect of thread interaction, i.e., which values a thread can read, while operati…
▽ More
In this work, we present a family of operational semantics that gradually approximates the realistic program behaviors in the C/C++11 memory model. Each semantics in our framework is built by elaborating and combining two simple ingredients: viewfronts and operation buffers. Viewfronts allow us to express the spatial aspect of thread interaction, i.e., which values a thread can read, while operation buffers enable manipulation with the temporal execution aspect, i.e., determining the order in which the results of certain operations can be observed by concurrently running threads.
Starting from a simple abstract state machine, through a series of gradual refinements of the abstract state, we capture such language aspects and synchronization primitives as release/acquire atomics, sequentially-consistent and non-atomic memory accesses, also providing a semantics for relaxed atomics, while avoiding the Out-of-Thin-Air problem. To the best of our knowledge, this is the first formal and executable operational semantics of C11 capable of expressing all essential concurrent aspects of the standard.
We illustrate our approach via a number of characteristic examples, relating the observed behaviors to those of standard litmus test programs from the literature. We provide an executable implementation of the semantics in PLT Redex, along with a number of implemented litmus tests and examples, and showcase our prototype on a large case study: randomized testing and debugging of a realistic Read-Copy-Update data structure.
△ Less
Submitted 9 July, 2016; v1 submitted 4 June, 2016;
originally announced June 2016.
-
Concurrent Data Structures Linked in Time
Authors:
Germán Andrés Delbianco,
Ilya Sergey,
Aleksandar Nanevski,
Anindya Banerjee
Abstract:
Arguments about correctness of a concurrent data structure are typically carried out by using the notion of linearizability and specifying the linearization points of the data structure's procedures. Such arguments are often cumbersome as the linearization points' position in time can be dynamic (depend on the interference, run-time values and events from the past, or even future), non-local (appe…
▽ More
Arguments about correctness of a concurrent data structure are typically carried out by using the notion of linearizability and specifying the linearization points of the data structure's procedures. Such arguments are often cumbersome as the linearization points' position in time can be dynamic (depend on the interference, run-time values and events from the past, or even future), non-local (appear in procedures other than the one considered), and whose position in the execution trace may only be determined after the considered procedure has already terminated.
In this paper we propose a new method, based on a separation-style logic, for reasoning about concurrent objects with such linearization points. We embrace the dynamic nature of linearization points, and encode it as part of the data structure's auxiliary state, so that it can be dynamically modified in place by auxiliary code, as needed when some appropriate run-time event occurs. We name the idea linking-in-time, because it reduces temporal reasoning to spatial reasoning. For example, modifying a temporal position of a linearization point can be modeled similarly to a pointer update in separation logic. Furthermore, the auxiliary state provides a convenient way to concisely express the properties essential for reasoning about clients of such concurrent objects. We illustrate the method by verifying (mechanically in Coq) an intricate optimal snapshot algorithm due to Jayanti, as well as some clients.
△ Less
Submitted 18 January, 2017; v1 submitted 27 April, 2016;
originally announced April 2016.
-
Hoare-style Specifications as Correctness Conditions for Non-linearizable Concurrent Objects
Authors:
Ilya Sergey,
Aleksandar Nanevski,
Anindya Banerjee,
German Andres Delbianco
Abstract:
Designing scalable concurrent objects, which can be efficiently used on multicore processors, often requires one to abandon standard specification techniques, such as linearizability, in favor of more relaxed consistency requirements. However, the variety of alternative correctness conditions makes it difficult to choose which one to employ in a particular case, and to compose them when using obje…
▽ More
Designing scalable concurrent objects, which can be efficiently used on multicore processors, often requires one to abandon standard specification techniques, such as linearizability, in favor of more relaxed consistency requirements. However, the variety of alternative correctness conditions makes it difficult to choose which one to employ in a particular case, and to compose them when using objects whose behaviors are specified via different criteria. The lack of syntactic verification methods for most of these criteria poses challenges in their systematic adoption and application.
In this paper, we argue for using Hoare-style program logics as an alternative and uniform approach for specification and compositional formal verification of safety properties for concurrent objects and their client programs. Through a series of case studies, we demonstrate how an existing program logic for concurrency can be employed off-the-shelf to capture important state and history invariants, allowing one to explicitly quantify over interference of environment threads and provide intuitive and expressive Hoare-style specifications for several non-linearizable concurrent objects that were previously specified only via dedicated correctness criteria. We illustrate the adequacy of our specifications by verifying a number of concurrent client scenarios, that make use of the previously specified concurrent objects, capturing the essence of such correctness conditions as concurrency-aware linearizability, quiescent, and quantitative quiescent consistency. All examples described in this paper are verified mechanically in Coq.
△ Less
Submitted 21 July, 2016; v1 submitted 21 September, 2015;
originally announced September 2015.
-
Specifying and Verifying Concurrent Algorithms with Histories and Subjectivity
Authors:
Ilya Sergey,
Aleksandar Nanevski,
Anindya Banerjee
Abstract:
We present a lightweight approach to Hoare-style specifications for fine-grained concurrency, based on a notion of time-stamped histories that abstractly capture atomic changes in the program state. Our key observation is that histories form a partial commutative monoid, a structure fundamental for representation of concurrent resources. This insight provides us with a unifying mechanism that allo…
▽ More
We present a lightweight approach to Hoare-style specifications for fine-grained concurrency, based on a notion of time-stamped histories that abstractly capture atomic changes in the program state. Our key observation is that histories form a partial commutative monoid, a structure fundamental for representation of concurrent resources. This insight provides us with a unifying mechanism that allows us to treat histories just like heaps in separation logic. For example, both are subject to the same assertion logic and inference rules (e.g., the frame rule). Moreover, the notion of ownership transfer, which usually applies to heaps, has an equivalent in histories. It can be used to formally represent hel**---an important design pattern for concurrent algorithms whereby one thread can execute code on behalf of another. Specifications in terms of histories naturally abstract granularity, in the sense that sophisticated fine-grained algorithms can be given the same specifications as their simplified coarse-grained counterparts, making them equally convenient for client-side reasoning. We illustrate our approach on a number of examples and validate all of them in Coq.
△ Less
Submitted 1 October, 2014;
originally announced October 2014.
-
Pushdown flow analysis with abstract garbage collection
Authors:
J. Ian Johnson,
Ilya Sergey,
Christopher Earl,
Matthew Might,
David Van Horn
Abstract:
In the static analysis of functional programs, pushdown flow analysis and abstract garbage collection push the boundaries of what we can learn about programs statically. This work illuminates and poses solutions to theoretical and practical challenges that stand in the way of combining the power of these techniques. Pushdown flow analysis grants unbounded yet computable polyvariance to the analysi…
▽ More
In the static analysis of functional programs, pushdown flow analysis and abstract garbage collection push the boundaries of what we can learn about programs statically. This work illuminates and poses solutions to theoretical and practical challenges that stand in the way of combining the power of these techniques. Pushdown flow analysis grants unbounded yet computable polyvariance to the analysis of return-flow in higher-order programs. Abstract garbage collection grants unbounded polyvariance to abstract addresses which become unreachable between invocations of the abstract contexts in which they were created. Pushdown analysis solves the problem of precisely analyzing recursion in higher-order languages; abstract garbage collection is essential in solving the "stickiness" problem. Alone, our benchmarks demonstrate that each method can reduce analysis times and boost precision by orders of magnitude. We combine these methods. The challenge in marrying these techniques is not subtle: computing the reachable control states of a pushdown system relies on limiting access during transition to the top of the stack; abstract garbage collection, on the other hand, needs full access to the entire stack to compute a root set, just as concrete collection does. Conditional pushdown systems were developed for just such a conundrum, but existing methods are ill-suited for the dynamic nature of garbage collection. We show fully precise and approximate solutions to the feasible paths problem for pushdown garbage-collecting control-flow analysis. Experiments reveal synergistic interplay between garbage collection and pushdown techniques, and the fusion demonstrates "better-than-both-worlds" precision.
△ Less
Submitted 19 June, 2014;
originally announced June 2014.
-
Introspective Pushdown Analysis of Higher-Order Programs
Authors:
Christopher Earl,
Ilya Sergey,
Matthew Might,
David Van Horn
Abstract:
In the static analysis of functional programs, pushdown flow analysis and abstract garbage collection skirt just inside the boundaries of soundness and decidability. Alone, each method reduces analysis times and boosts precision by orders of magnitude. This work illuminates and conquers the theoretical challenges that stand in the way of combining the power of these techniques. The challenge in ma…
▽ More
In the static analysis of functional programs, pushdown flow analysis and abstract garbage collection skirt just inside the boundaries of soundness and decidability. Alone, each method reduces analysis times and boosts precision by orders of magnitude. This work illuminates and conquers the theoretical challenges that stand in the way of combining the power of these techniques. The challenge in marrying these techniques is not subtle: computing the reachable control states of a pushdown system relies on limiting access during transition to the top of the stack; abstract garbage collection, on the other hand, needs full access to the entire stack to compute a root set, just as concrete collection does. \emph{Introspective} pushdown systems resolve this conflict. Introspective pushdown systems provide enough access to the stack to allow abstract garbage collection, but they remain restricted enough to compute control-state reachability, thereby enabling the sound and precise product of pushdown analysis and abstract garbage collection. Experiments reveal synergistic interplay between the techniques, and the fusion demonstrates "better-than-both-worlds" precision.
△ Less
Submitted 7 July, 2012;
originally announced July 2012.