-
Domain Reasoning in TopKAT
Authors:
Cheng Zhang,
Arthur Azevedo de Amorim,
Marco Gaboardi
Abstract:
TopKAT is the algebraic theory of Kleene algebra with tests (KAT) extended with a top element. Compared to KAT, one pleasant feature of TopKAT is that, in relational models, the top element allows us to express the domain and codomain of a relation. This enables several applications in program logics, such as proving under-approximate specifications or reachability properties of imperative program…
▽ More
TopKAT is the algebraic theory of Kleene algebra with tests (KAT) extended with a top element. Compared to KAT, one pleasant feature of TopKAT is that, in relational models, the top element allows us to express the domain and codomain of a relation. This enables several applications in program logics, such as proving under-approximate specifications or reachability properties of imperative programs. However, while TopKAT inherits many pleasant features of KATs, such as having a decidable equational theory, it is incomplete with respect to relational models. In other words, there are properties that hold true of all relational TopKATs but cannot be proved with the axioms of TopKAT. This issue is potentially worrisome for program-logic applications, in which relational models play a key role.
In this paper, we further investigate the completeness properties of TopKAT with respect to relational models. We show that TopKAT is complete with respect to (co)domain comparison of KAT terms, but incomplete when comparing the (co)domain of arbitrary TopKAT terms. Since the encoding of under-approximate specifications in TopKAT hinges on this type of formula, the aforementioned incompleteness results have a limited impact when using TopKAT to reason about such specifications.
△ Less
Submitted 29 April, 2024;
originally announced April 2024.
-
SECOMP: Formally Secure Compilation of Compartmentalized C Programs
Authors:
Jérémy Thibault,
Roberto Blanco,
Dongjae Lee,
Sven Argo,
Arthur Azevedo de Amorim,
Aïna Linn Georges,
Catalin Hritcu,
Andrew Tolmach
Abstract:
Undefined behavior in C often causes devastating security vulnerabilities. One practical mitigation is compartmentalization, which allows developers to structure large programs into mutually distrustful compartments with clearly specified privileges and interactions. In this paper we introduce SECOMP, a compiler for compartmentalized C code that comes with machine-checked proofs guaranteeing that…
▽ More
Undefined behavior in C often causes devastating security vulnerabilities. One practical mitigation is compartmentalization, which allows developers to structure large programs into mutually distrustful compartments with clearly specified privileges and interactions. In this paper we introduce SECOMP, a compiler for compartmentalized C code that comes with machine-checked proofs guaranteeing that the scope of undefined behavior is restricted to the compartments that encounter it and become dynamically compromised. These guarantees are formalized as the preservation of safety properties against adversarial contexts, a secure compilation criterion similar to full abstraction, and this is the first time such a strong criterion is proven for a mainstream programming language. To achieve this we extend the languages of the CompCert verified C compiler with isolated compartments that can only interact via procedure calls and returns, as specified by cross-compartment interfaces. We adapt the passes and optimizations of CompCert as well as their correctness proofs to this compartment-aware setting. We then use compiler correctness as an ingredient in a larger secure compilation proof that involves several proof engineering novelties, needed to scale formally secure compilation up to a C compiler.
△ Less
Submitted 1 July, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Pipelines and Beyond: Graph Types for ADTs with Futures
Authors:
Francis Rinaldi,
june wunder,
Arthur Aevedo De Amorim,
Stefan K. Muller
Abstract:
Parallel programs are frequently modeled as dependency or cost graphs, which can be used to detect various bugs, or simply to visualize the parallel structure of the code. However, such graphs reflect just one particular execution and are typically constructed in a post-hoc manner. Graph types, which were introduced recently to mitigate this problem, can be assigned statically to a program by a ty…
▽ More
Parallel programs are frequently modeled as dependency or cost graphs, which can be used to detect various bugs, or simply to visualize the parallel structure of the code. However, such graphs reflect just one particular execution and are typically constructed in a post-hoc manner. Graph types, which were introduced recently to mitigate this problem, can be assigned statically to a program by a type system and compactly represent the family of all graphs that could result from the program. Unfortunately, prior work is restricted in its treatment of futures, an increasingly common and especially dynamic form of parallelism. In short, each instance of a future must be statically paired with a vertex name. Previously, this led to the restriction that futures could not be placed in collections or be used to construct data structures. Doing so is not a niche exercise: such structures form the basis of numerous algorithms that use forms of pipelining to achieve performance not attainable without futures. All but the most limited of these examples are out of reach of prior graph type systems. In this paper, we propose a graph type system that allows for almost arbitrary combinations of futures and recursive data types. We do so by indexing datatypes with a type-level vertex structure, a codata structure that supplies unique vertex names to the futures in a data structure. We prove the soundness of the system in a parallel core calculus annotated with vertex structures and associated operations. Although the calculus is annotated, this is merely for convenience in defining the type system. We prove that it is possible to annotate arbitrary recursive types with vertex structures, and show using a prototype inference engine that these annotations can be inferred from OCaml-like source code for several complex parallel algorithms.
△ Less
Submitted 12 November, 2023;
originally announced November 2023.
-
On Pitts' Relational Properties of Domains
Authors:
Arthur Azevedo de Amorim
Abstract:
Andrew Pitts' framework of relational properties of domains is a powerful method for defining predicates or relations on domains, with applications ranging from reasoning principles for program equivalence to proofs of adequacy connecting denotational and operational semantics. Its main appeal is handling recursive definitions that are not obviously well-founded: as long as the corresponding domai…
▽ More
Andrew Pitts' framework of relational properties of domains is a powerful method for defining predicates or relations on domains, with applications ranging from reasoning principles for program equivalence to proofs of adequacy connecting denotational and operational semantics. Its main appeal is handling recursive definitions that are not obviously well-founded: as long as the corresponding domain is also defined recursively, and its recursion pattern lines up appropriately with the definition of the relations, the framework can guarantee their existence. Pitts' original development used the Knaster-Tarski fixed-point theorem as a key ingredient. In these notes, I show how his construction can be seen as an instance of other key fixed-point theorems: the inverse limit construction, the Banach fixed-point theorem and the Kleene fixed-point theorem. The connection underscores how Pitts' construction is intimately tied to the methods for constructing the base recursive domains themselves, and also to techniques based on guarded recursion, or step-indexing, that have become popular in the last two decades.
△ Less
Submitted 14 July, 2022;
originally announced July 2022.
-
Bunched Fuzz: Sensitivity for Vector Metrics
Authors:
june wunder,
Arthur Azevedo de Amorim,
Patrick Baillot,
Marco Gaboardi
Abstract:
Program sensitivity measures the distance between the outputs of a program when run on two related inputs. This notion, which plays a key role in areas such as data privacy and optimization, has been the focus of several program analysis techniques introduced in recent years. Among the most successful ones, we can highlight type systems inspired by linear logic, as pioneered by Reed and Pierce in…
▽ More
Program sensitivity measures the distance between the outputs of a program when run on two related inputs. This notion, which plays a key role in areas such as data privacy and optimization, has been the focus of several program analysis techniques introduced in recent years. Among the most successful ones, we can highlight type systems inspired by linear logic, as pioneered by Reed and Pierce in the Fuzz programming language. In Fuzz, each type is equipped with its own distance, and sensitivity analysis boils down to type checking. In particular, Fuzz features two product types, corresponding to two different notions of distance: the tensor product combines the distances of each component by adding them, while the with product takes their maximum.
In this work, we show that these products can be generalized to arbitrary $L^p$ distances, metrics that are often used in privacy and optimization. The original Fuzz products, tensor and with, correspond to the special cases $L^1$ and $L^\infty$. To ease the handling of such products, we extend the Fuzz type system with bunches -- as in the logic of bunched implications -- where the distances of different groups of variables can be combined using different $L^p$ distances. We show that our extension can be used to reason about quantitative properties of probabilistic programs.
△ Less
Submitted 1 February, 2023; v1 submitted 3 February, 2022;
originally announced February 2022.
-
On Incorrectness Logic and Kleene Algebra with Top and Tests
Authors:
Cheng Zhang,
Arthur Azevedo de Amorim,
Marco Gaboardi
Abstract:
Kleene algebra with tests (KAT) is a foundational equational framework for reasoning about programs, which has found applications in program transformations, networking and compiler optimizations, among many other areas. In his seminal work, Kozen proved that KAT subsumes propositional Hoare logic, showing that one can reason about the (partial) correctness of while programs by means of the equati…
▽ More
Kleene algebra with tests (KAT) is a foundational equational framework for reasoning about programs, which has found applications in program transformations, networking and compiler optimizations, among many other areas. In his seminal work, Kozen proved that KAT subsumes propositional Hoare logic, showing that one can reason about the (partial) correctness of while programs by means of the equational theory of KAT. In this work, we investigate the support that KAT provides for reasoning about incorrectness, instead, as embodied by Ohearn's recently proposed incorrectness logic. We show that KAT cannot directly express incorrectness logic. The main reason for this limitation can be traced to the fact that KAT cannot express explicitly the notion of codomain, which is essential to express incorrectness triples. To address this issue, we study Kleene Algebra with Top and Tests (TopKAT), an extension of KAT with a top element. We show that TopKAT is powerful enough to express a codomain operation, to express incorrectness triples, and to prove all the rules of incorrectness logic sound. This shows that one can reason about the incorrectness of while-like programs by means of the equational theory of TopKAT.
△ Less
Submitted 4 August, 2022; v1 submitted 17 August, 2021;
originally announced August 2021.
-
Probabilistic Relational Reasoning via Metrics
Authors:
Arthur Azevedo de Amorim,
Marco Gaboardi,
Justin Hsu,
Shin-ya Katsumata
Abstract:
The Fuzz programming language [Reed and Pierce, 2010] uses an elegant linear type system combined with a monad-like type to express and reason about probabilistic sensitivity properties, most notably $ε$-differential privacy. We show how to extend Fuzz to capture more general relational properties of probabilistic programs, with approximate, or $(ε, δ)$-differential privacy serving as a leading ex…
▽ More
The Fuzz programming language [Reed and Pierce, 2010] uses an elegant linear type system combined with a monad-like type to express and reason about probabilistic sensitivity properties, most notably $ε$-differential privacy. We show how to extend Fuzz to capture more general relational properties of probabilistic programs, with approximate, or $(ε, δ)$-differential privacy serving as a leading example. Our technical contributions are threefold. First, we introduce the categorical notion of comonadic lifting of a monad to model composition properties of probabilistic divergences. Then, we show how to express relational properties in terms of sensitivity properties via an adjunction we call the path construction. Finally, we instantiate our semantics to model the terminating fragment of Fuzz extended with types carrying information about other divergences between distributions.
△ Less
Submitted 18 April, 2019; v1 submitted 13 July, 2018;
originally announced July 2018.
-
When Good Components Go Bad: Formally Secure Compilation Despite Dynamic Compromise
Authors:
Carmine Abate,
Arthur Azevedo de Amorim,
Roberto Blanco,
Ana Nora Evans,
Guglielmo Fachini,
Catalin Hritcu,
Théo Laurent,
Benjamin C. Pierce,
Marco Stronati,
Jérémy Thibault,
Andrew Tolmach
Abstract:
We propose a new formal criterion for evaluating secure compilation schemes for unsafe languages, expressing end-to-end security guarantees for software components that may become compromised after encountering undefined behavior---for example, by accessing an array out of bounds.
Our criterion is the first to model dynamic compromise in a system of mutually distrustful components with clearly s…
▽ More
We propose a new formal criterion for evaluating secure compilation schemes for unsafe languages, expressing end-to-end security guarantees for software components that may become compromised after encountering undefined behavior---for example, by accessing an array out of bounds.
Our criterion is the first to model dynamic compromise in a system of mutually distrustful components with clearly specified privileges. It articulates how each component should be protected from all the others---in particular, from components that have encountered undefined behavior and become compromised. Each component receives secure compilation guarantees---in particular, its internal invariants are protected from compromised components---up to the point when this component itself becomes compromised, after which we assume an attacker can take complete control and use this component's privileges to attack other components. More precisely, a secure compilation chain must ensure that a dynamically compromised component cannot break the safety properties of the system at the target level any more than an arbitrary attacker-controlled component (with the same interface and privileges, but without undefined behaviors) already could at the source level.
To illustrate the model, we construct a secure compilation chain for a small unsafe language with buffers, procedures, and components, targeting a simple abstract machine with built-in compartmentalization. We give a machine-checked proof in Coq that this compiler satisfies our secure compilation criterion. Finally, we show that the protection guarantees offered by the compartmentalized abstract machine can be achieved at the machine-code level using either software fault isolation or a tag-based reference monitor.
△ Less
Submitted 29 November, 2019; v1 submitted 2 February, 2018;
originally announced February 2018.
-
Formally Secure Compilation of Unsafe Low-Level Components (Extended Abstract)
Authors:
Guglielmo Fachini,
Catalin Hritcu,
Marco Stronati,
Ana Nora Evans,
Théo Laurent,
Arthur Azevedo de Amorim,
Benjamin C. Pierce,
Andrew Tolmach
Abstract:
We propose a new formal criterion for secure compilation, providing strong security guarantees for components written in unsafe, low-level languages with C-style undefined behavior. Our criterion goes beyond recent proposals, which protect the trace properties of a single component against an adversarial context, to model dynamic compromise in a system of mutually distrustful components. Each comp…
▽ More
We propose a new formal criterion for secure compilation, providing strong security guarantees for components written in unsafe, low-level languages with C-style undefined behavior. Our criterion goes beyond recent proposals, which protect the trace properties of a single component against an adversarial context, to model dynamic compromise in a system of mutually distrustful components. Each component is protected from all the others until it receives an input that triggers an undefined behavior, causing it to become compromised and attack the remaining uncompromised components. To illustrate this model, we demonstrate a secure compilation chain for an unsafe language with buffers, procedures, and components, compiled to a simple RISC abstract machine with built-in compartmentalization. The protection guarantees offered by this abstract machine can be achieved at the machine-code level using either software fault isolation or tag-based reference monitoring. We are working on machine-checked proofs showing that this compiler satisfies our secure compilation criterion.
△ Less
Submitted 31 October, 2017; v1 submitted 19 October, 2017;
originally announced October 2017.
-
The Meaning of Memory Safety
Authors:
Arthur Azevedo de Amorim,
Catalin Hritcu,
Benjamin C. Pierce
Abstract:
We give a rigorous characterization of what it means for a programming language to be memory safe, capturing the intuition that memory safety supports local reasoning about state. We formalize this principle in two ways. First, we show how a small memory-safe language validates a noninterference property: a program can neither affect nor be affected by unreachable parts of the state. Second, we ex…
▽ More
We give a rigorous characterization of what it means for a programming language to be memory safe, capturing the intuition that memory safety supports local reasoning about state. We formalize this principle in two ways. First, we show how a small memory-safe language validates a noninterference property: a program can neither affect nor be affected by unreachable parts of the state. Second, we extend separation logic, a proof system for heap-manipulating programs, with a memory-safe variant of its frame rule. The new rule is stronger because it applies even when parts of the program are buggy or malicious, but also weaker because it demands a stricter form of separation between parts of the program state. We also consider a number of pragmatically motivated variations on memory safety and the reasoning principles they support. As an application of our characterization, we evaluate the security of a previously proposed dynamic monitor for memory safety of heap-allocated data.
△ Less
Submitted 6 April, 2018; v1 submitted 20 May, 2017;
originally announced May 2017.
-
A Semantic Account of Metric Preservation
Authors:
Arthur Azevedo de Amorim,
Marco Gaboardi,
Justin Hsu,
Shin-ya Katsumata,
Ikram Cherigui
Abstract:
Program sensitivity measures how robust a program is to small changes in its input, and is a fundamental notion in domains ranging from differential privacy to cyber-physical systems. A natural way to formalize program sensitivity is in terms of metrics on the input and output spaces, requiring that an $r$-sensitive function map inputs that are at distance $d$ to outputs that are at distance at mo…
▽ More
Program sensitivity measures how robust a program is to small changes in its input, and is a fundamental notion in domains ranging from differential privacy to cyber-physical systems. A natural way to formalize program sensitivity is in terms of metrics on the input and output spaces, requiring that an $r$-sensitive function map inputs that are at distance $d$ to outputs that are at distance at most $r \cdot d$. Program sensitivity is thus an analogue of Lipschitz continuity for programs.
Reed and Pierce introduced Fuzz, a functional language with a linear type system that can express program sensitivity. They show soundness operationally, in the form of a metric preservation property. Inspired by their work, we study program sensitivity and metric preservation from a denotational point of view. In particular, we introduce metric CPOs, a novel semantic structure for reasoning about computation on metric spaces, by endowing CPOs with a compatible notion of distance. This structure is useful for reasoning about metric properties of programs, and specifically about program sensitivity. We demonstrate metric CPOs by giving a model for the deterministic fragment of Fuzz.
△ Less
Submitted 23 October, 2022; v1 submitted 1 February, 2017;
originally announced February 2017.
-
Beyond Good and Evil: Formalizing the Security Guarantees of Compartmentalizing Compilation
Authors:
Yannis Juglaret,
Catalin Hritcu,
Arthur Azevedo de Amorim,
Boris Eng,
Benjamin C. Pierce
Abstract:
Compartmentalization is good security-engineering practice. By breaking a large software system into mutually distrustful components that run with minimal privileges, restricting their interactions to conform to well-defined interfaces, we can limit the damage caused by low-level attacks such as control-flow hijacking. When used to defend against such attacks, compartmentalization is often impleme…
▽ More
Compartmentalization is good security-engineering practice. By breaking a large software system into mutually distrustful components that run with minimal privileges, restricting their interactions to conform to well-defined interfaces, we can limit the damage caused by low-level attacks such as control-flow hijacking. When used to defend against such attacks, compartmentalization is often implemented cooperatively by a compiler and a low-level compartmentalization mechanism. However, the formal guarantees provided by such compartmentalizing compilation have seen surprisingly little investigation.
We propose a new security property, secure compartmentalizing compilation (SCC), that formally characterizes the guarantees provided by compartmentalizing compilation and clarifies its attacker model. We reconstruct our property by starting from the well-established notion of fully abstract compilation, then identifying and lifting three important limitations that make standard full abstraction unsuitable for compartmentalization. The connection to full abstraction allows us to prove SCC by adapting established proof techniques; we illustrate this with a compiler from a simple unsafe imperative language with procedures to a compartmentalized abstract machine.
△ Less
Submitted 15 April, 2017; v1 submitted 14 February, 2016;
originally announced February 2016.
-
Towards a Fully Abstract Compiler Using Micro-Policies: Secure Compilation for Mutually Distrustful Components
Authors:
Yannis Juglaret,
Catalin Hritcu,
Arthur Azevedo de Amorim,
Benjamin C. Pierce,
Antal Spector-Zabusky,
Andrew Tolmach
Abstract:
Secure compilation prevents all low-level attacks on compiled code and allows for sound reasoning about security in the source language. In this work we propose a new attacker model for secure compilation that extends the well-known notion of full abstraction to ensure protection for mutually distrustful components. We devise a compiler chain (compiler, linker, and loader) and a novel security mon…
▽ More
Secure compilation prevents all low-level attacks on compiled code and allows for sound reasoning about security in the source language. In this work we propose a new attacker model for secure compilation that extends the well-known notion of full abstraction to ensure protection for mutually distrustful components. We devise a compiler chain (compiler, linker, and loader) and a novel security monitor that together defend against this strong attacker model. The monitor is implemented using a recently proposed, generic tag-based protection framework called micro-policies, which comes with hardware support for efficient caching and with a formal verification methodology. Our monitor protects the abstractions of a simple object-oriented language---class isolation, the method call discipline, and type safety---against arbitrary low-level attackers.
△ Less
Submitted 2 October, 2015;
originally announced October 2015.
-
A Verified Information-Flow Architecture
Authors:
Arthur Azevedo de Amorim,
Nathan Collins,
André DeHon,
Delphine Demange,
Catalin Hritcu,
David Pichardie,
Benjamin C. Pierce,
Randy Pollack,
Andrew Tolmach
Abstract:
SAFE is a clean-slate design for a highly secure computer system, with pervasive mechanisms for tracking and limiting information flows. At the lowest level, the SAFE hardware supports fine-grained programmable tags, with efficient and flexible propagation and combination of tags as instructions are executed. The operating system virtualizes these generic facilities to present an information-flow…
▽ More
SAFE is a clean-slate design for a highly secure computer system, with pervasive mechanisms for tracking and limiting information flows. At the lowest level, the SAFE hardware supports fine-grained programmable tags, with efficient and flexible propagation and combination of tags as instructions are executed. The operating system virtualizes these generic facilities to present an information-flow abstract machine that allows user programs to label sensitive data with rich confidentiality policies. We present a formal, machine-checked model of the key hardware and software mechanisms used to dynamically control information flow in SAFE and an end-to-end proof of noninterference for this model.
We use a refinement proof methodology to propagate the noninterference property of the abstract machine down to the concrete machine level. We use an intermediate layer in the refinement chain that factors out the details of the information-flow control policy and devise a code generator for compiling such information-flow policies into low-level monitor code. Finally, we verify the correctness of this generator using a dedicated Hoare logic that abstracts from low-level machine instructions into a reusable set of verified structured code generators.
△ Less
Submitted 6 March, 2016; v1 submitted 22 September, 2015;
originally announced September 2015.
-
Really Natural Linear Indexed Type Checking
Authors:
Arthur Azevedo de Amorim,
Emilio Jesús Gallego Arias,
Marco Gaboardi,
Justin Hsu
Abstract:
Recent works have shown the power of linear indexed type systems for enforcing complex program properties. These systems combine linear types with a language of type-level indices, allowing more fine-grained analyses. Such systems have been fruitfully applied in diverse domains, including implicit complexity and differential privacy. A natural way to enhance the expressiveness of this approach is…
▽ More
Recent works have shown the power of linear indexed type systems for enforcing complex program properties. These systems combine linear types with a language of type-level indices, allowing more fine-grained analyses. Such systems have been fruitfully applied in diverse domains, including implicit complexity and differential privacy. A natural way to enhance the expressiveness of this approach is by allowing the indices to depend on runtime information, in the spirit of dependent types. This approach is used in DFuzz, a language for differential privacy. The DFuzz type system relies on an index language supporting real and natural number arithmetic over constants and variables. Moreover, DFuzz uses a subty** mechanism to make types more flexible. By themselves, linearity, dependency, and subty** each require delicate handling when performing type checking or type inference; their combination increases this challenge substantially, as the features can interact in non-trivial ways. In this paper, we study the type-checking problem for DFuzz. We show how we can reduce type checking for (a simple extension of) DFuzz to constraint solving over a first-order theory of naturals and real numbers which, although undecidable, can often be handled in practice by standard numeric solvers.
△ Less
Submitted 16 March, 2015;
originally announced March 2015.
-
Testing Noninterference, Quickly
Authors:
Catalin Hritcu,
Leonidas Lampropoulos,
Antal Spector-Zabusky,
Arthur Azevedo de Amorim,
Maxime Dénès,
John Hughes,
Benjamin C. Pierce,
Dimitrios Vytiniotis
Abstract:
Information-flow control mechanisms are difficult both to design and to prove correct. To reduce the time wasted on doomed proof attempts due to broken definitions, we advocate modern random testing techniques for finding counterexamples during the design process. We show how to use QuickCheck, a property-based random-testing tool, to guide the design of increasingly complex information-flow abstr…
▽ More
Information-flow control mechanisms are difficult both to design and to prove correct. To reduce the time wasted on doomed proof attempts due to broken definitions, we advocate modern random testing techniques for finding counterexamples during the design process. We show how to use QuickCheck, a property-based random-testing tool, to guide the design of increasingly complex information-flow abstract machines, leading up to a sophisticated register machine with a novel and highly permissive flow-sensitive dynamic enforcement mechanism that is sound in the presence of first-class public labels. We find that both sophisticated strategies for generating well-distributed random programs and readily falsifiable formulations of noninterference properties are critically important for efficient testing. We propose several approaches and evaluate their effectiveness on a collection of injected bugs of varying subtlety. We also present an effective technique for shrinking large counterexamples to minimal, easily comprehensible ones. Taken together, our best methods enable us to quickly and automatically generate simple counterexamples for more than 45 bugs. Moreover, we show how testing guides the discovery of the sophisticated invariants needed for the noninterference proof of our most complex machine.
△ Less
Submitted 25 July, 2015; v1 submitted 1 September, 2014;
originally announced September 2014.