Search | arXiv e-print repository

Wasm SpecTec: Engineering a Formal Language Standard

Authors: Joachim Breitner, Philippa Gardner, Jaehyun Lee, Sam Lindley, Matija Pretnar, Xiaojia Rao, Andreas Rossberg, Sukyoung Ryu, Wonho Shin, Conrad Watt, Dongjun Youn

Abstract: WebAssembly (Wasm) is a low-level bytecode language and virtual machine, intended as a compilation target for a wide range of programming languages, which is seeing increasing adoption across diverse ecosystems. As a young technology, Wasm continues to evolve -- it reached version 2.0 last year and another major update is expected soon. For a new feature to be standardised in Wasm, four key arte… ▽ More WebAssembly (Wasm) is a low-level bytecode language and virtual machine, intended as a compilation target for a wide range of programming languages, which is seeing increasing adoption across diverse ecosystems. As a young technology, Wasm continues to evolve -- it reached version 2.0 last year and another major update is expected soon. For a new feature to be standardised in Wasm, four key artefacts must be presented: a formal (mathematical) specification of the feature, an accompanying prose pseudocode description, an implementation in the official reference interpreter, and a suite of unit tests. This rigorous process helps to avoid errors in the design and implementation of new Wasm features, and Wasm's distinctive formal specification in particular has facilitated machine-checked proofs of various correctness properties for the language. However, manually crafting all of these artefacts requires expert knowledge combined with repetitive and tedious labor, which is a burden on the language's standardization process and authoring of the specification. This paper presents Wasm SpecTec, a technology to express the formal specification of Wasm through a domain-specific language. This DSL allows all of Wasm's currently handwritten specification artefacts to be error-checked and generated automatically from a single source of truth, and is designed to be easy to write, read, compare, and review. We believe that Wasm SpecTec's automation and meta-level error checking will significantly ease the current burden of the language's specification authors. We demonstrate the current capabilities of Wasm SpecTec by showcasing its proficiency in generating various artefacts, and describe our work towards replacing the manually written official Wasm specification document with specifications generated by Wasm SpecTec. △ Less

Submitted 13 November, 2023; originally announced November 2023.

Comments: 5 pages, 7 figures

arXiv:1910.11724 [pdf]

Embracing a mechanized formalization gap

Authors: Antal Spector-Zabusky, Joachim Breitner, Yao Li, Stephanie Weirich

Abstract: If a code base is so big and complicated that complete mechanical verification is intractable, can we still apply and benefit from verification methods? We show that by allowing a deliberate mechanized formalization gap we can shrink and simplify the model until it is manageable, while still retaining a meaningful, declaratively documented connection to the original, unmodified source code. Concre… ▽ More If a code base is so big and complicated that complete mechanical verification is intractable, can we still apply and benefit from verification methods? We show that by allowing a deliberate mechanized formalization gap we can shrink and simplify the model until it is manageable, while still retaining a meaningful, declaratively documented connection to the original, unmodified source code. Concretely, we translate core parts of the Haskell compiler GHC into Coq, using hs-to-coq, and verify invariants related to the use of term variables. △ Less

Submitted 25 October, 2019; originally announced October 2019.

Comments: Submitted to CPP'20

arXiv:1806.03541 [pdf, other]

Functional Pearl: Theorem Proving for All (Equational Reasoning in Liquid Haskell)

Authors: Niki Vazou, Joachim Breitner, Will Kunkel, David Van Horn, Graham Hutton

Abstract: Equational reasoning is one of the key features of pure functional languages such as Haskell. To date, however, such reasoning always took place externally to Haskell, either manually on paper, or mechanised in a theorem prover. This article shows how equational reasoning can be performed directly and seamlessly within Haskell itself, and be checked using Liquid Haskell. In particular, language le… ▽ More Equational reasoning is one of the key features of pure functional languages such as Haskell. To date, however, such reasoning always took place externally to Haskell, either manually on paper, or mechanised in a theorem prover. This article shows how equational reasoning can be performed directly and seamlessly within Haskell itself, and be checked using Liquid Haskell. In particular, language learners --- to whom external theorem provers are out of reach --- can benefit from having their proofs mechanically checked. Concretely, we show how the equational proofs and derivations from Graham's textbook can be recast as proofs in Haskell (spoiler: they look essentially the same). △ Less

Submitted 9 June, 2018; originally announced June 2018.

Comments: Submitted to Haskell'18

arXiv:1806.03476 [pdf, other]

Type variables in patterns

Authors: Richard A. Eisenberg, Joachim Breitner, Simon Peyton Jones

Abstract: For many years, GHC has implemented an extension to Haskell that allows type variables to be bound in type signatures and patterns, and to scope over terms. This extension was never properly specified. We rectify that oversight here. With the formal specification in hand, the otherwise-labyrinthine path toward a design for binding type variables in patterns becomes blindingly clear. We thus extend… ▽ More For many years, GHC has implemented an extension to Haskell that allows type variables to be bound in type signatures and patterns, and to scope over terms. This extension was never properly specified. We rectify that oversight here. With the formal specification in hand, the otherwise-labyrinthine path toward a design for binding type variables in patterns becomes blindingly clear. We thus extend ScopedTypeVariables to bind type variables explicitly, obviating the Proxy workaround to the dustbin of history. △ Less

Submitted 9 June, 2018; originally announced June 2018.

Comments: Submitted to Haskell'18

arXiv:1805.08106 [pdf, other]

The sufficiently smart compiler is a theorem prover

Authors: Joachim Breitner

Abstract: That the Haskell Compiler GHC is capable of proving non-trivial equalities between Haskell code, by virtue of its aggressive optimizer, in particular the term rewriting engine in the simplifier. We demonstrate this with a surprising little code in a GHC plugin, explains the knobs we had to turn, discuss the limits of the approach and related applications of the same idea, namely testing that promi… ▽ More That the Haskell Compiler GHC is capable of proving non-trivial equalities between Haskell code, by virtue of its aggressive optimizer, in particular the term rewriting engine in the simplifier. We demonstrate this with a surprising little code in a GHC plugin, explains the knobs we had to turn, discuss the limits of the approach and related applications of the same idea, namely testing that promises from Haskell libraries with domain-specific optimizations hold. △ Less

Submitted 21 May, 2018; originally announced May 2018.

Comments: Published in the IFL 2017 pre-proceedings

arXiv:1803.07130 [pdf, other]

A promise checked is a promise kept: Inspection Testing

Authors: Joachim Breitner

Abstract: Occasionally, developers need to ensure that the compiler treats their code in a specific way that is only visible by inspecting intermediate or final compilation artifacts. This is particularly common with carefully crafted compositional libraries, where certain usage patterns are expected to trigger an intricate sequence of compiler optimizations -- stream fusion is a well-known example. The d… ▽ More Occasionally, developers need to ensure that the compiler treats their code in a specific way that is only visible by inspecting intermediate or final compilation artifacts. This is particularly common with carefully crafted compositional libraries, where certain usage patterns are expected to trigger an intricate sequence of compiler optimizations -- stream fusion is a well-known example. The developer of such a library has to manually inspect build artifacts and check for the expected properties. Because this is too tedious to do often, it will likely go unnoticed if the property is broken by a change to the library code, its dependencies or the compiler. The lack of automation has led to released versions of such libraries breaking their documented promises. This indicates that there is an unrecognized need for a new testing paradigm, inspection testing, where the programmer declaratively describes non-functional properties of an compilation artifact and the compiler checks these properties. We define inspection testing abstractly, implement it in the context of Haskell and show that it increases the quality of such libraries. △ Less

Submitted 3 June, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

Comments: 15 pages. Submitted to Haskell'18. Includes an additional appendix

arXiv:1803.06960 [pdf, other]

Ready, Set, Verify! Applying hs-to-coq to real-world Haskell code

Authors: Joachim Breitner, Antal Spector-Zabusky, Yao Li, Christine Rizkallah, John Wiegley, Stephanie Weirich

Abstract: Good tools can bring mechanical verification to programs written in mainstream functional languages. We use hs-to-coq to translate significant portions of Haskell's containers library into Coq, and verify it against specifications that we derive from a variety of sources including type class laws, the library's test suite, and interfaces from Coq's standard library. Our work shows that it is feasi… ▽ More Good tools can bring mechanical verification to programs written in mainstream functional languages. We use hs-to-coq to translate significant portions of Haskell's containers library into Coq, and verify it against specifications that we derive from a variety of sources including type class laws, the library's test suite, and interfaces from Coq's standard library. Our work shows that it is feasible to verify mature, widely-used, highly optimized, and unmodified Haskell code. We also learn more about the theory of weight-balanced trees, extend hs-to-coq to handle partiality, and -- since we found no bugs -- attest to the superb quality of well-tested functional code. △ Less

Submitted 19 March, 2018; v1 submitted 19 March, 2018; originally announced March 2018.

Comments: 30 pages, submitted to ICFP'18

arXiv:1711.09286 [pdf, other]

doi 10.1145/3167092

Total Haskell is Reasonable Coq

Authors: Antal Spector-Zabusky, Joachim Breitner, Christine Rizkallah, Stephanie Weirich

Abstract: We would like to use the Coq proof assistant to mechanically verify properties of Haskell programs. To that end, we present a tool, named hs-to-coq, that translates total Haskell programs into Coq programs via a shallow embedding. We apply our tool in three case studies -- a lawful Monad instance, "Hutton's razor", and an existing data structure library -- and prove their correctness. These exampl… ▽ More We would like to use the Coq proof assistant to mechanically verify properties of Haskell programs. To that end, we present a tool, named hs-to-coq, that translates total Haskell programs into Coq programs via a shallow embedding. We apply our tool in three case studies -- a lawful Monad instance, "Hutton's razor", and an existing data structure library -- and prove their correctness. These examples show that this approach is viable: both that hs-to-coq applies to existing Haskell code, and that the output it produces is amenable to verification. △ Less

Submitted 25 November, 2017; originally announced November 2017.

Comments: 13 pages plus references. Published at CPP'18, In Proceedings of 7th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP'18). ACM, New York, NY, USA, 2018

arXiv:1709.09699 [pdf, other]

Analytic Formulas for Renyi Entropy of Hidden Markov Models

Authors: Joachim Breitner, Maciej Skorski

Abstract: Determining entropy rates of stochastic processes is a fundamental and difficult problem, with closed-form solutions known only for specific cases. This paper pushes the state-of-the-art by solving the problem for Hidden Markov Models (HMMs) and Renyi entropies. While the problem for Markov chains reduces to studying the growth of a matrix product, computations for HMMs involve \emph{products of… ▽ More Determining entropy rates of stochastic processes is a fundamental and difficult problem, with closed-form solutions known only for specific cases. This paper pushes the state-of-the-art by solving the problem for Hidden Markov Models (HMMs) and Renyi entropies. While the problem for Markov chains reduces to studying the growth of a matrix product, computations for HMMs involve \emph{products of random matrices}. As a result, this case is much harder and no explicit formulas have been known so far. We show how to circumvent this issue for Renyi entropy of integer orders, reducing the problem again to a \emph{single matrix products} where the matrix is formed from transition and emission probabilities by means of tensor product. To obtain results in the asymptotic setting, we use a novel technique for determining the growth of non-negative matrix powers. The classical approach is the Frobenius-Perron theory, but it requires positivity assumptions; we instead work directly with the spectral formula. As a consequence, our results do not suffer from limitations such as irreducibility and aperiodicity. This improves our understanding of the entropy rate even for standard (unhidden) Markov chains. A recently published side-channel attack against RSA was proven effective using our result, specialized to order 2. △ Less

Submitted 27 September, 2017; originally announced September 2017.

arXiv:1705.09704 [pdf, other]

doi 10.1145/3110247

Lock-step simulation is child's play

Authors: Joachim Breitner, Chris Smith

Abstract: Implementing multi-player networked games by broadcasting the player's input and letting each client calculate the game state - a scheme known as lock-step simulation - is an established technique. However, ensuring that every client in this scheme obtains a consistent state is infamously hard and in general requires great discipline from the game programmer. The thesis of this report is that in t… ▽ More Implementing multi-player networked games by broadcasting the player's input and letting each client calculate the game state - a scheme known as lock-step simulation - is an established technique. However, ensuring that every client in this scheme obtains a consistent state is infamously hard and in general requires great discipline from the game programmer. The thesis of this report is that in the realm of functional programming - in particular with Haskell's purity and static pointers - this hard problem becomes almost trivially easy. We support this thesis by implementing lock-step simulation under very adverse conditions. We extended the educational programming environment CodeWorld, which is used to teach math and programming to middle school students, with the ability to create and run interactive, networked multi-user games. Despite providing a very abstract and high-level interface, and without requiring △ Less

Submitted 26 May, 2017; originally announced May 2017.

Comments: Conditionally accepted as an experience report at ICFP'17

arXiv:1405.3099 [pdf, ps, other]

The Correctness of Launchbury's Natural Semantics for Lazy Evaluation

Authors: Joachim Breitner

Abstract: In his seminal paper "A Natural Semantics for Lazy Evaluation", John Launchbury proves his semantics correct with respect to a denotational semantics. We machine-checked the proof and found it to fail, and provide two ways to fix it: One by taking a detour via a modified natural semantics with an explicit stack, and one by adjusting the denotational semantics of heaps. In his seminal paper "A Natural Semantics for Lazy Evaluation", John Launchbury proves his semantics correct with respect to a denotational semantics. We machine-checked the proof and found it to fail, and provide two ways to fix it: One by taking a detour via a modified natural semantics with an explicit stack, and one by adjusting the denotational semantics of heaps. △ Less

Submitted 13 May, 2014; originally announced May 2014.

Comments: 22 pages

ACM Class: D.3.1; F.3.2

arXiv:1306.1340 [pdf, ps, other]

Certified HLints with Isabelle/HOLCF-Prelude

Authors: Joachim Breitner, Brian Huffman, Neil Mitchell, Christian Sternagel

Abstract: We present the HOLCF-Prelude, a formalization of a large part of Haskell's standard prelude in Isabelle/HOLCF. Applying this formalization to the hints suggested by HLint allows us to certify them formally. We present the HOLCF-Prelude, a formalization of a large part of Haskell's standard prelude in Isabelle/HOLCF. Applying this formalization to the hints suggested by HLint allows us to certify them formally. △ Less

Submitted 6 June, 2013; originally announced June 2013.

Comments: 1st International Workshop on Haskell And Rewriting Techniques, HART 2013, 5 pages

arXiv:1207.2017 [pdf, ps, other]

dup -- Explicit un-sharing in Haskell

Authors: Joachim Breitner

Abstract: We propose two operations to prevent sharing in Haskell that do not require modifying the data generating code, demonstrate their use and usefulness, and compare them to other approaches to preventing sharing. Our claims are supported by a formal semantics and a prototype implementation. We propose two operations to prevent sharing in Haskell that do not require modifying the data generating code, demonstrate their use and usefulness, and compare them to other approaches to preventing sharing. Our claims are supported by a formal semantics and a prototype implementation. △ Less

Submitted 9 July, 2012; originally announced July 2012.

Comments: 9 pages, 9 figures

ACM Class: D.1.1; D.3.3; E.2

arXiv:1204.2974 [pdf, ps, other]

Tackling the testing migration problem with SAT-Solvers

Authors: Joachim Breitner

Abstract: We show that it is feasible to formulate the testing migration problem as a practically solvable PMAX-SAT instance, when package dependencies and conflicts are pre-processed sensibly. We show that it is feasible to formulate the testing migration problem as a practically solvable PMAX-SAT instance, when package dependencies and conflicts are pre-processed sensibly. △ Less

Submitted 13 April, 2012; originally announced April 2012.

Comments: 13 pages

ACM Class: D.2.9

arXiv:1106.3478 [pdf, ps, other]

Conditional Elimination through Code Duplication

Authors: Joachim Breitner

Abstract: We propose an optimizing transformation which reduces program runtime at the expense of program size by eliminating conditional jumps. We propose an optimizing transformation which reduces program runtime at the expense of program size by eliminating conditional jumps. △ Less

Submitted 15 June, 2011; originally announced June 2011.

Comments: 11 pages, 5 figures

ACM Class: D.3.4

Showing 1–15 of 15 results for author: Breitner, J