-
Grokking the Sequent Calculus (Functional Pearl)
Authors:
David Binder,
Marco Tzschentke,
Marius Müller,
Klaus Ostermann
Abstract:
The sequent calculus is a proof system which was designed as a more symmetric alternative to natural deduction. The λμμ-calculus is a term assignment system for the sequent calculus and a great foundation for compiler intermediate languages due to its first-class representation of evaluation contexts. Unfortunately, only experts of the sequent calculus can appreciate its beauty. To remedy this, we…
▽ More
The sequent calculus is a proof system which was designed as a more symmetric alternative to natural deduction. The λμμ-calculus is a term assignment system for the sequent calculus and a great foundation for compiler intermediate languages due to its first-class representation of evaluation contexts. Unfortunately, only experts of the sequent calculus can appreciate its beauty. To remedy this, we present the first introduction to the λμμ-calculus which is not directed at type theorists or logicians but at compiler hackers and programming-language enthusiasts. We do this by writing a compiler from a small but interesting surface language to the λμμ-calculus as a compiler intermediate language.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
Deriving Dependently-Typed OOP from First Principles -- Extended Version with Additional Appendices
Authors:
David Binder,
Ingo Skupin,
Tim Süberkrüb,
Klaus Ostermann
Abstract:
The expression problem describes how most types can easily be extended with new ways to produce the type or new ways to consume the type, but not both. When abstract syntax trees are defined as an algebraic data type, for example, they can easily be extended with new consumers, such as print or eval, but adding a new constructor requires the modification of all existing pattern matches. The expres…
▽ More
The expression problem describes how most types can easily be extended with new ways to produce the type or new ways to consume the type, but not both. When abstract syntax trees are defined as an algebraic data type, for example, they can easily be extended with new consumers, such as print or eval, but adding a new constructor requires the modification of all existing pattern matches. The expression problem is one way to elucidate the difference between functional or data-oriented programs (easily extendable by new consumers) and object-oriented programs (easily extendable by new producers). This difference between programs which are extensible by new producers or new consumers also exists for dependently typed programming, but with one core difference: Dependently-typed programming almost exclusively follows the functional programming model and not the object-oriented model, which leaves an interesting space in the programming language landscape unexplored. In this paper, we explore the field of dependently-typed object-oriented programming by deriving it from first principles using the principle of duality. That is, we do not extend an existing object-oriented formalism with dependent types in an ad-hoc fashion, but instead start from a familiar data-oriented language and derive its dual fragment by the systematic use of defunctionalization and refunctionalization. Our central contribution is a dependently typed calculus which contains two dual language fragments. We provide type- and semantics-preserving transformations between these two language fragments: defunctionalization and refunctionalization. We have implemented this language and these transformations and use this implementation to explain the various ways in which constructions in dependently typed programming can be explained as special instances of the phenomenon of duality.
△ Less
Submitted 11 March, 2024;
originally announced March 2024.
-
Data-Codata Symmetry and its Interaction with Evaluation Order
Authors:
David Binder,
Julian Jabs,
Ingo Skupin,
Klaus Ostermann
Abstract:
Data types and codata types are, as the names suggest, often seen as duals of each other. However, most programming languages do not support both of them in their full generality, or if they do, they are still seen as distinct constructs with separately defined type-checking, compilation, etc. Rendel et al. were the first to propose variants of two standard program transformations, de- and refunct…
▽ More
Data types and codata types are, as the names suggest, often seen as duals of each other. However, most programming languages do not support both of them in their full generality, or if they do, they are still seen as distinct constructs with separately defined type-checking, compilation, etc. Rendel et al. were the first to propose variants of two standard program transformations, de- and refunctionalization, as a test to gauge and improve the symmetry between data and codata types. However, in previous works, codata and data were still seen as separately defined language constructs, with de- and refunctionalization being defined as similar but separate algorithms. These works also glossed over interactions between the aforementioned transformations and evaluation order, which leads to a loss of desirable $η$ expansion equalities. We argue that the failure of complete symmetry is due to the inherent asymmetry of natural deduction as the logical foundation of the language design. Natural deduction is asymmetric in that its focus is on producers (proofs) of types, whereas consumers (contexts, continuations, refutations) have a second-class status. Inspired by existing sequent-calculus-based language designs, we present the first language design that is fully symmetric in that the issues of polarity (data type vs codata types) and evaluation order (call-by-value vs call-by-name) are untangled and become independent attributes of a single form of type declaration. Both attributes, polarity and evaluation order, can be changed independently by one algorithm each. In particular, defunctionalization and refunctionalization are now one algorithm. Evaluation order can be defined and changed individually for each type, independently from polarity. By allowing only certain combinations of evaluation order and polarity, the aforementioned $η$ laws can be restored.
△ Less
Submitted 23 November, 2022;
originally announced November 2022.
-
Denotational validation of higher-order Bayesian inference
Authors:
Adam Ścibior,
Ohad Kammar,
Matthijs Vákár,
Sam Staton,
Hongseok Yang,
Yufei Cai,
Klaus Ostermann,
Sean K. Moss,
Chris Heunen,
Zoubin Ghahramani
Abstract:
We present a modular semantic account of Bayesian inference algorithms for probabilistic programming languages, as used in data science and machine learning. Sophisticated inference algorithms are often explained in terms of composition of smaller parts. However, neither their theoretical justification nor their implementation reflects this modularity. We show how to conceptualise and analyse such…
▽ More
We present a modular semantic account of Bayesian inference algorithms for probabilistic programming languages, as used in data science and machine learning. Sophisticated inference algorithms are often explained in terms of composition of smaller parts. However, neither their theoretical justification nor their implementation reflects this modularity. We show how to conceptualise and analyse such inference algorithms as manipulating intermediate representations of probabilistic programs using higher-order functions and inductive types, and their denotational semantics. Semantic accounts of continuous distributions use measurable spaces. However, our use of higher-order functions presents a substantial technical difficulty: it is impossible to define a measurable space structure over the collection of measurable functions between arbitrary measurable spaces that is compatible with standard operations on those functions, such as function application. We overcome this difficulty using quasi-Borel spaces, a recently proposed mathematical structure that supports both function spaces and continuous distributions. We define a class of semantic structures for representing probabilistic programs, and semantic validity criteria for transformations of these representations in terms of distribution preservation. We develop a collection of building blocks for composing representations. We use these building blocks to validate common inference algorithms such as Sequential Monte Carlo and Markov Chain Monte Carlo. To emphasize the connection between the semantic manipulation and its traditional measure theoretic origins, we use Kock's synthetic measure theory. We demonstrate its usefulness by proving a quasi-Borel counterpart to the Metropolis-Hastings-Green theorem.
△ Less
Submitted 8 November, 2017;
originally announced November 2017.
-
A Module-System Discipline for Model-Driven Software Development
Authors:
Sebastian Erdweg,
Klaus Ostermann
Abstract:
Model-driven development is a pragmatic approach to software development that embraces domain-specific languages (DSLs), where models correspond to DSL programs. A distinguishing feature of model-driven development is that clients of a model can select from an open set of alternative semantics of the model by applying different model transformation. However, in existing model-driven frameworks, de…
▽ More
Model-driven development is a pragmatic approach to software development that embraces domain-specific languages (DSLs), where models correspond to DSL programs. A distinguishing feature of model-driven development is that clients of a model can select from an open set of alternative semantics of the model by applying different model transformation. However, in existing model-driven frameworks, dependencies between models, model transformations, and generated code artifacts are either implicit or globally declared in build scripts, which impedes modular reasoning, separate compilation, and programmability in general. We propose the design of a new module system that incorporates models and model transformations as modules. A programmer can apply transformations in import statements, thus declaring a dependency on generated code artifacts. Our design enables modular reasoning and separate compilation by preventing hidden dependencies, and it supports mixing modeling artifacts with conventional code artifacts as well as higher-order transformations. We have formalized our design and the aforementioned properties and have validated it by an implementation and case studies that show that our module system successfully integrates model-driven development into conventional programming languages.
△ Less
Submitted 31 March, 2017;
originally announced March 2017.
-
A Theory of Changes for Higher-Order Languages - Incrementalizing λ-Calculi by Static Differentiation
Authors:
Yufei Cai,
Paolo G. Giarrusso,
Tillmann Rendel,
Klaus Ostermann
Abstract:
If the result of an expensive computation is invalidated by a small change to the input, the old result should be updated incrementally instead of reexecuting the whole computation. We incrementalize programs through their derivative. A derivative maps changes in the program's input directly to changes in the program's output, without reexecuting the original program. We present a program transfor…
▽ More
If the result of an expensive computation is invalidated by a small change to the input, the old result should be updated incrementally instead of reexecuting the whole computation. We incrementalize programs through their derivative. A derivative maps changes in the program's input directly to changes in the program's output, without reexecuting the original program. We present a program transformation taking programs to their derivatives, which is fully static and automatic, supports first-class functions, and produces derivatives amenable to standard optimization.
We prove the program transformation correct in Agda for a family of simply-typed λ-calculi, parameterized by base types and primitives. A precise interface specifies what is required to incrementalize the chosen primitives.
We investigate performance by a case study: We implement in Scala the program transformation, a plugin and improve performance of a nontrivial program by orders of magnitude.
△ Less
Submitted 2 December, 2013;
originally announced December 2013.
-
Reify Your Collection Queries for Modularity and Speed!
Authors:
Paolo G. Giarrusso,
Klaus Ostermann,
Michael Eichberg,
Ralf Mitschke,
Tillmann Rendel,
Christian Kästner
Abstract:
Modularity and efficiency are often contradicting requirements, such that programers have to trade one for the other. We analyze this dilemma in the context of programs operating on collections. Performance-critical code using collections need often to be hand-optimized, leading to non-modular, brittle, and redundant code. In principle, this dilemma could be avoided by automatic collection-specifi…
▽ More
Modularity and efficiency are often contradicting requirements, such that programers have to trade one for the other. We analyze this dilemma in the context of programs operating on collections. Performance-critical code using collections need often to be hand-optimized, leading to non-modular, brittle, and redundant code. In principle, this dilemma could be avoided by automatic collection-specific optimizations, such as fusion of collection traversals, usage of indexing, or reordering of filters. Unfortunately, it is not obvious how to encode such optimizations in terms of ordinary collection APIs, because the program operating on the collections is not reified and hence cannot be analyzed.
We propose SQuOpt, the Scala Query Optimizer--a deep embedding of the Scala collections API that allows such analyses and optimizations to be defined and executed within Scala, without relying on external tools or compiler extensions. SQuOpt provides the same "look and feel" (syntax and static ty** guarantees) as the standard collections API. We evaluate SQuOpt by re-implementing several code analyses of the Findbugs tool using SQuOpt, show average speedups of 12x with a maximum of 12800x and hence demonstrate that SQuOpt can reconcile modularity and efficiency in real-world applications.
△ Less
Submitted 23 October, 2012;
originally announced October 2012.