-
A Fully Compositional Theory of Sequential Digital Circuits: Denotational, Operational and Algebraic Semantics
Authors:
Dan R. Ghica,
George Kaye,
David Sprunger
Abstract:
Digital circuits, despite having been studied for nearly a century and used at scale for about half that time, have until recently evaded a fully compositional theoretical understanding, in which arbitrary circuits may be freely composed together without consulting their internals. Recent work remedied this theoretical shortcoming by showing how digital circuits can be presented compositionally as…
▽ More
Digital circuits, despite having been studied for nearly a century and used at scale for about half that time, have until recently evaded a fully compositional theoretical understanding, in which arbitrary circuits may be freely composed together without consulting their internals. Recent work remedied this theoretical shortcoming by showing how digital circuits can be presented compositionally as morphisms in a freely generated symmetric traced category. However, this was done informally; in this paper we refine and expand the previous work in several ways, culminating in the presentation of three sound and complete semantics for digital circuits: denotational, operational and algebraic. For the denotational semantics, we establish a correspondence between stream functions with certain properties and circuits constructed syntactically. For the operational semantics, we present the reductions required to model how a circuit processes a value, including the addition of a new reduction for eliminating non-delay-guarded feedback; this leads to an adequate notion of observational equivalence for digital circuits. Finally, we define a new family of equations for translating circuits into bisimilar circuits of a 'normal form', leading to a complete algebraic semantics for sequential circuits
△ Less
Submitted 29 January, 2024; v1 submitted 25 January, 2022;
originally announced January 2022.
-
Functorial String Diagrams for Reverse-Mode Automatic Differentiation
Authors:
Mario Alvarez-Picallo,
Dan R. Ghica,
David Sprunger,
Fabio Zanasi
Abstract:
We enhance the calculus of string diagrams for monoidal categories with hierarchical features in order to capture closed monoidal (and cartesian closed) structure. Using this new syntax we formulate an automatic differentiation algorithm for (applied) simply typed lambda calculus in the style of [Pearlmutter and Siskind 2008] and we prove for the first time its soundness. To give an efficient yet…
▽ More
We enhance the calculus of string diagrams for monoidal categories with hierarchical features in order to capture closed monoidal (and cartesian closed) structure. Using this new syntax we formulate an automatic differentiation algorithm for (applied) simply typed lambda calculus in the style of [Pearlmutter and Siskind 2008] and we prove for the first time its soundness. To give an efficient yet principled implementation of the AD algorithm we define a sound and complete representation of hierarchical string diagrams as a class of hierarchical hypergraphs we call hypernets.
△ Less
Submitted 28 July, 2021;
originally announced July 2021.
-
Reparametrizing gradient descent
Authors:
David Sprunger
Abstract:
In this work, we propose an optimization algorithm which we call norm-adapted gradient descent. This algorithm is similar to other gradient-based optimization algorithms like Adam or Adagrad in that it adapts the learning rate of stochastic gradient descent at each iteration. However, rather than using statistical properties of observed gradients, norm-adapted gradient descent relies on a first-or…
▽ More
In this work, we propose an optimization algorithm which we call norm-adapted gradient descent. This algorithm is similar to other gradient-based optimization algorithms like Adam or Adagrad in that it adapts the learning rate of stochastic gradient descent at each iteration. However, rather than using statistical properties of observed gradients, norm-adapted gradient descent relies on a first-order estimate of the effect of a standard gradient descent update step, much like the Newton-Raphson method in many dimensions. Our algorithm can also be compared to quasi-Newton methods, but we seek roots rather than stationary points. Seeking roots can be justified by the fact that for models with sufficient capacity measured by nonnegative loss functions, roots coincide with global optima. This work presents several experiments where we have used our algorithm; in these results, it appears norm-adapted descent is particularly strong in regression settings but is also capable of training classifiers.
△ Less
Submitted 9 October, 2020;
originally announced October 2020.
-
The differential calculus of causal functions
Authors:
David Sprunger,
Bart Jacobs
Abstract:
Causal functions of sequences occur throughout computer science, from theory to hardware to machine learning. Mealy machines, synchronous digital circuits, signal flow graphs, and recurrent neural networks all have behaviour that can be described by causal functions. In this work, we examine a differential calculus of causal functions which includes many of the familiar properties of standard mult…
▽ More
Causal functions of sequences occur throughout computer science, from theory to hardware to machine learning. Mealy machines, synchronous digital circuits, signal flow graphs, and recurrent neural networks all have behaviour that can be described by causal functions. In this work, we examine a differential calculus of causal functions which includes many of the familiar properties of standard multivariable differential calculus. These causal functions operate on infinite sequences, but this work gives a different notion of an infinite-dimensional derivative than either the Fréchet or Gateaux derivative used in functional analysis. In addition to showing many standard properties of differentiation, we show causal differentiation obeys a unique recurrence rule. We use this recurrence rule to compute the derivative of a simple recurrent neural network called an Elman network by hand and describe how the computed derivative can be used to train the network.
△ Less
Submitted 23 April, 2019;
originally announced April 2019.
-
Differentiable Causal Computations via Delayed Trace
Authors:
David Sprunger,
Shin-ya Katsumata
Abstract:
We investigate causal computations taking sequences of inputs to sequences of outputs where the $n$th output depends on the first $n$ inputs only. We model these in category theory via a construction taking a Cartesian category $C$ to another category $St(C)$ with a novel trace-like operation called "delayed trace", which misses yanking and dinaturality axioms of the usual trace. The delayed trace…
▽ More
We investigate causal computations taking sequences of inputs to sequences of outputs where the $n$th output depends on the first $n$ inputs only. We model these in category theory via a construction taking a Cartesian category $C$ to another category $St(C)$ with a novel trace-like operation called "delayed trace", which misses yanking and dinaturality axioms of the usual trace. The delayed trace operation provides a feedback mechanism in $St(C)$ with an implicit guardedness guarantee.
When $C$ is equipped with a Cartesian differential operator, we construct a differential operator for $St(C)$ using an abstract version of backpropagation through time, a technique from machine learning based on unrolling of functions. This obtains a swath of properties for backpropagation through time, including a chain rule and Schwartz theorem. Our differential operator is also able to compute the derivative of a stateful network without requiring the network to be unrolled.
△ Less
Submitted 4 March, 2019;
originally announced March 2019.
-
Relational Differential Dynamic Logic
Authors:
Juraj Kolčák,
Ichiro Hasuo,
Jérémy Dubut,
Shin-ya Katsumata,
David Sprunger,
Akihisa Yamada
Abstract:
In the field of quality assurance of hybrid systems (that combine continuous physical dynamics and discrete digital control), Platzer's differential dynamic logic (dL) is widely recognized as a deductive verification method with solid mathematical foundations and sophisticated tool support. Motivated by benchmarks provided by our industry partner, we study a relational extension of dL, aiming to f…
▽ More
In the field of quality assurance of hybrid systems (that combine continuous physical dynamics and discrete digital control), Platzer's differential dynamic logic (dL) is widely recognized as a deductive verification method with solid mathematical foundations and sophisticated tool support. Motivated by benchmarks provided by our industry partner, we study a relational extension of dL, aiming to formally prove statements such as "an earlier deployment of the emergency brake decreases the collision speed." A main technical challenge here is to relate two states of two dynamics at different time points. Our main contribution is a theory of suitable simulations (a relational extension of differential invariants that are central proof methods in dL), and a derived technique of time stretching. The latter features particularly high applicability, since the user does not have to synthesize a simulation out of the air. We derive new inference rules for dL from these notions, and demonstrate their use over a couple of automotive case studies.
△ Less
Submitted 12 March, 2020; v1 submitted 28 February, 2019;
originally announced March 2019.
-
Quantitative bisimulations using coreflections and open morphisms
Authors:
Jérémy Dubut,
Ichiro Hasuo,
Shin-ya Katsumata,
David Sprunger
Abstract:
We investigate a canonical way of defining bisimilarity of systems when their semantics is given by a coreflection, typically in a category of transition systems. We use the fact, from Joyal et al., that coreflections preserve open morphisms situations in the sense that a coreflection induces a path subcategory in the category of systems in such a way that open bisimilarity with respect to the ind…
▽ More
We investigate a canonical way of defining bisimilarity of systems when their semantics is given by a coreflection, typically in a category of transition systems. We use the fact, from Joyal et al., that coreflections preserve open morphisms situations in the sense that a coreflection induces a path subcategory in the category of systems in such a way that open bisimilarity with respect to the induced path category coincides with usual bisimilarity of their semantics. We prove that this method is particularly well-suited for systems with quantitative information: we canonically recover the path category of probabilistic systems from Cheng et al., and of timed systems from Nielsen et al., and, finally, we propose a new canonical path category for hybrid systems.
△ Less
Submitted 24 September, 2018;
originally announced September 2018.
-
Neural Nets via Forward State Transformation and Backward Loss Transformation
Authors:
Bart Jacobs,
David Sprunger
Abstract:
This article studies (multilayer perceptron) neural networks with an emphasis on the transformations involved --- both forward and backward --- in order to develop a semantical/logical perspective that is in line with standard program semantics. The common two-pass neural network training algorithms make this viewpoint particularly fitting. In the forward direction, neural networks act as state tr…
▽ More
This article studies (multilayer perceptron) neural networks with an emphasis on the transformations involved --- both forward and backward --- in order to develop a semantical/logical perspective that is in line with standard program semantics. The common two-pass neural network training algorithms make this viewpoint particularly fitting. In the forward direction, neural networks act as state transformers. In the reverse direction, however, neural networks change losses of outputs to losses of inputs, thereby acting like a (real-valued) predicate transformer. In this way, backpropagation is functorial by construction, as shown earlier in recent other work. We illustrate this perspective by training a simple instance of a neural network.
△ Less
Submitted 25 March, 2018;
originally announced March 2018.
-
Eigenvalues and Transduction of Morphic Sequences: Extended Version
Authors:
David Sprunger,
William Tune,
Jörg Endrullis,
Lawrence S. Moss
Abstract:
We study finite state transduction of automatic and morphic sequences. Dekking proved that morphic sequences are closed under transduction and in particular morphic images. We present a simple proof of this fact, and use the construction in the proof to show that non-erasing transductions preserve a condition called alpha-substitutivity. Roughly, a sequence is alpha-substitutive if the sequence ca…
▽ More
We study finite state transduction of automatic and morphic sequences. Dekking proved that morphic sequences are closed under transduction and in particular morphic images. We present a simple proof of this fact, and use the construction in the proof to show that non-erasing transductions preserve a condition called alpha-substitutivity. Roughly, a sequence is alpha-substitutive if the sequence can be obtained as the limit of iterating a substitution with dominant eigenvalue alpha. Our results culminate in the following fact: for multiplicatively independent real numbers alpha and beta, if v is an alpha-substitutive sequence and w is a beta-substitutive sequence, then v and w have no common non-erasing transducts except for the ultimately periodic sequences. We rely on Cobham's theorem for substitutions, a recent result of Durand.
△ Less
Submitted 5 June, 2014;
originally announced June 2014.