-
Probabilistic programming interfaces for random graphs: Markov categories, graphons, and nominal sets
Authors:
Nathanael L. Ackerman,
Cameron E. Freer,
Younesse Kaddar,
Jacek Karwowski,
Sean K. Moss,
Daniel M. Roy,
Sam Staton,
Hongseok Yang
Abstract:
We study semantic models of probabilistic programming languages over graphs, and establish a connection to graphons from graph theory and combinatorics. We show that every well-behaved equational theory for our graph probabilistic programming language corresponds to a graphon, and conversely, every graphon arises in this way.
We provide three constructions for showing that every graphon arises from an equational theory. The first is an abstract construction, using Markov categories and monoidal indeterminates. The second and third are more concrete. The second is in terms of traditional measure theoretic probability, which covers 'black-and-white' graphons. The third is in terms of probability monads on the nominal sets of Gabbay and Pitts. Specifically, we use a variation of nominal sets induced by the theory of graphs, which covers Erdős-Rényi graphons. In this way, we build new models of graph probabilistic programming from graphons.
Submitted 28 December, 2023;
originally announced December 2023.
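For readers unfamiliar with graphons, the following is a minimal sketch of the standard way a graphon induces a random graph: each vertex gets an independent uniform label in [0, 1], and each pair of vertices is joined with probability given by the graphon at their labels. It is not taken from the paper; the names `sample_graph`, `erdos_renyi`, and `bipartite` are hypothetical and chosen only for illustration.

```python
import random


def sample_graph(graphon, n, rng):
    """Sample an n-vertex random graph from a symmetric function graphon(x, y) in [0, 1].

    Returns the vertex labels and the set of edges (i, j) with i < j.
    """
    # One independent uniform label per vertex.
    labels = [rng.random() for _ in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            # One independent coin per unordered pair, biased by the graphon value.
            if rng.random() < graphon(labels[i], labels[j]):
                edges.add((i, j))
    return labels, edges


def erdos_renyi(p):
    """Constant graphon: every pair is an edge with the same probability p."""
    return lambda x, y: p


def bipartite(x, y):
    """A 0/1-valued ('black-and-white') graphon: edge iff labels lie on opposite halves."""
    return 1.0 if (x < 0.5) != (y < 0.5) else 0.0
```

Calling `sample_graph(erdos_renyi(0.3), 10, random.Random(0))` draws one 10-vertex graph from the constant graphon. For a 0/1-valued graphon such as `bipartite`, the edges are fully determined by the vertex labels.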
-
A Family of Exact Goodness-of-Fit Tests for High-Dimensional Discrete Distributions
Authors:
Feras A. Saad,
Cameron E. Freer,
Nathanael L. Ackerman,
Vikash K. Mansinghka
Abstract:
The objective of goodness-of-fit testing is to assess whether a dataset of observations is likely to have been drawn from a candidate probability distribution. This paper presents a rank-based family of goodness-of-fit tests that is specialized to discrete distributions on high-dimensional domains. The test is readily implemented using a simulation-based, linear-time procedure. The testing procedure can be customized by the practitioner using knowledge of the underlying data domain. Unlike most existing test statistics, the proposed test statistic is distribution-free and its exact (non-asymptotic) sampling distribution is known in closed form. We establish consistency of the test against all alternatives by showing that the test statistic is distributed as a discrete uniform if and only if the samples were drawn from the candidate distribution. We illustrate its efficacy for assessing the sample quality of approximate sampling algorithms over combinatorially large spaces with intractable probabilities, including random partitions in Dirichlet process mixture models and random lattices in Ising models.
Submitted 26 February, 2019;
originally announced February 2019.
-
The Beta-Bernoulli process and algebraic effects
Authors:
Sam Staton,
Dario Stein,
Hongseok Yang,
Nathanael L. Ackerman,
Cameron E. Freer,
Daniel M. Roy
Abstract:
In this paper we use the framework of algebraic effects from programming language theory to analyze the Beta-Bernoulli process, a standard building block in Bayesian models. Our analysis reveals the importance of abstract data types, and two types of program equations, called commutativity and discardability. We develop an equational theory of terms that use the Beta-Bernoulli process, and show that the theory is complete with respect to the measure-theoretic semantics, and also in the syntactic sense of Post. Our analysis has a potential for being generalized to other stochastic processes relevant to Bayesian modelling, yielding new understanding of these processes from the perspective of programming.
Submitted 15 May, 2018; v1 submitted 26 February, 2018;
originally announced February 2018.
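As background for this entry, here is a small sketch of the well-known urn form of a Beta(a, b)-Bernoulli sequence, in which the next flip is 1 with probability (a + ones so far) / (a + b + flips so far). It computes exact sequence probabilities and shows that they do not depend on the order of the flips. This is only loosely related to the commutativity equations mentioned in the abstract and is not the paper's equational theory; the function name `urn_sequence_probability` and the integer parameters are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations


def urn_sequence_probability(flips, a=1, b=1):
    """Exact probability of a 0/1 flip sequence under the urn form of Beta(a, b)-Bernoulli.

    a and b are assumed to be positive integers so that the arithmetic stays exact.
    """
    ones = zeros = 0
    prob = Fraction(1)
    for flip in flips:
        total = a + b + ones + zeros
        if flip:
            # Probability of a 1 given the counts observed so far.
            prob *= Fraction(a + ones, total)
            ones += 1
        else:
            # Probability of a 0 given the counts observed so far.
            prob *= Fraction(b + zeros, total)
            zeros += 1
    return prob


def is_order_independent(flips, a=1, b=1):
    """Check that every reordering of the flips has the same probability."""
    target = urn_sequence_probability(flips, a, b)
    return all(urn_sequence_probability(p, a, b) == target for p in permutations(flips))
```

With the uniform prior (a = b = 1), the sequences (1, 1, 0), (1, 0, 1), and (0, 1, 1) all receive the same probability, which `is_order_independent((1, 1, 0))` confirms by enumeration.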
-
On the computability of graphons
Authors:
Nathanael L. Ackerman,
Jeremy Avigad,
Cameron E. Freer,
Daniel M. Roy,
Jason M. Rute
Abstract:
We investigate the relative computability of exchangeable binary relational data when presented in terms of the distribution of an invariant measure on graphs, or as a graphon in either $L^1$ or the cut distance. We establish basic computable equivalences, and show that $L^1$ representations contain fundamentally more computable information than the other representations, but that $0'$ suffices to move between computable such representations. We show that $0'$ is necessary in general, but that in the case of random-free graphons, no oracle is necessary. We also provide an example of an $L^1$-computable random-free graphon that is not weakly isomorphic to any graphon with an a.e. continuous version.
Submitted 31 January, 2018;
originally announced January 2018.
-
Feedback computability on Cantor space
Authors:
Nathanael L. Ackerman,
Cameron E. Freer,
Robert S. Lubarsky
Abstract:
We introduce the notion of feedback computable functions from $2^\omega$ to $2^\omega$, extending feedback Turing computation in analogy with the standard notion of computability for functions from $2^\omega$ to $2^\omega$. We then show that the feedback computable functions are precisely the effectively Borel functions. With this as motivation we define the notion of a feedback computable function on a structure, independent of any coding of the structure as a real. We show that this notion is absolute, and as an example characterize those functions that are computable from a Gandy ordinal with some finite subset distinguished.
Submitted 29 April, 2019; v1 submitted 3 August, 2017;
originally announced August 2017.
-
On computability and disintegration
Authors:
Nathanael L. Ackerman,
Cameron E. Freer,
Daniel M. Roy
Abstract:
We show that the disintegration operator on a complete separable metric space along a projection map, restricted to measures for which there is a unique continuous disintegration, is strongly Weihrauch equivalent to the limit operator Lim. When a measure does not have a unique continuous disintegration, we may still obtain a disintegration when some basis of continuity sets has the Vitali covering property with respect to the measure; the disintegration, however, may depend on the choice of sets. We show that, when the basis is computable, the resulting disintegration is strongly Weihrauch reducible to Lim, and further exhibit a single distribution realizing this upper bound.
Submitted 10 May, 2016; v1 submitted 9 September, 2015;
originally announced September 2015.
-
On the computability of conditional probability
Authors:
Nathanael L. Ackerman,
Cameron E. Freer,
Daniel M. Roy
Abstract:
As inductive inference and machine learning methods in computer science see continued success, researchers are aiming to describe ever more complex probabilistic models and inference algorithms. It is natural to ask whether there is a universal computational procedure for probabilistic inference. We investigate the computability of conditional probability, a fundamental notion in probability theory and a cornerstone of Bayesian statistics. We show that there are computable joint distributions with noncomputable conditional distributions, ruling out the prospect of general inference algorithms, even inefficient ones. Specifically, we construct a pair of computable random variables in the unit interval such that the conditional distribution of the first variable given the second encodes the halting problem. Nevertheless, probabilistic inference is possible in many common modeling settings, and we prove several results giving broadly applicable conditions under which conditional distributions are computable. In particular, conditional distributions become computable when measurements are corrupted by independent computable noise with a sufficiently smooth bounded density.
Submitted 16 November, 2019; v1 submitted 17 May, 2010;
originally announced May 2010.