
Intensive Care as One Big Sequence Modeling Problem

Abstract: Reinforcement Learning in Healthcare is typically concerned with narrow self-contained tasks such as sepsis prediction or anesthesia control. However, previous research has demonstrated the potential of generalist models (the prime example being Large Language Models) to outperform task-specific approaches due to their capability for implicit transfer learning. To enable training of foundation models for Healthcare as well as leverage the capabilities of state-of-the-art Transformer architectures, we propose the paradigm of Healthcare as Sequence Modeling, in which interaction between the patient and the healthcare provider is represented as an event stream and tasks like diagnosis and treatment selection are modeled as prediction of future events in the stream. To explore this paradigm experimentally we develop MIMIC-SEQ, a sequence modeling benchmark derived by translating heterogeneous clinical records from the MIMIC-IV dataset into a uniform event stream format, train a baseline model and explore its capabilities.

Submitted 24 May, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

arXiv:2308.00651 [pdf, ps, other]

Absolute continuity, supports and idempotent splitting in categorical probability

Authors: Tobias Fritz, Tomáš Gonda, Antonio Lorenzin, Paolo Perrone, Dario Stein

Abstract: Markov categories have recently turned out to be a powerful high-level framework for probability and statistics. They accommodate purely categorical definitions of notions like conditional probability and almost sure equality, as well as proofs of fundamental results such as the Hewitt-Savage 0/1 Law, the de Finetti Theorem and the Ergodic Decomposition Theorem. In this work, we develop additional relevant notions from probability theory in the setting of Markov categories. This comprises improved versions of previously introduced definitions of absolute continuity and supports, as well as a detailed study of idempotents and idempotent splitting in Markov categories. Our main result on idempotent splitting is that every idempotent measurable Markov kernel between standard Borel spaces splits through another standard Borel space, and we derive this as an instance of a general categorical criterion for idempotent splitting in Markov categories.

Submitted 6 September, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

Comments: 84 pages (including 18 page appendix and many string diagrams). v2: Corollary 4.4.10 and results needed to establish it were added

MSC Class: 18M05; 60A05

arXiv:2303.14049 [pdf, other]

doi 10.4230/LIPIcs.CALCO.2023.16

Weakly Markov categories and weakly affine monads

Authors: Tobias Fritz, Fabio Gadducci, Paolo Perrone, Davide Trotta

Abstract: Introduced in the 1990s in the context of the algebraic approach to graph rewriting, gs-monoidal categories are symmetric monoidal categories where each object is equipped with the structure of a commutative comonoid. They arise for example as Kleisli categories of commutative monads on cartesian categories, and as such they provide a general framework for effectful computation. Recently proposed in the context of categorical probability, Markov categories are gs-monoidal categories where the monoidal unit is also terminal, and they arise for example as Kleisli categories of commutative affine monads, where affine means that the monad preserves the monoidal unit. The aim of this paper is to study a new condition on the gs-monoidal structure, resulting in the concept of weakly Markov categories, which is intermediate between gs-monoidal categories and Markov ones. In a weakly Markov category, the morphisms to the monoidal unit are not necessarily unique, but form a group. As we show, these categories exhibit a rich theory of conditional independence for morphisms, generalising the known theory for Markov categories. We also introduce the corresponding notion for commutative monads, which we call weakly affine, and for which we give two equivalent characterisations. The paper argues that these monads are relevant to the study of categorical probability. A case in point is the monad of finite non-zero measures, which is weakly affine but not affine. Such structures allow one to investigate probability without normalisation within an elegant categorical framework.

Submitted 25 August, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

Comments: CALCO 2023

MSC Class: 18M30; 18M35; 60A05

Journal ref: 10th Conference on Algebra and Coalgebra in Computer Science (CALCO 2023), 16:1--16:17

arXiv:2301.07353 [pdf, ps, other]

doi 10.1109/TIT.2024.3352088

Matrix majorization in large samples

Authors: Muhammad Usman Farooq, Tobias Fritz, Erkka Haapasalo, Marco Tomamichel

Abstract: One tuple of probability vectors is more informative than another tuple when there exists a single stochastic matrix transforming the probability vectors of the first tuple into the probability vectors of the other. This is called matrix majorization. Solving an open problem raised by Mu et al., we show that if certain monotones, namely multivariate extensions of Rényi divergences, are strictly ordered between the two tuples, then for sufficiently large $n$, there exists a stochastic matrix taking the $n$-fold Kronecker power of each input distribution to the $n$-fold Kronecker power of the corresponding output distribution. The same conditions, with non-strict ordering for the monotones, are also necessary for such matrix majorization in large samples. Our result also gives conditions for the existence of a sequence of statistical maps that asymptotically (with vanishing error) convert a single copy of each input distribution to the corresponding output distribution with the help of a catalyst that is returned unchanged. Allowing for transformation with arbitrarily small error, we find conditions that are both necessary and sufficient for such catalytic matrix majorization. We derive our results by building on a general algebraic theory of preordered semirings recently developed by one of the authors. This also allows us to recover various existing results on majorization in large samples and in the catalytic regime as well as relative majorization in a unified manner.

Submitted 8 January, 2024; v1 submitted 18 January, 2023; originally announced January 2023.

Comments: 59 pages, 3 figures. Compared to the earlier version, some typos and terminology were fixed and a further corollary (Corollary 46) was added

Journal ref: IEEE Transactions on Information Theory 70(5), 3118-3144 (2024)

arXiv:2211.02507 [pdf, ps, other]

doi 10.1017/S0960129523000324

Dilations and information flow axioms in categorical probability

Authors: Tobias Fritz, Tomáš Gonda, Nicholas Gauguin Houghton-Larsen, Antonio Lorenzin, Paolo Perrone, Dario Stein

Abstract: We study the positivity and causality axioms for Markov categories as properties of dilations and information flow in Markov categories, and in variations thereof for arbitrary semicartesian monoidal categories. These help us show that being a positive Markov category is merely an additional property of a symmetric monoidal category (rather than extra structure). We also characterize the positivity of representable Markov categories and prove that causality implies positivity, but not conversely. Finally, we note that positivity fails for quasi-Borel spaces and interpret this failure as a privacy property of probabilistic name generation.

Submitted 9 June, 2023; v1 submitted 4 November, 2022; originally announced November 2022.

Comments: 42 pages

MSC Class: 18M05; 18D10; 60A05; 68Q55; 03B70

Journal ref: Mathematical Structures in Computer Science 33(10), 913-957 (2023)

arXiv:2207.05740 [pdf, ps, other]

The d-separation criterion in Categorical Probability

Authors: Tobias Fritz, Andreas Klingler

Abstract: The d-separation criterion detects the compatibility of a joint probability distribution with a directed acyclic graph through certain conditional independences. In this work, we study this problem in the context of categorical probability theory by introducing a categorical definition of causal models, a categorical notion of d-separation, and proving an abstract version of the d-separation criterion. This approach has two main benefits. First, categorical d-separation is a very intuitive criterion based on topological connectedness. Second, our results apply both to measure-theoretic probability (with standard Borel spaces) and beyond probability theory, including to deterministic and possibilistic networks. It therefore provides a clean proof of the equivalence of local and global Markov properties with causal compatibility for continuous and mixed random variables as well as deterministic and possibilistic variables.

Submitted 20 February, 2023; v1 submitted 12 July, 2022; originally announced July 2022.

Comments: 42 pages, v2: more examples and an extended introduction, v3: corrected typo in Def. 4

MSC Class: Primary: 18M30; 62A09; Secondary: 18M35; 60A05; 62D20

Journal ref: J. Mach. Learn. Res. 24(46), 1-49 (2023)

arXiv:2205.06892 [pdf, ps, other]

doi 10.1007/s10485-023-09750-z

From Gs-monoidal to Oplax Cartesian Categories: Constructions and Functorial Completeness

Authors: Tobias Fritz, Fabio Gadducci, Davide Trotta, Andrea Corradini

Abstract: Originally introduced in the context of the algebraic approach to term graph rewriting, the notion of gs-monoidal category has surfaced a few times under different monikers in the last decades. They can be thought of as symmetric monoidal categories whose arrows are generalised relations, with enough structure to talk about domains and partial functions, but less structure than cartesian bicategories. The aim of this paper is threefold. The first goal is to extend the original definition of gs-monoidality by enriching it with a preorder on arrows, giving rise to what we call oplax cartesian categories. Second, we show that (preorder-enriched) gs-monoidal categories naturally arise both as Kleisli categories and as span categories, and the relation between the resulting formalisms is explored. Finally, we present two theorems concerning Yoneda embeddings on the one hand and functorial completeness on the other, the latter inducing a completeness result also for lax functors from oplax cartesian categories to $\mathbf{Rel}$.

Submitted 29 September, 2023; v1 submitted 13 May, 2022; originally announced May 2022.

ACM Class: F.3; F.4

Journal ref: Appl. Categ. Structures 31, 42 (2023)

arXiv:2204.02284 [pdf, ps, other]

doi 10.1007/s10485-023-09717-0

Free gs-monoidal categories and free Markov categories

Authors: Tobias Fritz, Wendong Liang

Abstract: Categorical probability has recently seen significant advances through the formalism of Markov categories, within which several classical theorems have been proven in entirely abstract categorical terms. Closely related to Markov categories are gs-monoidal categories, also known as CD categories. These omit a condition that implements the normalization of probability. Extending work of Corradini and Gadducci, we construct free gs-monoidal and free Markov categories generated by a collection of morphisms of arbitrary arity and coarity. For free gs-monoidal categories, this comes in the form of an explicit combinatorial description of their morphisms as structured cospans of labeled hypergraphs. These can be thought of as a formalization of gs-monoidal string diagrams ($=$ term graphs) as a combinatorial data structure. We formulate the appropriate $2$-categorical universal property based on ideas of Walters and prove that our categories satisfy it. We expect our free categories to be relevant for computer implementations and we also argue that they can be used as statistical causal models generalizing Bayesian networks.

Submitted 8 February, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: 35 pages. v3: minor revision, to appear in Appl. Categ. Structures

MSC Class: 18M30 (Primary); 18M35; 60A05; 62D20; 68Q42 (Secondary)

Journal ref: Appl. Categ. Structures 31, 21 (2023)

arXiv:2105.02639 [pdf, ps, other]

doi 10.31390/josa.2.4.06

De Finetti's Theorem in Categorical Probability

Authors: Tobias Fritz, Tomáš Gonda, Paolo Perrone

Abstract: We present a novel proof of de Finetti's Theorem characterizing permutation-invariant probability measures of infinite sequences of variables, so-called exchangeable measures. The proof is phrased in the language of Markov categories, which provide an abstract categorical framework for probability and information flow. The diagrammatic and abstract nature of the arguments makes the proof intuitive and easy to follow. We also show how the usual measure-theoretic version of de Finetti's Theorem for standard Borel spaces is an instance of this result.

Submitted 16 September, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

Comments: 26 pages. v3: referee's suggestions incorporated

MSC Class: 60A05; 60G09 (Primary) 18M35; 18M05; 62A01 (Secondary)

Journal ref: J. Stoch. Anal. 2(4), 6 (2021)

arXiv:2010.07416 [pdf, ps, other]

doi 10.1016/j.tcs.2023.113896

Representable Markov Categories and Comparison of Statistical Experiments in Categorical Probability

Authors: Tobias Fritz, Tomáš Gonda, Paolo Perrone, Eigil Fjeldgren Rischel

Abstract: Markov categories are a recent categorical approach to the mathematical foundations of probability and statistics. Here, this approach is advanced by stating and proving equivalent conditions for second-order stochastic dominance, a widely used way of comparing probability distributions by their spread. Furthermore, we lay the foundation for the theory of comparing statistical experiments within Markov categories by stating and proving the classical Blackwell-Sherman-Stein Theorem. Our version not only offers new insight into the proof, but its abstract nature also makes the result more general, automatically specializing to the standard Blackwell-Sherman-Stein Theorem in measure-theoretic probability as well as a Bayesian version that involves prior-dependent garbling. Along the way, we define and characterize representable Markov categories, within which one can talk about Markov kernels to or from spaces of distributions. We do so by exploring the relation between Markov categories and Kleisli categories of probability monads.

Submitted 8 May, 2023; v1 submitted 14 October, 2020; originally announced October 2020.

Comments: 63 pages, color used in text and diagrams. v3: To be published in Theoretical Computer Science. Section 6 on strongly representable Markov categories removed to streamline the narrative, plus other minor changes

MSC Class: 60A05; 62B15 (Primary) 18C20; 18M05; 62A01 (Secondary)

Journal ref: Theoretical Computer Science 961, 113896 (2023)

arXiv:1910.03752 [pdf, ps, other]

doi 10.1017/S0960129521000414

Probability, valuations, hyperspace: Three monads on Top and the support as a morphism

Authors: Tobias Fritz, Paolo Perrone, Sharwin Rezagholi

Abstract: We consider three monads on Top, the category of topological spaces, which formalize topological aspects of probability and possibility in categorical terms. The first one is the Hoare hyperspace monad H, which assigns to every space its space of closed subsets equipped with the lower Vietoris topology. The second is the monad V of continuous valuations, also known as the extended probabilistic powerdomain. We construct both monads in a unified way in terms of double dualization. This reveals a close analogy between them, and allows us to prove that the operation of taking the support of a continuous valuation is a morphism of monads from V to H. In particular, this implies that every H-algebra (topological complete semilattice) is also a V-algebra. Third, we show that V can be restricted to a submonad of tau-smooth probability measures on Top. By composing these two morphisms of monads, we obtain that taking the support of a tau-smooth probability measure is also a morphism of monads.

Submitted 16 September, 2021; v1 submitted 8 October, 2019; originally announced October 2019.

Comments: 65 pages

MSC Class: 28B99; 54C99; 18C15; 46M99

Journal ref: Mathematical Structures in Computer Science 31(8), 850-897 (2021)

arXiv:1908.07021 [pdf, ps, other]

doi 10.1016/j.aim.2020.107239

A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics

Authors: Tobias Fritz

Abstract: We develop Markov categories as a framework for synthetic probability and statistics, following work of Golubtsov as well as Cho and Jacobs. This means that we treat the following concepts in purely abstract categorical terms: conditioning and disintegration; various versions of conditional independence and its standard properties; conditional products; almost surely; sufficient statistics; versions of theorems on sufficient statistics due to Fisher--Neyman, Basu, and Bahadur. Besides the conceptual clarity offered by our categorical setup, its main advantage is that it provides a uniform treatment of various types of probability theory, including discrete probability theory, measure-theoretic probability with general measurable spaces, Gaussian probability, stochastic processes of either of these kinds, and many others.

Submitted 31 May, 2020; v1 submitted 19 August, 2019; originally announced August 2019.

Comments: 98 pages. v6: fixed error in Section 7. v7: incorporates referee's comments. v8: minor correction

MSC Class: 60A05; 62A01 (Primary); 62B05; 18D10; 68Q55 (Secondary)

Journal ref: Adv. Math. 370, 107239 (2020)

arXiv:1810.06037 [pdf, ps, other]

doi 10.1016/j.entcs.2020.09.007

Monads, partial evaluations, and rewriting

Authors: Tobias Fritz, Paolo Perrone

Abstract: Monads can be interpreted as encoding formal expressions, or formal operations in the sense of universal algebra. We give a construction which formalizes the idea of "evaluating an expression partially": for example, "2+3" can be obtained as a partial evaluation of "2+2+1". This construction can be given for any monad, and it is linked to the famous bar construction, of which it gives an operational interpretation: the bar construction induces a simplicial set, and its 1-cells are partial evaluations. We study the properties of partial evaluations for general monads. We prove that whenever the monad is weakly cartesian, partial evaluations can be composed via the usual Kan filler property of simplicial sets, of which we give an interpretation in terms of substitution of terms. In terms of rewritings, partial evaluations give an abstract reduction system which is reflexive, confluent, and transitive whenever the monad is weakly cartesian. For the case of probability monads, partial evaluations correspond to what probabilists call conditional expectation of random variables. This manuscript is part of a work in progress on a general rewriting interpretation of the bar construction.

Submitted 16 May, 2020; v1 submitted 14 October, 2018; originally announced October 2018.

Comments: Originally written for the ACT Adjoint School 2019. To appear in Proceedings of MFPS 2020

MSC Class: 18C15; 18G30

Journal ref: ENTCS 353, 129-148 (2020)

arXiv:1808.09898 [pdf, ps, other]

doi 10.1016/j.aim.2020.107081

Stochastic order on metric spaces and the ordered Kantorovich monad

Authors: Tobias Fritz, Paolo Perrone

Abstract: In earlier work, we had introduced the Kantorovich probability monad on complete metric spaces, extending a construction due to van Breugel. Here we extend the Kantorovich monad further to a certain class of ordered metric spaces, by endowing the spaces of probability measures with the usual stochastic order. It can be considered a metric analogue of the probabilistic powerdomain. The spaces we consider, which we call L-ordered, are spaces where the order satisfies a mild compatibility condition with the metric itself, rather than merely with the underlying topology. As we show, this is related to the theory of Lawvere metric spaces, in which the partial order structure is induced by the zero distances. We show that the algebras of the ordered Kantorovich monad are the closed convex subsets of Banach spaces equipped with a closed positive cone, with algebra morphisms given by the short and monotone affine maps. Considering the category of L-ordered metric spaces as a locally posetal 2-category, the lax and oplax algebra morphisms are exactly the concave and convex short maps, respectively. In the unordered case, we had identified the Wasserstein space as the colimit of the spaces of empirical distributions of finite sequences. We prove that this extends to the ordered setting as well by showing that the stochastic order arises by completing the order between the finite sequences, generalizing a recent result of Lawson. The proof holds on any metric space equipped with a closed partial order.

Submitted 18 February, 2020; v1 submitted 29 August, 2018; originally announced August 2018.

Comments: 49 pages. Removed incorrect statement (Theorem 6.1.10 of previous version)

MSC Class: 60B05; 46A20; 18C15

Journal ref: Advances in Mathematics, vol. 366, 2020

arXiv:1804.03527 [pdf, other]

doi 10.1016/j.entcs.2018.11.007

Bimonoidal Structure of Probability Monads

Authors: Tobias Fritz, Paolo Perrone

Abstract: We give a conceptual treatment of the notion of joints, marginals, and independence in the setting of categorical probability. This is achieved by endowing the usual probability monads (like the Giry monad) with a monoidal and an opmonoidal structure, mutually compatible (i.e. a bimonoidal structure). If the underlying monoidal category is cartesian monoidal, a bimonoidal structure is given uniquely by a commutative strength. However, if the underlying monoidal category is not cartesian monoidal, a strength is not enough to guarantee all the desired properties of joints and marginals. A bimonoidal structure is then the correct requirement for the more general case. We explain the theory and the operational interpretation, with the help of the graphical calculus for monoidal categories. We give a definition of stochastic independence based on the bimonoidal structure, compatible with the intuition and with other approaches in the literature for cartesian monoidal categories. We then show as an example that the Kantorovich monad on the category of complete metric spaces is a bimonoidal monad for a non-cartesian monoidal structure.

Submitted 31 January, 2020; v1 submitted 10 April, 2018; originally announced April 2018.

Comments: 39 pages, 58 figures, MFPS 2018 conference paper. Fixed minor issue in published version, see footnote 2

MSC Class: 60A05; 18C15; 16W30

Journal ref: Electron. Notes Theor. Comput. Sci. 341, 121-149 (2018)

arXiv:1712.05363 [pdf, ps, other]

A Probability Monad as the Colimit of Spaces of Finite Samples

Authors: Tobias Fritz, Paolo Perrone

Abstract: We define and study a probability monad on the category of complete metric spaces and short maps. It assigns to each space the space of Radon probability measures on it with finite first moment, equipped with the Kantorovich-Wasserstein distance. This monad is analogous to the Giry monad on the category of Polish spaces, and it extends a construction due to van Breugel for compact and for 1-bounded complete metric spaces. We prove that this Kantorovich monad arises from a colimit construction on finite power-like constructions, which formalizes the intuition that probability measures are limits of finite samples. The proof relies on a criterion for when an ordinary left Kan extension of lax monoidal functors is a monoidal Kan extension. The colimit characterization allows the development of integration theory and the treatment of measures on spaces of measures, without measure theory. We also show that the category of algebras of the Kantorovich monad is equivalent to the category of closed convex subsets of Banach spaces with short affine maps as morphisms.

Submitted 12 March, 2019; v1 submitted 14 December, 2017; originally announced December 2017.

Comments: 56 pages

MSC Class: 60A05; 18C15; 52A01

Journal ref: Theory and Applications of Categories, Vol. 34, No. 7, 2019, pp. 170-220

arXiv:1607.01302 [pdf, other]

doi 10.1103/PhysRevA.96.052112

A Resource Theory for Work and Heat

Authors: Carlo Sparaciari, Jonathan Oppenheim, Tobias Fritz

Abstract: Several recent results on thermodynamics have been obtained using the tools of quantum information theory and resource theories. So far, the resource theories utilised to describe thermodynamics have assumed the existence of an infinite thermal reservoir, by declaring that thermal states at some background temperature come for free. Here, we propose a resource theory of quantum thermodynamics without a background temperature, so that no states at all come for free. We apply this resource theory to the case of many non-interacting systems, and show that all quantum states are classified by their entropy and average energy, even arbitrarily far away from equilibrium. This implies that thermodynamics takes place in a two-dimensional convex set that we call the energy-entropy diagram. The answers to many resource-theoretic questions about thermodynamics can be read off from this diagram, such as the efficiency of a heat engine consisting of finite reservoirs, or the rate of conversion between two states. This allows us to consider a resource theory which puts work and heat on an equal footing, and serves as a model for other resource theories.

Submitted 19 October, 2017; v1 submitted 5 July, 2016; originally announced July 2016.

Comments: main text: 12 pages, 5 figures; appendix: 7 pages

Journal ref: Phys. Rev. A 96, 052112 (2017)

arXiv:1409.6502 [pdf, other]

The Making of Cloud Applications: An Empirical Study on Software Development for the Cloud

Authors: Jürgen Cito, Philipp Leitner, Thomas Fritz, Harald C. Gall

Abstract: Cloud computing is gaining more and more traction as a deployment and provisioning model for software. While a large body of research already covers how to optimally operate a cloud system, we still lack insights into how professional software engineers actually use clouds, and how the cloud impacts development practices. This paper reports on the first systematic study on how software developers build applications in the cloud. We conducted a mixed-method study, consisting of qualitative interviews of 25 professional developers and a quantitative survey with 294 responses. Our results show that adopting the cloud has a profound impact throughout the software development process, as well as on how developers utilize tools and data in their daily work. Among other things, we found that (1) developers need better means to anticipate runtime problems and rigorously define metrics for improved fault localization and (2) the cloud offers an abundance of operational data, however, developers still often rely on their experience and intuition rather than utilizing metrics. From our findings, we extracted a set of guidelines for cloud development and identified challenges for researchers and tool vendors.

Submitted 17 March, 2015; v1 submitted 23 September, 2014; originally announced September 2014.

arXiv:1409.5531 [pdf, other]

doi 10.1016/j.ic.2016.02.008

A mathematical theory of resources

Authors: Bob Coecke, Tobias Fritz, Robert W. Spekkens

Abstract: In many different fields of science, it is useful to characterize physical states and processes as resources. Chemistry, thermodynamics, Shannon's theory of communication channels, and the theory of quantum entanglement are prominent examples. Questions addressed by a theory of resources include: Which resources can be converted into which other ones? What is the rate at which arbitrarily many copies of one resource can be converted into arbitrarily many copies of another? Can a catalyst help in making an impossible transformation possible? How does one quantify the resource? Here, we propose a general mathematical definition of what constitutes a resource theory. We prove some general theorems about how resource theories can be constructed from theories of processes wherein there is a special class of processes that are implementable at no cost and which define the means by which the costly states and processes can be interconverted one to another. We outline how various existing resource theories fit into our framework. Our abstract characterization of resource theories is a first step in a larger project of identifying universal features and principles of resource theories. In this vein, we identify a few general results concerning resource convertibility.

Submitted 28 November, 2014; v1 submitted 19 September, 2014; originally announced September 2014.

Comments: 32 pages, many figures. v2 and v3: minor revisions

Journal ref: Information and Computation 250 (2016), 59--86

arXiv:1402.3067 [pdf, ps, other]

A Bayesian Characterization of Relative Entropy

Authors: John C. Baez, Tobias Fritz

Abstract: We give a new characterization of relative entropy, also known as the Kullback-Leibler divergence. We use a number of interesting categories related to probability theory. In particular, we consider a category FinStat where an object is a finite set equipped with a probability distribution, while a morphism is a measure-preserving function $f: X \to Y$ together with a stochastic right inverse $s: Y \to X$. The function $f$ can be thought of as a measurement process, while $s$ provides a hypothesis about the state of the measured system given the result of a measurement. Given this data we can define the entropy of the probability distribution on $X$ relative to the "prior" given by pushing the probability distribution on $Y$ forwards along $s$. We say that $s$ is "optimal" if these distributions agree. We show that any convex linear, lower semicontinuous functor from FinStat to the additive monoid $[0,\infty]$ which vanishes when $s$ is optimal must be a scalar multiple of this relative entropy. Our proof is independent of all earlier characterizations, but inspired by the work of Petz.

Submitted 11 July, 2014; v1 submitted 13 February, 2014; originally announced February 2014.

Comments: 32 pages, minor revision

MSC Class: Primary 94A17; Secondary 62F15; 18B99

Journal ref: Theory and Applications of Categories, Vol. 29 No. 16 (2014), 421-456
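
The quantity described in the abstract above can be illustrated numerically. The following is only a minimal sketch, not code from the paper: the three-point set `X`, the two-point set `Y`, the map `f`, and the two hypotheses `s_guess` and `s_optimal` are all made-up illustrative data, and the function names are assumptions. It computes the entropy of the distribution on $X$ relative to the prior obtained by pushing the distribution on $Y$ forward along $s$, and checks that it vanishes when the two distributions agree.

```python
import math

def relative_entropy(p, r):
    """Kullback-Leibler divergence of p relative to r (natural log).

    p and r are dicts mapping outcomes to probabilities.
    Returns math.inf if p puts mass where r has none.
    """
    total = 0.0
    for x, px in p.items():
        if px == 0:
            continue
        rx = r.get(x, 0.0)
        if rx == 0:
            return math.inf
        total += px * math.log(px / rx)
    return total

def push_along(q, s):
    """Push a distribution q on Y forward along a stochastic map s: Y -> X.

    s[y] is a dict giving the distribution on X assigned to y.
    """
    prior = {}
    for y, qy in q.items():
        for x, sx in s[y].items():
            prior[x] = prior.get(x, 0.0) + qy * sx
    return prior

# Illustrative data (hypothetical): f sends a, b to 0 and c to 1.
f = {"a": 0, "b": 0, "c": 1}
p = {"a": 0.5, "b": 0.25, "c": 0.25}   # distribution on X
q = {0: 0.75, 1: 0.25}                 # its image under f, so f is measure-preserving

# A hypothesis s that is a right inverse of f but does not reproduce p.
s_guess = {0: {"a": 0.5, "b": 0.5}, 1: {"c": 1.0}}
# The hypothesis obtained by conditioning p on the value of f.
s_optimal = {0: {"a": 2 / 3, "b": 1 / 3}, 1: {"c": 1.0}}

rel_guess = relative_entropy(p, push_along(q, s_guess))
rel_optimal = relative_entropy(p, push_along(q, s_optimal))
```

Here `rel_guess` is strictly positive, while `rel_optimal` is zero up to floating-point error, matching the abstract's statement that the functor vanishes when $s$ is optimal.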

arXiv:1112.4788 [pdf, ps, other]

doi 10.1109/TIT.2012.2222863

Entropic Inequalities and Marginal Problems

Authors: Tobias Fritz, Rafael Chaves

Abstract: A marginal problem asks whether a given family of marginal distributions for some set of random variables arises from some joint distribution of these variables. Here we point out that the existence of such a joint distribution imposes non-trivial conditions already on the level of Shannon entropies of the given marginals. These entropic inequalities are necessary (but not sufficient) criteria for the existence of a joint distribution. For every marginal problem, a list of such Shannon-type entropic inequalities can be calculated by Fourier-Motzkin elimination, and we offer a software interface to a Fourier-Motzkin solver for doing so. For the case that the hypergraph of given marginals is a cycle graph, we provide a complete analytic solution to the problem of classifying all relevant entropic inequalities, and use this result to bound the decay of correlations in stochastic processes. Furthermore, we show that Shannon-type inequalities for differential entropies are not relevant for continuous-variable marginal problems; non-Shannon-type inequalities are, both in the discrete and in the continuous case. In contrast to other approaches, our general framework easily adapts to situations where one has additional (conditional) independence requirements on the joint distribution, as in the case of graphical models. We end with a list of open problems. A complementary article discusses applications to quantum nonlocality and contextuality.

Submitted 26 September, 2012; v1 submitted 20 December, 2011; originally announced December 2011.

Comments: 26 pages, 3 figures

MSC Class: 94A17; 60A05

Journal ref: IEEE Trans. on Information Theory, vol. 59, pages 803 - 817 (2013)

arXiv:1109.1963 [pdf, ps, other]

doi 10.1016/j.disc.2013.02.010

Velocity Polytopes of Periodic Graphs and a No-Go Theorem for Digital Physics

Authors: Tobias Fritz

Abstract: A periodic graph in dimension $d$ is a directed graph with a free action of $\mathbb{Z}^d$ with only finitely many orbits. It can conveniently be represented in terms of an associated finite graph with weights in $\mathbb{Z}^d$, corresponding to a $\mathbb{Z}^d$-bundle with connection. Here we use the weight sums along cycles in this associated graph to construct a certain polytope in $\mathbb{R}^d$, which we regard as a geometrical invariant associated to the periodic graph. It is the unit ball of a norm on $\mathbb{R}^d$ describing the large-scale geometry of the graph. It has a physical interpretation as the set of attainable velocities of a particle on the graph which can hop along one edge per timestep. Since a polytope necessarily has distinguished directions, there is no periodic graph for which this velocity set is isotropic. In the context of classical physics, this can be viewed as a no-go theorem for the emergence of an isotropic space from a discrete structure.

Submitted 17 June, 2013; v1 submitted 9 September, 2011; originally announced September 2011.

Comments: 18 pages, 1 figure. See also http://pirsa.org/12100100/. Corrigendum in v3: most mathematical results were obtained earlier by other authors, references have been included

MSC Class: Primary: 05C38; 05C22; Secondary: 52C07; 68R10

Journal ref: Discrete Mathematics 313 (2013) pp. 1289-1301

arXiv:1106.1791 [pdf, ps, other]

doi 10.3390/e13111945

A Characterization of Entropy in Terms of Information Loss

Authors: John C. Baez, Tobias Fritz, Tom Leinster

Abstract: There are numerous characterizations of Shannon entropy and Tsallis entropy as measures of information obeying certain properties. Using work by Faddeev and Furuichi, we derive a very simple characterization. Instead of focusing on the entropy of a probability measure on a finite set, this characterization focuses on the `information loss', or change in entropy, associated with a measure-preserving function. Information loss is a special case of conditional entropy: namely, it is the entropy of a random variable conditioned on some function of that variable. We show that Shannon entropy gives the only concept of information loss that is functorial, convex-linear and continuous. This characterization naturally generalizes to Tsallis entropy as well.

Submitted 18 November, 2011; v1 submitted 9 June, 2011; originally announced June 2011.

Comments: 11 pages LaTeX, minor revision

MSC Class: 94A17; 62B10

Journal ref: Entropy, Vol. 13 No. 11 (2011), 1945-1957
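
The notion of information loss mentioned in the abstract above can be checked on a small example. This is a minimal sketch under stated assumptions rather than anything taken from the paper: the uniform distribution on four points and the parity map are illustrative choices, and the function names are hypothetical. It computes the change in Shannon entropy along a function and compares it with the conditional entropy of the variable given the function's value.

```python
import math
from collections import defaultdict

def shannon_entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def pushforward(dist, f):
    """Distribution of f(X) when X is distributed according to dist."""
    out = defaultdict(float)
    for x, p in dist.items():
        out[f(x)] += p
    return dict(out)

def information_loss(dist, f):
    """Change in entropy H(X) - H(f(X)) associated with the function f."""
    return shannon_entropy(dist) - shannon_entropy(pushforward(dist, f))

def conditional_entropy_given_f(dist, f):
    """Conditional entropy H(X | f(X)), computed directly from its definition."""
    q = pushforward(dist, f)
    return -sum(p * math.log2(p / q[f(x)]) for x, p in dist.items() if p > 0)

def parity(x):
    """Illustrative map that forgets everything except whether x is even or odd."""
    return x % 2

# Illustrative data (hypothetical): uniform distribution on {0, 1, 2, 3}.
p = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}

loss = information_loss(p, parity)
cond = conditional_entropy_given_f(p, parity)
```

In this example the source carries 2 bits and its image under `parity` carries 1 bit, so `loss` is 1 bit, and `cond` agrees with it, which is the special case of conditional entropy the abstract refers to.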

Showing 1–23 of 23 results for author: Fritz, T