Boltzmann Bridges

Authors: Jordan Scharnhorst, David Wolpert, Carlo Rovelli

Abstract: It is often stated that the second law of thermodynamics follows from the condition that at some given time in the past the entropy was lower than it is now. Formally, this condition is the statement that $E[S(t)|S(t_0)]$, the expected entropy of the universe at the current time $t$ conditioned on its value $S(t_0)$ at a time $t_0$ in the past, is an increasing function of $t$. We point out that in general this is incorrect. The epistemic axioms underlying probability theory say that we should condition expectations on all that we know, and on nothing that we do not know. Arguably, we know the value of the universe's entropy at the present time $t$ at least as well as its value at a time in the past, $t_0$. However, as we show here, conditioning expected entropy on its value at two times rather than one radically changes its dynamics, resulting in an unexpected, very rich structure. For example, the expectation value conditioned on two times can have a maximum at an intermediate time between $t_0$ and $t$, i.e., in our past. Moreover, it can have a negative rather than positive time derivative at the present. In such "Boltzmann bridge" situations, the second law would not hold at the present time. We illustrate and investigate these phenomena for a random walk model and an idealized gas model, and briefly discuss the role of Boltzmann bridges in our universe.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 10 pages, 10 figures

arXiv:2404.16050 [pdf, ps, other]

Implications of computer science theory for the simulation hypothesis

Authors: David H. Wolpert

Abstract: The simulation hypothesis has recently excited renewed interest, especially in the physics and philosophy communities. However, the hypothesis specifically concerns \textit{computers} that simulate physical universes, which means that to formally investigate it we need to couple computer science theory with physics. Here I couple those fields with the physical Church-Turing thesis. I then exploit that coupling to investigate some of the computer science theory aspects of the simulation hypothesis. In particular, I use Kleene's second recursion theorem to prove that it is mathematically possible for us to be a simulation that is being run on a computer - by us. In such a self-simulation, there would be two identical instances of us; the question of which of those is ``really us'' is meaningless. I also show how Rice's theorem provides some interesting impossibility results concerning simulation and self-simulation; briefly describe the philosophical implications of fully homomorphic encryption for (self-)simulation; and briefly investigate the graphical structure of universes simulating universes simulating universes ..., among other issues. I end by describing some of the possible avenues for future research.

Submitted 22 June, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

Comments: 44 pages of text, 5 pages of references, 10 pages of appendices

ACM Class: F.1; F.m

arXiv:2401.02353 [pdf, ps, other]

Game Mining: How to Make Money from those about to Play a Game

Authors: James W. Bono, David H. Wolpert

Abstract: It is known that a player in a noncooperative game can benefit by publicly restricting his possible moves before play begins. We show that, more generally, a player may benefit by publicly committing to pay an external party an amount that is contingent on the game's outcome. We explore what happens when external parties -- who we call ``game miners'' -- discover this fact and seek to profit from it by entering an outcome-contingent contract with the players. We analyze various structured bargaining games between miners and players for determining such an outcome-contingent contract. These bargaining games include playing the players against one another, as well as allowing the players to pay the miner(s) for exclusivity and first-mover advantage. We establish restrictions on the strategic settings in which a game miner can profit and bounds on the game miner's profit. We also find that game miners can lead to both efficient and inefficient equilibria.

Submitted 4 January, 2024; originally announced January 2024.

Comments: 25 pages, 1 figure

MSC Class: 91A6; 91A10; 91A20; 91A28 ACM Class: J.4

arXiv:2311.17166 [pdf, other]

Is stochastic thermodynamics the key to understanding the energy costs of computation?

Authors: David Wolpert, Jan Korbel, Christopher Lynn, Farita Tasnim, Joshua Grochow, Gülce Kardeş, James Aimone, Vijay Balasubramanian, Eric de Giuli, David Doty, Nahuel Freitas, Matteo Marsili, Thomas E. Ouldridge, Andrea Richa, Paul Riechers, Édgar Roldán, Brenda Rubenstein, Zoltan Toroczkai, Joseph Paradiso

Abstract: The relationship between the thermodynamic and computational characteristics of dynamical physical systems has been a major theoretical interest since at least the 19th century, and has been of increasing practical importance as the energetic cost of digital devices has exploded over the last half century. One of the most important thermodynamic features of real-world computers is that they operate very far from thermal equilibrium, in finite time, with many quickly (co-)evolving degrees of freedom. Such computers also must almost always obey multiple physical constraints on how they work. For example, all modern digital computers are periodic processes, governed by a global clock. Another example is that many computers are modular, hierarchical systems, with strong restrictions on the connectivity of their subsystems. These properties hold both for naturally occurring computers, like brains or eukaryotic cells, and for digital systems. These features of real-world computers are absent in 20th century analyses of the thermodynamics of computational processes, which focused on quasi-statically slow processes. However, the field of stochastic thermodynamics has been developed in the last few decades - and it provides the formal tools for analyzing systems that have exactly these features of real-world computers.
We argue here that these tools, together with other tools currently being developed in stochastic thermodynamics, may help us understand at a far deeper level just how the fundamental physical properties of dynamic systems are related to the computation that they perform.

Submitted 30 November, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

Comments: Typo fix

arXiv:2311.10586 [pdf, other]

Game Manipulators -- the Strategic Implications of Binding Contracts

Authors: Maria Alejandra Ramirez, Yoav Kolumbus, Rosemarie Nagel, David Wolpert, Jürgen Jost

Abstract: Commitment devices are powerful tools that can influence and incentivise certain behaviours by linking them to rewards or punishments. These devices are particularly useful in decision-making, as they can steer individuals towards specific choices. In the field of game theory, commitment devices can alter a player's payoff matrix, ultimately changing the game's Nash equilibria. Interestingly, agents, whom we term game manipulators and who can be external to the original game, can leverage such devices to extract fees from players by making them contingent offers that modify the payoffs of their actions. This can result in a different Nash equilibrium with potentially lower payoffs for the players compared to the original game. For this scheme to work, it is required that all commitments be binding, meaning that once an offer is made, it cannot be revoked. Consequently, we analyse binding contracts as the commitment mechanism that enables game manipulation scenarios. The main focus of this study is to formulate the logic of this setting, expand its scope to encompass more intricate schemes, and analyse the behaviour of regret-minimizing agents in scenarios involving game manipulation.

Submitted 8 December, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

arXiv:2309.10927 [pdf, ps, other]

Memory Systems, the Epistemic Arrow of Time, and the Second Law

Authors: David H. Wolpert, Jens Kipper

Abstract: The epistemic arrow of time is the fact that our knowledge of the past seems to be both of a different kind and more detailed than our knowledge of the future. Just like with the other arrows of time, it has often been speculated that the epistemic arrow arises due to the second law of thermodynamics. In this paper we investigate the epistemic arrow of time, using a fully formal framework. We begin by defining a memory system as any physical system whose present state can provide information about the state of the external world at some time other than the present. We then identify two types of memory systems in our universe, along with an important special case of the first type, which we distinguish as a third type of memory system. We show that two of these types of memory system are time-symmetric, able to provide knowledge about both the past and the future. However, the third type of memory system exploits the second law of thermodynamics in all of its instances we find in our universe. The result is that in our universe, this type of memory system only ever provides information about the past. Finally, we argue that human memory is of this third type, completing the argument. Our analysis is indebted to prior work in Wolpert 1992, but expands and improves upon this work in several respects.

Submitted 19 September, 2023; originally announced September 2023.

Comments: 24 pages

arXiv:2307.05713 [pdf, other]

doi 10.1103/PhysRevX.14.021026

Thermodynamics of computations with absolute irreversibility, unidirectional transitions, and stochastic computation times

Authors: Gonzalo Manzano, Gülce Kardeş, Édgar Roldán, David Wolpert

Abstract: Developing a thermodynamic theory of computation is a challenging task at the interface of non-equilibrium thermodynamics and computer science. In particular, this task requires dealing with difficulties such as stochastic halting times, unidirectional (possibly deterministic) transitions, and restricted initial conditions, features common in real-world computers. Here, we present a framework which tackles all such difficulties by extending the martingale theory of non-equilibrium thermodynamics to generic non-stationary Markovian processes, including those with broken detailed balance and/or absolute irreversibility. We derive several universal fluctuation relations and second-law-like inequalities that provide both lower and upper bounds on the intrinsic dissipation (mismatch cost) associated with any periodic process -- in particular the periodic processes underlying all current digital computation. Crucially, these bounds apply even if the process has stochastic stopping times, as it does in many computational machines. We illustrate our results with exhaustive numerical simulations of deterministic finite automata (DFA) processing bit strings, one of the fundamental models of computation from theoretical computer science. We also provide universal equalities and inequalities for the acceptance probability of words of a given length by a deterministic finite automaton in terms of thermodynamic quantities, and outline connections between computer science and stochastic resetting.
Our results, while motivated from the computational context, are applicable far more broadly.

Submitted 16 February, 2024; v1 submitted 11 July, 2023; originally announced July 2023.

Comments: 24 + 8 pages, 10 figures; v2: minor changes, references added; v3: presentation improved, extra appendix added, footnotes and references added

Journal ref: Phys. Rev. X 14, 021026 (2024)

arXiv:2305.09571 [pdf, other]

Stochastic thermodynamics of multiple co-evolving systems -- beyond multipartite processes

Authors: Farita Tasnim, David H. Wolpert

Abstract: Many dynamical systems consist of multiple, co-evolving subsystems (degrees of freedom). These subsystems often depend upon each other in a way that restricts the overall system's dynamics. How does this network of dependencies affect the system's thermodynamics? Prior studies in the stochastic thermodynamics of multipartite processes (MPPs) have approached this question by restricting the system to allow only one subsystem to change state at a time. However, in many real systems, such as chemical reaction networks or electronic circuits, multiple subsystems must change state together. Therefore, studies of MPPs do not apply to such systems. Here, we investigate the thermodynamics of composite processes, in which subsets of subsystems are allowed to change state simultaneously. These subsets correspond to the subsystems that interact with a single mechanism (e.g., a thermal or chemical reservoir) that is coupled to the system. An MPP is simply a (subcase of a) composite process in which all such subsets have cardinality one. We demonstrate the power of the composite systems framework to study the thermodynamics of multiple, co-evolving subsystems. In particular, we derive thermodynamic uncertainty relations for information flows in composite processes. We also derive strengthened speed limits for composite processes. Our results apply to a much broader class of dynamical systems than do results for MPPs, and could guide future studies of the thermodynamics of distributed computational systems.

Submitted 22 May, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

Comments: 8 pages, 2 figures

arXiv:2302.04320 [pdf, other]

Entropy production in communication channels

Authors: Farita Tasnim, Nahuel Freitas, David H. Wolpert

Abstract: In many complex systems, whether biological or artificial, the thermodynamic costs of communication among their components are large. These systems also tend to split information transmitted between any two components across multiple channels. A common hypothesis is that such inverse multiplexing strategies reduce total thermodynamic costs. So far, however, there have been no physics-based results supporting this hypothesis. This gap existed partially because we have lacked a theoretical framework that addresses the interplay of thermodynamics and information in off-equilibrium systems. Here we present the first study that rigorously combines such a framework, stochastic thermodynamics, with Shannon information theory. We develop a minimal model that captures the fundamental features common to a wide variety of communication systems, and study the relationship between the entropy production of the communication process and the channel capacity, the canonical measure of the communication capability of a channel. In contrast to what is assumed in previous works not based on first principles, we show that the entropy production is not always a convex and monotonically increasing function of the channel capacity. However, those two properties are recovered for sufficiently high channel capacity. These results clarify when and how to split a single communication stream across multiple channels.

Submitted 8 February, 2024; v1 submitted 8 February, 2023; originally announced February 2023.

Comments: 15 pages, 3 figures

arXiv:2210.05249 [pdf, other]

doi 10.1103/PhysRevResearch.6.013021

Nonequilibrium thermodynamics of uncertain stochastic processes

Authors: Jan Korbel, David H. Wolpert

Abstract: Stochastic thermodynamics is formulated under the assumption of perfect knowledge of all thermodynamic parameters. However, in any real-world experiment, there is non-zero uncertainty about the precise value of temperatures, chemical potentials, energy spectrum, etc. Here we investigate how this uncertainty modifies the theorems of stochastic thermodynamics. We consider two scenarios: in the first, called the \emph{effective} scenario, we fix the (unknown, randomly generated) experimental apparatus and then repeatedly observe (stochastic) trajectories of the system for that fixed apparatus. In contrast, in the second, called the \emph{phenomenological} scenario, the (unknown) apparatus is re-generated for each trajectory. We derive expressions for thermodynamic quantities in both scenarios. We also discuss the physical interpretation of effective (scenario) entropy production (EP), derive the effective mismatch cost, and provide a numerical analysis of the effective thermodynamics of a quantum dot implementing bit erasure with uncertain temperature. We then analyze the protocol for moving between two state distributions that maximizes effective work extraction. Next, we investigate the effective thermodynamic value of information, focusing on the case where there is a delay between the initialization of the system and the start of the protocol. Finally, we derive the detailed and integrated fluctuation theorems (FTs) for the phenomenological EP.
In particular, we show how the phenomenological FTs account for the fact that the longer a trajectory runs, the more information it provides concerning the precise experimental apparatus, and therefore the less EP it generates.

Submitted 22 May, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

Comments: 27 pages, 4 figures

Journal ref: Phys. Rev. Research 6 (2024) 013021

arXiv:2209.00543 [pdf, other]

Stochastic Mathematical Systems

Authors: David H. Wolpert, David B. Kinney

Abstract: We introduce a framework that can be used to model both mathematics and human reasoning about mathematics. This framework involves stochastic mathematical systems (SMSs), which are stochastic processes that generate pairs of questions and associated answers (with no explicit referents). We use the SMS framework to define normative conditions for mathematical reasoning, by defining a ``calibration'' relation between a pair of SMSs. The first SMS is the human reasoner, and the second is an ``oracle'' SMS that can be interpreted as deciding whether the question-answer pairs of the reasoner SMS are valid. To ground thinking, we understand the answers to questions given by this oracle to be the answers that would be given by an SMS representing the entire mathematical community in the infinite long run of the process of asking and answering questions. We then introduce a slight extension of SMSs to allow us to model both the physical universe and human reasoning about the physical universe. We then define a slightly different calibration relation appropriate for the case of scientific reasoning. In this case the first SMS represents a human scientist predicting the outcome of future experiments, while the second SMS represents the physical universe in which the scientist is embedded, with the question-answer pairs of that SMS being specifications of the experiments that will occur and the outcome of those experiments, respectively.
Next we derive conditions justifying two important patterns of inference in both mathematical and scientific reasoning: i) the practice of increasing one's degree of belief in a claim as one observes increasingly many lines of evidence for that claim, and ii) abduction, the practice of inferring a claim's probability of being correct from its explanatory power with respect to some other claim that is already taken to hold for independent reasons.

Submitted 14 March, 2023; v1 submitted 1 September, 2022; originally announced September 2022.

Comments: 43 pages of text, 6 pages of references, 11 pages of appendices

arXiv:2208.06895 [pdf, ps, other]

Thermodynamics of deterministic finite automata operating locally and periodically

Authors: Thomas E. Ouldridge, David H. Wolpert

Abstract: Real-world computers have operational constraints that cause nonzero entropy production (EP). In particular, almost all real-world computers are ``periodic'', iteratively undergoing the same physical process; and ``local'', in that subsystems evolve whilst physically decoupled from the rest of the computer. These constraints are so universal because decomposing a complex computation into small, iterative calculations is what makes computers so powerful. We first derive the nonzero EP caused by the locality and periodicity constraints for deterministic finite automata (DFA), a foundational system of computer science theory. We then relate this minimal EP to the computational characteristics of the DFA. We thus divide the languages recognised by DFA into two classes: those that can be recognised with zero EP, and those that necessarily have non-zero EP. We also demonstrate the thermodynamic advantages of implementing a DFA with a physical process that is agnostic about the inputs that it processes.

Submitted 3 July, 2023; v1 submitted 14 August, 2022; originally announced August 2022.

arXiv:2208.03886 [pdf, ps, other]

What can we know about that which we cannot even imagine?

Authors: David H. Wolpert

Abstract: In this essay I will consider a sequence of questions. The first questions concern the biological function of intelligence in general, and cognitive prostheses of human intelligence in particular. These will lead into questions concerning human language, perhaps the most important cognitive prosthesis humanity has ever developed. While it is traditional to rhapsodize about the cognitive power encapsulated in human language, I will emphasize how horribly limited human language is - and therefore how limited our cognitive abilities are, despite their being augmented with language. This will lead to questions of whether human mathematics, being ultimately formulated in terms of human language, is also deeply limited. I will then combine these questions to pose a partial, sort-of, sideways answer to the guiding concern of this essay: what can we ever discern about that which we cannot even conceive?

Submitted 17 March, 2024; v1 submitted 7 August, 2022; originally announced August 2022.

Comments: 39 pages, 10 pages of which are references

arXiv:2208.02902 [pdf, other]

Combining lower bounds on entropy production in complex systems with multiple interacting components

Authors: David H. Wolpert

Abstract: The past two decades have seen a revolution in statistical physics, generalizing it to apply to systems of arbitrary size, evolving while arbitrarily far from equilibrium. Many of these new results are based on analyzing the dynamics of the entropy of a system that is evolving according to a Markov process. These results comprise a sub-field called ``stochastic thermodynamics''. Some of the most powerful results in stochastic thermodynamics were traditionally concerned with single, monolithic systems, evolving by themselves, ignoring any internal structure of those systems. In this chapter I review how in complex systems, composed of many interacting constituent systems, it is possible to substantially strengthen many of these traditional results of stochastic thermodynamics. This is done by ``mixing and matching'' those traditional results, to each apply to only a subset of the interacting systems, thereby producing a more powerful result at the level of the aggregate, complex system.

Submitted 4 August, 2022; originally announced August 2022.

Comments: 17 pages text, 31 pages appendices, 1 figure, to appear in "Encyclopedia of Entropy across the Disciplines"

arXiv:2206.01165 [pdf, other]

Inclusive Thermodynamics of Computational Machines

Authors: Gülce Kardeş, David Wolpert

Abstract: We introduce a framework designed to analyze the thermodynamics of an abstractly defined logical computer like a deterministic finite automaton (DFA) or a Turing machine, without specifying any extraneous parameters (like rate matrices, Hamiltonians, etc.) of a physical process that implements the computer. Earlier investigations of how to do this were based on the continuous-time Markov chain (CTMC) formulation of stochastic thermodynamics. These investigations either assumed that there was exactly zero irreversible entropy production (EP) generated by the physical system implementing the computation, or allowed the EP to be nonzero but only considered the mismatch cost component of the EP. In addition, they only applied to a single type of computer. Our framework neither requires that EP equal zero nor restricts attention to the mismatch cost component of EP, and is designed to apply to all types of computational machines. In contrast to earlier investigations using the CTMC-based formulation, our framework is based on the inclusive Hamiltonian formulation, in which the combination of the system of interest and the baths evolves under Hamiltonian (or unitary) dynamics. Here, we use our framework to derive an integral fluctuation theorem for computers, in which the expectation value is strictly less than 1. We also derive an exchange fluctuation theorem, and a mismatch cost formula involving first-passage times. We analyze the EP generated by a DFA, a Markov information source, and a noisy communication channel.
In particular, we use the Myhill-Nerode theorem of computer science to prove that out of all DFAs which recognize the same language, the minimal complexity DFA is the one with minimal EP for all dynamics and at all iterations.

Submitted 27 December, 2022; v1 submitted 2 June, 2022; originally announced June 2022.

arXiv:2112.05876 [pdf, other]

The Past as a Stochastic Process

Authors: David H. Wolpert, Michael H. Price, Stefani A. Crabtree, Timothy A. Kohler, Jurgen Jost, James Evans, Peter F. Stadler, Hajime Shimao, Manfred D. Laubichler

Abstract: Historical processes manifest remarkable diversity. Nevertheless, scholars have long attempted to identify patterns and categorize historical actors and influences with some success. A stochastic process framework provides a structured approach for the analysis of large historical datasets that allows for detection of sometimes surprising patterns, identification of relevant causal actors both endogenous and exogenous to the process, and comparison between different historical cases. The combination of data, analytical tools and the organizing theoretical framework of stochastic processes complements traditional narrative approaches in history and archaeology.

Submitted 10 December, 2021; originally announced December 2021.

Comments: 20 pages, 4 figures

arXiv:2107.12471 [pdf, other]

Thermodynamic speed limits for co-evolving systems

Authors: Farita Tasnim, David H. Wolpert

Abstract: Previously derived "global" thermodynamic speed limit theorems (SLTs) state that increasing the maximum speed with which a system can evolve between two given probability distributions over its states requires the system to produce more entropy in its evolution. However, these theorems ignore that many systems are not monolithic, but instead comprise multiple subsystems that interact according to an (often sparse) network. Indeed, most naturally-occurring and human-engineered systems of increasing complexity can be decomposed into sets of co-evolving subsystems, where there exist a priori constraints on the dynamics of each subsystem, restricting which other subsystems can affect its dynamics. Here we derive three new SLTs that account for the thermodynamic effects of such constraints. Our first new speed limit strengthens the global speed limit. While our other two SLTs do not have this guarantee, in some situations they are even stronger than our first speed limit. Our results establish that a stochastically evolving system will, on average, produce more entropy in evolving between two distributions within a given time simply due to its comprising multiple, co-evolving subsystems. We illustrate our results with numerical calculations involving a model of two cells sensing and storing information about their environment.

Submitted 4 December, 2021; v1 submitted 26 July, 2021; originally announced July 2021.

Comments: 8 pages, 3 figures; Supplementary information provided as a separate PDF

arXiv:2105.05378 [pdf, other]

doi 10.1103/PhysRevE.104.034129

Entropy production given constraints on the energy functions

Authors: Artemy Kolchinsky, David H. Wolpert

Abstract: We consider the problem of driving a finite-state classical system from some initial distribution $p$ to some final distribution $p'$ with vanishing entropy production (EP), under the constraint that the driving protocols can only use some limited set of energy functions $\mathcal{E}$. Assuming no other constraints on the driving protocol, we derive a simple condition that guarantees that such a transformation can be carried out, which is stated in terms of the smallest probabilities in $\{p,p'\}$ and a graph-theoretic property defined in terms of $\mathcal{E}$. Our results imply that a surprisingly small amount of control over the energy function is sufficient (in particular, any transformation $p\to p'$ can be carried out as soon as one can control some one-dimensional parameter of the energy function, e.g., the energy of a single state). We also derive a lower bound on the EP under more general constraints on the transition rates, which is formulated in terms of a convex optimization problem.

Submitted 21 September, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

Journal ref: Physical Review E, 2021

arXiv:2103.11956 [pdf, ps, other]

The Implications of the No-Free-Lunch Theorems for Meta-induction

Authors: David H. Wolpert

Abstract: The important recent book by G. Schurz appreciates that the no-free-lunch theorems (NFL) have major implications for the problem of (meta) induction. Here I review the NFL theorems, emphasizing that they do not only concern the case where there is a uniform prior -- they prove that there are "as many priors" (loosely speaking) for which any induction algorithm $A$ out-generalizes some induction algorithm $B$ as vice-versa. Importantly though, in addition to the NFL theorems, there are many {free lunch} theorems. In particular, the NFL theorems can only be used to compare the {marginal} expected performance of an induction algorithm $A$ with the marginal expected performance of an induction algorithm $B$. There is a rich set of free lunches which instead concern the statistical correlations among the generalization errors of induction algorithms. As I describe, the meta-induction algorithms that Schurz advocates as a "solution to Hume's problem" are just an example of such a free lunch based on correlations among the generalization errors of induction algorithms. I end by pointing out that the prior that Schurz advocates, which is uniform over bit frequencies rather than bit patterns, is contradicted by thousands of experiments in statistical physics and by the great success of the maximum entropy procedure in inductive inference.

Submitted 27 July, 2022; v1 submitted 22 March, 2021; originally announced March 2021.

Comments: 14 pages

arXiv:2103.08997 [pdf, ps, other]

Thermodynamics of nonequilibrium systems with uncertain parameters

Authors: Jan Korbel, David H. Wolpert

Abstract: In the real world, one almost never knows the parameters of a thermodynamic process to infinite precision. Reflecting this, here we investigate how to extend stochastic thermodynamics to systems with uncertain parameters, including uncertain number of heat baths / particle reservoirs, uncertainty in the precise values of temperatures / chemical potentials of those reservoirs, uncertainty in the energy spectrum, uncertainty in the control protocol, etc. We formalize such uncertainty with an (arbitrary) probability measure over all transition rate matrices satisfying local detailed balance (LDB). This lets us define the effective thermodynamic quantities by averaging over all LDB-obeying rate matrices. We show that the resultant effective entropy violates the second law of thermodynamics. In contrast to the effective entropy though, the expected stochastic entropy, defined as the ensemble average of the effective trajectory-level entropy, satisfies the second law. We then explicitly calculate the second-order correction to the second law for the case of one heat bath with uncertain temperature. We also derive the detailed fluctuation theorem for expected effective trajectory entropy production for this case, and derive a lower bound for the associated expected work. Next, to ground these formal considerations with experimentally testable bounds on allowed energetics, we derive a bound on the maximal work that can be extracted from systems with arbitrarily uncertain temperature. We end by extending previous work on "thermodynamic value of information", to allow for uncertainty in the time-evolution of the rate matrix.

Submitted 16 March, 2021; originally announced March 2021.

arXiv:2103.05734 [pdf, other]

doi 10.1103/PhysRevE.104.054107

Dependence of integrated, instantaneous, and fluctuating entropy production on the initial state in quantum and classical processes

Authors: Artemy Kolchinsky, David H. Wolpert

Abstract: We consider the additional entropy production (EP) incurred by a fixed quantum or classical process on some initial state $ρ$, above the minimum EP incurred by the same process on any initial state. We show that this additional EP, which we term the "mismatch cost of $ρ$", has a universal information-theoretic form: it is given by the contraction of the relative entropy between $ρ$ and the least-dissipative initial state $\varphi$ over time. We derive versions of this result for integrated EP incurred over the course of a process, for trajectory-level fluctuating EP, and for instantaneous EP rate. We also show that mismatch cost for fluctuating EP obeys an integral fluctuation theorem. Our results demonstrate a fundamental relationship between "thermodynamic irreversibility" (generation of EP) and "logical irreversibility" (inability to know the initial state corresponding to a given final state). We use this relationship to derive quantitative bounds on the thermodynamics of quantum error correction and to propose a thermodynamically-operationalized measure of the logical irreversibility of a quantum channel. Our results hold for both finite and infinite dimensional systems, and generalize beyond EP to many other thermodynamic costs, including nonadiabatic EP, free energy loss, and entropy gain.

Submitted 5 June, 2022; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: Fix some line breaking issues in the appendix

Journal ref: Physical Review E, 2021

arXiv:2101.01610 [pdf, other]

Thermodynamic Uncertainty Relations for Multipartite Processes

Authors: Gülce Kardeş, David H. Wolpert

Abstract: The thermodynamic uncertainty relations (TURs) provide lower bounds on the entropy production (EP) of a system in terms of the statistical precision of an arbitrary current in that system. All conventional TURs derived so far have concerned a single physical system, differing from one another in what properties they require the system to have. However, many physical scenarios of interest involve multiple interacting systems, e.g. organelles within a biological cell. Here we show how to extend the conventional TURs to those scenarios. A common feature of these extended versions of the TURs is that they bound the global EP, jointly generated by the set of interacting systems, in terms of a weighted sum of the precisions of the local currents generated within those systems -- plus an information-theoretic correction term. Importantly, these extended TURs can bound the global EP even when the global system does not meet any of the requirements of the conventional TURs. After deriving these extended TURs we use them to obtain bounds that do not involve the global EP, but instead relate the local EPs of the individual systems and the statistical coupling among the currents generated within those systems. We derive such bounds for both scalar-valued and vector-valued currents within each system. We illustrate our results with numerical experiments.

Submitted 18 March, 2021; v1 submitted 5 January, 2021; originally announced January 2021.

Comments: 23 pages, LaTeX; typos corrected, references added

arXiv:2012.08298 [pdf, other]

Noisy Deductive Reasoning: How Humans Construct Math, and How Math Constructs Universes

Authors: David H. Wolpert, David Kinney

Abstract: We present a computational model of mathematical reasoning according to which mathematics is a fundamentally stochastic process. That is, on our model, whether or not a given formula is deemed a theorem in some axiomatic system is not a matter of certainty, but is instead governed by a probability distribution. We then show that this framework gives a compelling account of several aspects of mathematical practice. These include: 1) the way in which mathematicians generate research programs, 2) the applicability of Bayesian models of mathematical heuristics, 3) the role of abductive reasoning in mathematics, 4) the way in which multiple proofs of a proposition can strengthen our degree of belief in that proposition, and 5) the nature of the hypothesis that there are multiple formal systems that are isomorphic to physically possible universes. Thus, by embracing a model of mathematics as not perfectly predictable, we generate a new and fruitful perspective on the epistemology and practice of mathematics.

Submitted 28 October, 2020; originally announced December 2020.

Comments: Forthcoming in Undecidability, Uncomputability, and Unpredictability. Springer. Ed. Anthony Aguirre, Zeeya Merali, and David Sloan. (Collection of winning essays from FQXi's 2020 Essay Contest)

arXiv:2011.06866 [pdf, ps, other]

doi 10.1088/1367-2630/abea46

Stochastic thermodynamics and fluctuation theorems for non-linear systems

Authors: Jan Korbel, David H. Wolpert

Abstract: We extend stochastic thermodynamics by relaxing the two assumptions that the Markovian dynamics must be linear and that the equilibrium distribution must be a Boltzmann distribution. We show that if we require the second law to hold when those assumptions are relaxed, then it cannot be formulated in terms of Shannon entropy. However, thermodynamic consistency is salvaged if we reformulate the second law in terms of generalized entropy; our first result is an equation relating the precise form of the non-linear master equation to the precise associated generalized entropy which results in thermodynamic consistency. We then build on this result to extend the usual trajectory-level definitions of thermodynamic quantities that are appropriate even when the two assumptions are relaxed. We end by using these trajectory-level definitions to derive extended versions of the Crooks fluctuation theorem and Jarzynski equality which apply when the two assumptions are relaxed.

Submitted 13 November, 2020; originally announced November 2020.

arXiv:2008.10764 [pdf, other]

doi 10.1103/PhysRevX.11.041024

Work, entropy production, and thermodynamics of information under protocol constraints

Authors: Artemy Kolchinsky, David H. Wolpert

Abstract: In many real-world situations, there are constraints on the ways in which a physical system can be manipulated. We investigate the entropy production (EP) and extractable work involved in bringing a system from some initial distribution $p$ to some final distribution $p'$, given that the set of master equations available to the driving protocol obeys some constraints. We first derive general bounds on EP and extractable work, as well as a decomposition of the nonequilibrium free energy into an "accessible free energy" (which can be extracted as work, given a set of constraints) and an "inaccessible free energy" (which must be dissipated as EP). In a similar vein, we consider the thermodynamics of information in the presence of constraints, and decompose the information acquired in a measurement into "accessible" and "inaccessible" components. This decomposition allows us to consider the thermodynamic efficiency of different measurements of the same system, given a set of constraints. We use our framework to analyze protocols subject to symmetry, modularity, and coarse-grained constraints, and consider various examples including the Szilard box, the 2D Ising model, and a multi-particle flashing ratchet.

Submitted 19 October, 2021; v1 submitted 24 August, 2020; originally announced August 2020.

Journal ref: Physical Review X, 2021

arXiv:2007.10950 [pdf, other]

Strengthened second law for multi-dimensional systems coupled to multiple thermodynamic reservoirs

Authors: David H. Wolpert

Abstract: The second law of thermodynamics can be formulated as a restriction on the evolution of the entropy of any system undergoing Markovian dynamics. Here I show that this form of the second law is strengthened for multi-dimensional, complex systems, coupled to multiple thermodynamic reservoirs, if we have a set of \textit{a priori} constraints restricting how the dynamics of each coordinate can depend on the other coordinates. As an example, this strengthened second law (SSL) applies to complex systems composed of multiple physically separated, co-evolving subsystems, each identified as a coordinate of the overall system. In this example, the constraints concern how the dynamics of some subsystems are allowed to depend on the states of the other subsystems. Importantly the SSL applies to such complex systems even if some of its subsystems can change state simultaneously, which is prohibited in a multipartite process. The SSL also strengthens previously derived bounds on how much work can be extracted from a system using feedback control, if the system is multi-dimensional. Importantly, the SSL does not require local detailed balance. So it potentially applies to complex systems ranging from interacting economic agents to co-evolving biological species.

Submitted 13 February, 2024; v1 submitted 21 July, 2020; originally announced July 2020.

Comments: Corrected version of D.H. Wolpert, Phil. Trans. Royal Soc. A, May 23, 2022

arXiv:2007.10928 [pdf, ps, other]

What is important about the No Free Lunch theorems?

Authors: David H. Wolpert

Abstract: The No Free Lunch theorems prove that under a uniform distribution over induction problems (search problems or learning problems), all induction algorithms perform equally. As I discuss in this chapter, the importance of the theorems arises by using them to analyze scenarios involving {non-uniform} distributions, and to compare different algorithms, without any assumption about the distribution over problems at all. In particular, the theorems prove that {anti}-cross-validation (choosing among a set of candidate algorithms based on which has {worst} out-of-sample behavior) performs as well as cross-validation, unless one makes an assumption -- which has never been formalized -- about how the distribution over induction problems, on the one hand, is related to the set of algorithms one is choosing among using (anti-)cross validation, on the other. In addition, they establish strong caveats concerning the significance of the many results in the literature which establish the strength of a particular algorithm without assuming a particular distribution. They also motivate a ``dictionary'' between supervised learning and blackbox optimization, which allows one to ``translate'' techniques from supervised learning into the domain of blackbox optimization, thereby strengthening blackbox optimization algorithms. In addition to these topics, I also briefly discuss their implications for philosophy of science.

Submitted 21 July, 2020; originally announced July 2020.

Comments: 15 pages, 11 of main text, to be published in "Black Box Optimization, Machine Learning and No-Free Lunch Theorems", P. Pardalos, V. Rasskazova, M.N. Vrahatis, Ed., Springer

arXiv:2003.11144 [pdf, other]

Fluctuation theorems for multiple co-evolving systems

Authors: David H. Wolpert

Abstract: All previously derived thermodynamic fluctuation theorems (FTs) that concern multiple co-evolving systems have required that each system can only change its state during an associated pre-fixed, limited set of time intervals. However, in many real-world cases the times when systems change their states are randomly determined, e.g., in almost all biological examples of co-evolving systems. Such randomness in the timing drastically modifies the thermodynamics. Here I derive FTs that apply whether or not the timing is random. These FTs provide new versions of the second law, and of all conventional thermodynamic uncertainty relations (TURs). These new results are often stronger than the conventional versions, which ignore how an overall system may decompose into a set of co-evolving systems. In addition, the new TURs often bound entropy production (EP) of the overall system even if none of the criteria for a conventional TUR (e.g., being a non-equilibrium steady state) hold for that overall system.

Submitted 14 April, 2021; v1 submitted 24 March, 2020; originally announced March 2020.

Comments: 5 pages of text and references, 10 pages of appendices, 2 figures

arXiv:2001.02205 [pdf, other]

Minimal entropy production due to constraints on rate matrix dependencies in multipartite processes

Authors: David H Wolpert

Abstract: I consider multipartite processes in which there are constraints on each subsystem's rate matrix, restricting which other subsystems can directly affect its dynamics. I derive a strictly nonzero lower bound on the minimal achievable entropy production rate of the process in terms of these constraints on the rate matrices of its subsystems. The bound is based on constructing counterfactual rate matrices, in which some subsystems are held fixed while the others are allowed to evolve. This bound is related to the "learning rate" of stationary bipartite systems, and more generally to the "information flow" in bipartite systems.

Submitted 13 May, 2020; v1 submitted 7 January, 2020; originally announced January 2020.

Comments: 11 pages, 1 figure

arXiv:1912.04685 [pdf, ps, other]

doi 10.1103/PhysRevResearch.2.033312

Thermodynamic costs of Turing Machines

Authors: Artemy Kolchinsky, David H. Wolpert

Abstract: Turing Machines (TMs) are the canonical model of computation in computer science and physics. We combine techniques from algorithmic information theory and stochastic thermodynamics to analyze the thermodynamic costs of TMs. We consider two different ways of realizing a given TM with a physical process. The first realization is designed to be thermodynamically reversible when fed with random input bits. The second realization is designed to generate less heat, up to an additive constant, than any realization that is computable (i.e., consistent with the physical Church-Turing thesis). We consider three different thermodynamic costs: the heat generated when the TM is run on each input (which we refer to as the "heat function"), the minimum heat generated when a TM is run with an input that results in some desired output (which we refer to as the "thermodynamic complexity" of the output, in analogy to the Kolmogorov complexity), and the expected heat on the input distribution that minimizes entropy production. For universal TMs, we show for both realizations that the thermodynamic complexity of any desired output is bounded by a constant (unlike the conventional Kolmogorov complexity), while the expected amount of generated heat is infinite. We also show that any computable realization faces a fundamental tradeoff between heat generation, the Kolmogorov complexity of its heat function, and the Kolmogorov complexity of its input-output map. We demonstrate this tradeoff by analyzing the thermodynamics of erasing a long string.

Submitted 20 August, 2020; v1 submitted 10 December, 2019; originally announced December 2019.

Comments: Physical Review Research, 2020

Journal ref: Physical Review Research, 2020

arXiv:1911.02700 [pdf, other]

doi 10.1103/PhysRevLett.125.200602

Uncertainty relations and fluctuation theorems for Bayes nets

Authors: David H. Wolpert

Abstract: Recent research has considered the stochastic thermodynamics of multiple interacting systems, representing the overall system as a Bayes net. I derive fluctuation theorems governing the entropy production (EP) of arbitrary sets of the systems in such a Bayes net. I also derive ``conditional'' fluctuation theorems, governing the distribution of EP in one set of systems conditioned on the EP of a different set of systems. I then derive thermodynamic uncertainty relations relating the EP of the overall system to the precisions of probability currents within the individual systems.

Submitted 1 June, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

Comments: 5 pages main text, 10 pages appendices, 1 figure - typos fixed from earlier version

Journal ref: Phys. Rev. Lett. 125, 200602 (2020)

arXiv:1911.01968 [pdf]

Thermodynamic Computing

Authors: Tom Conte, Erik DeBenedictis, Natesh Ganesh, Todd Hylton, John Paul Strachan, R. Stanley Williams, Alexander Alemi, Lee Altenberg, Gavin Crooks, James Crutchfield, Lidia del Rio, Josh Deutsch, Michael DeWeese, Khari Douglas, Massimiliano Esposito, Michael Frank, Robert Fry, Peter Harsha, Mark Hill, Christopher Kello, Jeff Krichmar, Suhas Kumar, Shih-Chii Liu, Seth Lloyd, Matteo Marsili , et al. (14 additional authors not shown)

Abstract: The hardware and software foundations laid in the first half of the 20th Century enabled the computing technologies that have transformed the world, but these foundations are now under siege. The current computing paradigm, which is the foundation of much of the current standards of living that we now enjoy, faces fundamental limitations that are evident from several perspectives. In terms of hardware, devices have become so small that we are struggling to eliminate the effects of thermodynamic fluctuations, which are unavoidable at the nanometer scale. In terms of software, our ability to imagine and program effective computational abstractions and implementations is clearly challenged in complex domains. In terms of systems, currently five percent of the power generated in the US is used to run computing systems - this astonishing figure is neither ecologically sustainable nor economically scalable. Economically, the cost of building next-generation semiconductor fabrication plants has soared past $10 billion. All of these difficulties - device scaling, software complexity, adaptability, energy consumption, and fabrication economics - indicate that the current computing paradigm has matured and that continued improvements along this path will be limited. If technological progress is to continue and corresponding social and economic benefits are to continue to accrue, computing must become much more capable, energy efficient, and affordable.
We propose that progress in computing can continue under a united, physically grounded, computational paradigm centered on thermodynamics. Herein we propose a research agenda to extend these thermodynamic foundations into complex, non-equilibrium, self-organizing systems and apply them holistically to future computing systems that will harness nature's innate computational capacity. We call this type of computing "Thermodynamic Computing" or TC.

Submitted 14 November, 2019; v1 submitted 5 November, 2019; originally announced November 2019.

Comments: A Computing Community Consortium (CCC) workshop report, 36 pages

Report number: ccc2019report_6

arXiv:1905.05669 [pdf, ps, other]

doi 10.1088/1751-8121/ab0850

Stochastic thermodynamics of computation

Authors: David H. Wolpert

Abstract: One of the major resource requirements of computers - ranging from biological cells to human brains to high-performance (engineered) computers - is the energy used to run them. Those costs of performing a computation have long been a focus of research in physics, going back to the early work of Landauer. One of the most prominent aspects of computers is that they are inherently nonequilibrium systems. However, the early research was done when nonequilibrium statistical physics was in its infancy, which meant the work was formulated in terms of equilibrium statistical physics. Since then there have been major breakthroughs in nonequilibrium statistical physics, which are allowing us to investigate the myriad aspects of the relationship between statistical physics and computation, extending well beyond the issue of how much work is required to erase a bit. In this paper I review some of this recent work on the "stochastic thermodynamics of computation". After reviewing the salient parts of information theory, computer science theory, and stochastic thermodynamics, I summarize what has been learned about the entropic costs of performing a broad range of computations, extending from bit erasure to loop-free circuits to logically reversible circuits to information ratchets to Turing machines. These results reveal new, challenging engineering problems for how to design computers to have minimal thermodynamic costs. They also allow us to start to combine computer science theory and stochastic thermodynamics at a foundational level, thereby expanding both.

Submitted 16 February, 2023; v1 submitted 14 May, 2019; originally announced May 2019.

Comments: 113 pages, no figures. arXiv admin note: text overlap with arXiv:1901.00386

Journal ref: Invited article for special issue, "Shannon's information theory: 70 years on", Journal of Physics A: Mathematical and Theoretical, 2019

arXiv:1901.00386 [pdf, ps, other]

Overview of Information Theory, Computer Science Theory, and Stochastic Thermodynamics for Thermodynamics of Computation

Authors: David H. Wolpert

Abstract: I give a quick overview of some of the theoretical background necessary for using modern non-equilibrium statistical physics to investigate the thermodynamics of computation. I first present some of the necessary concepts from information theory, and then introduce some of the most important types of computational machine considered in computer science theory. After this I present a central result from modern non-equilibrium statistical physics: an exact expression for the entropy flow out of a system undergoing a given dynamics with a given initial distribution over states. This central expression is crucial for analyzing how the total entropy flow out of a computer depends on its global structure, since that global structure determines the initial distributions into all of the computer's subsystems, and therefore (via the central expression) the entropy flows generated by all of those subsystems. I illustrate these results by analyzing some of the subtleties concerning the benefits that are sometimes claimed for implementing an irreversible computation with a reversible circuit constructed out of Fredkin gates.

Submitted 18 June, 2019; v1 submitted 30 December, 2018; originally announced January 2019.

Comments: 39 pages, no appendices, no figures. arXiv admin note: text overlap with arXiv:1806.04103

MSC Class: 03D15 ACM Class: F.1; F.2.3

Journal ref: In "The Energetics of Computation in Life and Machines", Wolpert, D., Kempes C., Stadler, P., Grochow, J. (Ed.'s), Santa Fe Institute Press, 2019

arXiv:1806.08053 [pdf, other]

doi 10.1098/rsfs.2018.0041

Semantic information, autonomous agency, and nonequilibrium statistical physics

Authors: Artemy Kolchinsky, David H. Wolpert

Abstract: Shannon information theory provides various measures of so-called "syntactic information", which reflect the amount of statistical correlation between systems. In contrast, the concept of "semantic information" refers to those correlations which carry significance or "meaning" for a given system. Semantic information plays an important role in many fields, including biology, cognitive science, and philosophy, and there has been a long-standing interest in formulating a broadly applicable and formal theory of semantic information. In this paper we introduce such a theory. We define semantic information as the syntactic information that a physical system has about its environment which is causally necessary for the system to maintain its own existence. "Causal necessity" is defined in terms of counter-factual interventions which scramble correlations between the system and its environment, while "maintaining existence" is defined in terms of the system's ability to keep itself in a low entropy state. We also use recent results in nonequilibrium statistical physics to analyze semantic information from a thermodynamic point of view. Our framework is grounded in the intrinsic dynamics of a system coupled to an environment, and is applicable to any physical system, living or otherwise. It leads to formal definitions of several concepts that have been intuitively understood to be related to semantic information, including "value of information", "semantic content", and "agency".

Submitted 7 November, 2018; v1 submitted 20 June, 2018; originally announced June 2018.

Journal ref: Interface Focus, 2018

arXiv:1806.04103 [pdf, other]

doi 10.1088/1367-2630/ab82b8

Thermodynamics of computing with circuits

Authors: David Hilton Wolpert, Artemy Kolchinsky

Abstract: Digital computers implement computations using circuits, as do many naturally occurring systems (e.g., gene regulatory networks). The topology of any such circuit restricts which variables may be physically coupled during the operation of a circuit. We investigate how such restrictions on the physical coupling affect the thermodynamic costs of running the circuit. To do this we first calculate the minimal additional entropy production that arises when we run a given gate in a circuit. We then build on this calculation, to analyze how the thermodynamic costs of implementing a computation with a full circuit, comprising multiple connected gates, depend on the topology of that circuit. This analysis provides a rich new set of optimization problems that must be addressed by any designer of a circuit, if they wish to minimize thermodynamic costs.

Submitted 20 July, 2023; v1 submitted 11 June, 2018; originally announced June 2018.

Comments: 26 pages (6 of appendices), 5 figures

MSC Class: 03D15 ACM Class: F.1; F.2.3

arXiv:1801.06147 [pdf, other]

doi 10.2514/6.2018-1659

Upgrading from Gaussian Processes to Student's-T Processes

Authors: Brendan D. Tracey, David H. Wolpert

Abstract: Gaussian process priors are commonly used in aerospace design for performing Bayesian optimization. Nonetheless, Gaussian processes suffer from two significant drawbacks: outliers are a priori assumed unlikely, and the posterior variance conditioned on observed data depends only on the locations of those data, not the associated sample values. Student's-T processes are a generalization of Gaussian processes, founded on the Student's-T distribution instead of the Gaussian distribution. Student's-T processes maintain the primary advantages of Gaussian processes (kernel function, analytic update rule) with additional benefits beyond Gaussian processes. The Student's-T distribution has higher kurtosis than a Gaussian distribution and so outliers are much more likely, and the posterior variance increases or decreases depending on the variance of observed data sample values. Here, we describe Student's-T processes, and discuss their advantages in the context of aerospace optimization. We show how to construct a Student's-T process using a kernel function and how to update the process given new samples. We provide a clear derivation of optimization-relevant quantities such as expected improvement, and contrast with the related computations for Gaussian processes. Finally, we compare the performance of Student's-T processes against Gaussian processes on canonical test problems in Bayesian optimization, and apply the Student's-T process to the optimization of an aerostructural design problem.

Submitted 18 January, 2018; originally announced January 2018.

Comments: 2018 AIAA Non-Deterministic Approaches Conference

arXiv:1711.03499 [pdf, other]

Constraints on physical reality arising from a formalization of knowledge

Authors: David Wolpert

Abstract: There are (at least) four ways that an agent can acquire information concerning the state of the universe: via observation, control, prediction, or via retrodiction, i.e., memory. Each of these four ways of acquiring information seems to rely on a different kind of physical device (resp., an observation device, a control device, etc.). However it turns out that certain mathematical structure is common to those four types of device. Any device that possesses a certain subset of that structure is known as an "inference device" (ID). Here I review some of the properties of IDs, including their relation with Turing machines, and (more loosely) quantum mechanics. I also review the bounds of the joint abilities of any set of IDs to know facts about the physical universe that contains them. These bounds constrain the possible properties of any universe that contains agents who can acquire information concerning that universe. I then extend this previous work on IDs, by adding to the definition of IDs some of the other mathematical structure that is common to the four ways of acquiring information about the universe but is not captured in the (minimal) definition of IDs. I discuss these extensions of IDs in the context of epistemic logic (especially possible worlds formalisms like Kripke structures and Aumann structures). In particular, I show that these extensions of IDs are not subject to the problem of logical omniscience that plagues many previously studied forms of epistemic logic.

Submitted 27 June, 2018; v1 submitted 9 November, 2017; originally announced November 2017.

Comments: 24 pages, 7 figures

arXiv:1709.06656 [pdf, ps, other]

doi 10.1109/TITS.2018.2848264

Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes

Authors: Kunal Menda, Yi-Chun Chen, Justin Grana, James W. Bono, Brendan D. Tracey, Mykel J. Kochenderfer, David Wolpert

Abstract: The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems. Since macro-actions last for stochastic durations, multiple agents executing decentralized policies in cooperative environments must act asynchronously. We present an algorithm that modifies generalized advantage estimation for temporally extended actions, allowing a state-of-the-art policy optimization algorithm to optimize policies in Dec-POMDPs in which agents act asynchronously. We show that our algorithm is capable of learning optimal policies in two cooperative domains, one involving real-time bus holding control and one involving wildfire fighting with unmanned aircraft. Our algorithm works by framing problems as "event-driven decision processes," which are scenarios in which the sequence and timing of actions and events are random and governed by an underlying stochastic process. In addition to optimizing policies with continuous state and action spaces, our algorithm also facilitates the use of event-driven simulators, which do not require time to be discretized into time-steps. We demonstrate the benefit of using event-driven simulation in the context of multiple agents taking asynchronous actions. We show that fixed time-step simulation risks obfuscating the sequence in which closely separated events occur, adversely affecting the policies learned. In addition, we show that arbitrarily shrinking the time-step scales poorly with the number of agents.

Submitted 29 May, 2019; v1 submitted 19 September, 2017; originally announced September 2017.

Comments: Published in IEEE Transactions on Intelligent Transportation Systems (Volume: 20, Issue: 4, April 2019). https://ieeexplore.ieee.org/document/8419722

Journal ref: IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 4, pp. 1259-1268, April 2019

arXiv:1709.00765 [pdf, other]

doi 10.1088/1367-2630/aaf81d

Number of hidden states needed to physically implement a given conditional distribution

Authors: Jeremy A. Owen, Artemy Kolchinsky, David H. Wolpert

Abstract: We consider the problem of how to construct a physical process over a finite state space $X$ that applies some desired conditional distribution $P$ to initial states to produce final states. This problem arises often in the thermodynamics of computation and nonequilibrium statistical physics more generally (e.g., when designing processes to implement some desired computation, feedback controller, or Maxwell demon). It was previously known that some conditional distributions cannot be implemented using any master equation that involves just the states in $X$. However, here we show that any conditional distribution $P$ can in fact be implemented---if additional "hidden" states not in $X$ are available. Moreover, we show that it is always possible to implement $P$ in a thermodynamically reversible manner. We then investigate a novel cost of the physical resources needed to implement a given distribution $P$: the minimal number of hidden states needed to do so. We calculate this cost exactly for the special case where $P$ represents a single-valued function, and provide an upper bound for the general case, in terms of the nonnegative rank of $P$. These results show that having access to one extra binary degree of freedom, thus doubling the total number of states, is sufficient to implement any $P$ with a master equation in a thermodynamically reversible way, if there are no constraints on the allowed form of the master equation. (Such constraints can greatly increase the minimal needed number of hidden states.)
Our results also imply that for certain $P$ that can be implemented without hidden states, having hidden states permits an implementation that generates less heat.

Submitted 13 October, 2019; v1 submitted 3 September, 2017; originally announced September 2017.

Comments: 16 pages, 2 figures

Journal ref: New J. Phys. 21 013022, 2019

arXiv:1708.08494 [pdf, other]

doi 10.1038/s41467-019-09542-x

A space-time tradeoff for implementing a function with master equation dynamics

Authors: David H. Wolpert, Artemy Kolchinsky, Jeremy A. Owen

Abstract: Master equations are commonly used to model the dynamics of physical systems, including systems that implement single-valued functions like a computer's update step. However, many such functions cannot be implemented by any master equation, even approximately, which raises the question of how they can occur in the real world. Here we show how any function over some "visible" states can be implemented with master equation dynamics--if the dynamics exploits additional, "hidden" states at intermediate times. We also show that any master equation implementing a function can be decomposed into a sequence of "hidden" timesteps, demarcated by changes in what state-to-state transitions have nonzero probability. In many real-world situations there is a cost both for more hidden states and for more hidden timesteps. Accordingly, we derive a "space-time" tradeoff between the number of hidden states and the number of hidden timesteps needed to implement any given function.

Submitted 21 April, 2019; v1 submitted 28 August, 2017; originally announced August 2017.

Comments: 17 pages, 2 figures

Journal ref: Nature Communications, 10(1):1727, 2019

arXiv:1706.05043 [pdf, other]

doi 10.1098/rsta.2016.0343

The thermodynamic efficiency of computations made in cells across the range of life

Authors: Christopher P. Kempes, David Wolpert, Zachary Cohen, Juan Pérez-Mercader

Abstract: Biological organisms must perform computation as they grow, reproduce, and evolve. Moreover, ever since Landauer's bound was proposed it has been known that all computation has some thermodynamic cost -- and that the same computation can be achieved with greater or smaller thermodynamic cost depending on how it is implemented. Accordingly an important issue concerning the evolution of life is assessing the thermodynamic efficiency of the computations performed by organisms. This issue is interesting both from the perspective of how close life has come to maximally efficient computation (presumably under the pressure of natural selection), and from the practical perspective of what efficiencies we might hope that engineered biological computers might achieve, especially in comparison with current computational systems. Here we show that the computational efficiency of translation, defined as free energy expended per amino acid operation, outperforms the best supercomputers by several orders of magnitude, and is only about an order of magnitude worse than the Landauer bound. However this efficiency depends strongly on the size and architecture of the cell in question. In particular, we show that the {\it useful} efficiency of an amino acid operation, defined as the bulk energy per amino acid polymerization, decreases for increasing bacterial size and converges to the polymerization cost of the ribosome.
This cost of the largest bacteria does not change in cells as we progress through the major evolutionary shifts to both single and multicellular eukaryotes. However, the rates of total computation per unit mass are nonmonotonic in bacteria with increasing cell size, and also change across different biological architectures including the shift from unicellular to multicellular eukaryotes.

Submitted 15 June, 2017; originally announced June 2017.

Comments: 31 pages, 7 figures, 1 Table

arXiv:1705.09598 [pdf, ps, other]

When is a bit worth much more than kT ln2?

Authors: Can Gokler, Artemy Kolchinsky, Zi-Wen Liu, Iman Marvian, Peter Shor, Oles Shtanko, Kevin Thompson, David Wolpert, Seth Lloyd

Abstract: Physical processes that obtain, process, and erase information involve tradeoffs between information and energy. The fundamental energetic value of a bit of information exchanged with a reservoir at temperature T is kT ln2. This paper investigates the situation in which information is missing about just what physical process is about to take place. The fundamental energetic value of such information can be far greater than kT ln2 per bit.

Submitted 26 May, 2017; originally announced May 2017.

Comments: 7 pages, plain TeX

arXiv:1705.02436 [pdf, other]

doi 10.3390/e21121181

Nonlinear Information Bottleneck

Authors: Artemy Kolchinsky, Brendan D. Tracey, David H. Wolpert

Abstract: Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$. IB works by encoding $X$ in a compressed "bottleneck" random variable $M$ from which $Y$ can be accurately decoded. However, finding the optimal bottleneck variable involves a difficult optimization problem, which until recently has been considered for only two limited cases: discrete $X$ and $Y$ with small state spaces, and continuous $X$ and $Y$ with a Gaussian joint distribution (in which case optimal encoding and decoding maps are linear). We propose a method for performing IB on arbitrarily-distributed discrete and/or continuous $X$ and $Y$, while allowing for nonlinear encoding and decoding maps. Our approach relies on a novel non-parametric upper bound for mutual information. We describe how to implement our method using neural networks. We then show that it achieves better performance than the recently-proposed "variational IB" method on several real-world datasets.

Submitted 30 November, 2019; v1 submitted 5 May, 2017; originally announced May 2017.

Journal ref: Entropy, 2019

arXiv:1705.00041 [pdf, ps, other]

Maximizing free energy gain

Authors: Artemy Kolchinsky, Iman Marvian, Can Gokler, Zi-Wen Liu, Peter Shor, Oles Shtanko, Kevin Thompson, David Wolpert, Seth Lloyd

Abstract: Free energy is energy that is available to do work. Maximizing the free energy gain and the gain in work that can be extracted from a system is important for a wide variety of physical and technological processes, from energy harvesting processes such as photosynthesis to energy storage systems such as fuels and batteries. This paper extends recent results from non-equilibrium thermodynamics and quantum resource theory to derive closed-form solutions for the maximum possible gain in free energy and extractable work that can be obtained by varying the initial states of classical and quantum stochastic processes. Simple formulae allow the comparison of the free energy increase for the optimal procedure with that for a sub-optimal procedure. The problem of finding the optimal free-energy harvesting procedure is shown to be convex and solvable via gradient descent.

Submitted 28 April, 2017; originally announced May 2017.

Comments: 15 pages, plain TeX

arXiv:1702.04449 [pdf, other]

Modeling Social Organizations as Communication Networks

Authors: David Wolpert, Justin Grana, Brendan Tracey, Tim Kohler, Artemy Kolchinsky

Abstract: We identify the "organization" of a human social group as the communication network(s) within that group. We then introduce three theoretical approaches to analyzing what determines the structures of human organizations. All three approaches adopt a group-selection perspective, so that the group's network structure is (approximately) optimal, given the information-processing limitations of agents within the social group, and the exogenous welfare function of the overall group. In the first approach we use a new sub-field of telecommunications theory called network coding, and focus on a welfare function that involves the ability of the organization to convey information among the agents. In the second approach we focus on a scenario where agents within the organization must allocate their future communication resources when the state of the future environment is uncertain. We show how this formulation can be solved with a linear program. In the third approach, we introduce an information synthesis problem in which agents within an organization receive information from various sources and must decide how to transform such information and transmit the results to other agents in the organization. We propose leveraging the computational power of neural networks to solve such problems. These three approaches formalize and synthesize work in fields including anthropology, archeology, economics and psychology that deal with organization structure, theory of the firm, span of control and cognitive limits on communication.

Submitted 14 February, 2017; originally announced February 2017.

arXiv:1701.07602 [pdf, other]

doi 10.3390/e19100527

Coarse-graining and the Blackwell order

Authors: Johannes Rauh, Pradeep Kr. Banerjee, Eckehard Olbrich, Jürgen Jost, Nils Bertschinger, David Wolpert

Abstract: Suppose we have a pair of information channels, $κ_{1},κ_{2}$, with a common input. The Blackwell order is a partial order over channels that compares $κ_{1}$ and $κ_{2}$ by the maximal expected utility an agent can obtain when decisions are based on the channel outputs. Equivalently, $κ_{1}$ is said to be Blackwell-inferior to $κ_{2}$ if and only if $κ_{1}$ can be constructed by garbling the output of $κ_{2}$. A related partial order stipulates that $κ_{2}$ is more capable than $κ_{1}$ if the mutual information between the input and output is larger for $κ_{2}$ than for $κ_{1}$ for any distribution over inputs. A Blackwell-inferior channel is necessarily less capable. However, examples are known where $κ_{1}$ is less capable than $κ_{2}$ but not Blackwell-inferior. We show that this may even happen when $κ_{1}$ is constructed by coarse-graining the inputs of $κ_{2}$. Such a coarse-graining is a special kind of "pre-garbling" of the channel inputs. This example directly establishes that the expected value of the shared utility function for the coarse-grained channel is larger than it is for the non-coarse-grained channel. This contradicts the intuition that coarse-graining can only destroy information and lead to inferior channels. We also discuss our results in the context of information decompositions.

Submitted 10 November, 2017; v1 submitted 26 January, 2017; originally announced January 2017.

Comments: 12 pages, 1 figure, journal version

MSC Class: 62B15; 94A15; 94A17

Journal ref: Entropy 2017, 19(10), 527

arXiv:1612.00036 [pdf, other]

A quantitative definition of organismality and its application to lichen

Authors: Eric Libby, Joshua Grochow, Simon DeDeo, David Wolpert

Abstract: The organism is a fundamental concept in biology. However there is no universally accepted, formal, and yet broadly applicable definition of what an organism is. Here we introduce a candidate definition. We adopt the view that the "organism" is a functional concept, used by scientists to address particular questions concerning the future state of a biological system, rather than something wholly defined by that system. In this approach organisms are a coarse-graining of a fine-grained dynamical model of a biological system. Crucially, the coarse-graining of the system into organisms is chosen so that their dynamics can be used by scientists to make accurate predictions of those features of the biological system that interest them, and do so with minimal computational burden. To illustrate our framework we apply it to a dynamic model of lichen symbiosis---a system where either the lichen or its constituent fungi and algae could reasonably be considered "organisms." We find that the best choice for what organisms are in this scenario is complex mixtures of many entities that do not resemble standard notions of organisms. When we restrict our allowed coarse-grainings to more traditional types of organisms, we find that ecological conditions, such as niche competition and predation pressure, play a significant role in determining the best choice for organisms.

Submitted 30 November, 2016; originally announced December 2016.

Comments: 21 pages, 6 figures

arXiv:1609.00104 [pdf, other]

A Likelihood Ratio Detector for Identifying Within-Perimeter Computer Network Attacks

Authors: Justin Grana, David Wolpert, Joshua Neil, Tanmoy Bhattacharya, Russel Bent

Abstract: The rapid detection of attackers within firewalls of enterprise computer networks is of paramount importance. Anomaly detectors address this problem by quantifying deviations from baseline statistical models of normal network behavior and signaling an intrusion when the observed data deviates significantly from the baseline model. However, many anomaly detectors do not take into account plausible attacker behavior. As a result, anomaly detectors are prone to a large number of false positives due to unusual but benign activity. This paper first introduces a stochastic model of attacker behavior which is motivated by real world attacker traversal. Then, we develop a likelihood ratio detector that compares the probability of observed network behavior under normal conditions against the case when an attacker has possibly compromised a subset of hosts within the network. Since the likelihood ratio detector requires integrating over the time each host becomes compromised, we illustrate how to use Monte Carlo methods to compute the requisite integral. We then present Receiver Operating Characteristic (ROC) curves for various network parameterizations that show for any rate of true positives, the rate of false positives for the likelihood ratio detector is no higher than that of a simple anomaly detector and is often lower. We conclude by demonstrating the superiority of the proposed likelihood ratio detector when the network topologies and parameterizations are extracted from real-world networks.

Submitted 1 September, 2016; originally announced September 2016.

arXiv:1607.00956 [pdf, ps, other]

doi 10.1088/1742-5468/aa7ee1

Dependence of dissipation on the initial distribution over states

Authors: Artemy Kolchinsky, David H. Wolpert

Abstract: We analyze how the amount of work dissipated by a fixed nonequilibrium process depends on the initial distribution over states. Specifically, we compare the amount of dissipation when the process is used with some specified initial distribution to the minimal amount of dissipation possible for any initial distribution. We show that the difference between those two amounts of dissipation is given by a simple information-theoretic function that depends only on the initial and final state distributions. Crucially, this difference is independent of the details of the process relating those distributions. We then consider how dissipation depends on the initial distribution for a 'computer', i.e., a nonequilibrium process whose dynamics over coarse-grained macrostates implement some desired input-output map. We show that our results still apply when stated in terms of distributions over the computer's coarse-grained macrostates. This can be viewed as a novel thermodynamic cost of computation, reflecting changes in the distribution over inputs rather than the logical dynamics of the computation.

Submitted 22 August, 2017; v1 submitted 4 July, 2016; originally announced July 2016.

MSC Class: 82C05

Journal ref: J. Stat. Mech, 2017

Showing 1–50 of 89 results for author: Wolpert, D