-
Boltzmann Bridges
Authors:
Jordan Scharnhorst,
David Wolpert,
Carlo Rovelli
Abstract:
It is often stated that the second law of thermodynamics follows from the condition that at some given time in the past the entropy was lower than it is now. Formally, this condition is the statement that $E[S(t)|S(t_0)]$, the expected entropy of the universe at the current time $t$ conditioned on its value $S(t_0)$ at a time $t_0$ in the past, is an increasing function of $t$. We point out that in general this is incorrect. The epistemic axioms underlying probability theory say that we should condition expectations on all that we know, and on nothing that we do not know. Arguably, we know the value of the universe's entropy at the present time $t$ at least as well as its value at a time in the past, $t_0$. However, as we show here, conditioning expected entropy on its value at two times rather than one radically changes its dynamics, resulting in an unexpected, very rich structure. For example, the expectation value conditioned on two times can have a maximum at an intermediate time between $t_0$ and $t$, i.e., in our past. Moreover, it can have a negative rather than positive time derivative at the present. In such "Boltzmann bridge" situations, the second law would not hold at the present time. We illustrate and investigate these phenomena for a random walk model and an idealized gas model, and briefly discuss the role of Boltzmann bridges in our universe.
Submitted 3 July, 2024;
originally announced July 2024.
-
Implications of computer science theory for the simulation hypothesis
Authors:
David H. Wolpert
Abstract:
The simulation hypothesis has recently excited renewed interest, especially in the physics and philosophy communities. However, the hypothesis specifically concerns \textit{computers} that simulate physical universes, which means that to formally investigate it we need to couple computer science theory with physics. Here I couple those fields with the physical Church-Turing thesis. I then exploit that coupling to investigate some of the computer science theory aspects of the simulation hypothesis. In particular, I use Kleene's second recursion theorem to prove that it is mathematically possible for us to be a simulation that is being run on a computer - by us. In such a self-simulation, there would be two identical instances of us; the question of which of those is ``really us'' is meaningless. I also show how Rice's theorem provides some interesting impossibility results concerning simulation and self-simulation; briefly describe the philosophical implications of fully homomorphic encryption for (self-)simulation; and briefly investigate the graphical structure of universes simulating universes simulating universes ..., among other issues. I end by describing some of the possible avenues for future research.
Submitted 22 June, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Game Mining: How to Make Money from those about to Play a Game
Authors:
James W. Bono,
David H. Wolpert
Abstract:
It is known that a player in a noncooperative game can benefit by publicly restricting his possible moves before play begins. We show that, more generally, a player may benefit by publicly committing to pay an external party an amount that is contingent on the game's outcome. We explore what happens when external parties -- who we call ``game miners'' -- discover this fact and seek to profit from it by entering an outcome-contingent contract with the players. We analyze various structured bargaining games between miners and players for determining such an outcome-contingent contract. These bargaining games include playing the players against one another, as well as allowing the players to pay the miner(s) for exclusivity and first-mover advantage. We establish restrictions on the strategic settings in which a game miner can profit and bounds on the game miner's profit. We also find that game miners can lead to both efficient and inefficient equilibria.
Submitted 4 January, 2024;
originally announced January 2024.
-
Is stochastic thermodynamics the key to understanding the energy costs of computation?
Authors:
David Wolpert,
Jan Korbel,
Christopher Lynn,
Farita Tasnim,
Joshua Grochow,
Gülce Kardeş,
James Aimone,
Vijay Balasubramanian,
Eric de Giuli,
David Doty,
Nahuel Freitas,
Matteo Marsili,
Thomas E. Ouldridge,
Andrea Richa,
Paul Riechers,
Édgar Roldán,
Brenda Rubenstein,
Zoltan Toroczkai,
Joseph Paradiso
Abstract:
The relationship between the thermodynamic and computational characteristics of dynamical physical systems has been a major theoretical interest since at least the 19th century, and has been of increasing practical importance as the energetic cost of digital devices has exploded over the last half century. One of the most important thermodynamic features of real-world computers is that they operate very far from thermal equilibrium, in finite time, with many quickly (co-)evolving degrees of freedom. Such computers also must almost always obey multiple physical constraints on how they work. For example, all modern digital computers are periodic processes, governed by a global clock. Another example is that many computers are modular, hierarchical systems, with strong restrictions on the connectivity of their subsystems. These properties hold both for naturally occurring computers, like brains or eukaryotic cells, and for digital systems. These features of real-world computers are absent in 20th century analyses of the thermodynamics of computational processes, which focused on quasi-statically slow processes. However, the field of stochastic thermodynamics has been developed in the last few decades - and it provides the formal tools for analyzing systems that have exactly these features of real-world computers. We argue here that these tools, together with other tools currently being developed in stochastic thermodynamics, may help us understand at a far deeper level just how the fundamental physical properties of dynamic systems are related to the computation that they perform.
Submitted 30 November, 2023; v1 submitted 28 November, 2023;
originally announced November 2023.
-
Game Manipulators -- the Strategic Implications of Binding Contracts
Authors:
Maria Alejandra Ramirez,
Yoav Kolumbus,
Rosemarie Nagel,
David Wolpert,
Jürgen Jost
Abstract:
Commitment devices are powerful tools that can influence and incentivise certain behaviours by linking them to rewards or punishments. These devices are particularly useful in decision-making, as they can steer individuals towards specific choices. In the field of game theory, commitment devices can alter a player's payoff matrix, ultimately changing the game's Nash equilibria. Interestingly, agents, whom we term game manipulators and who can be external to the original game, can leverage such devices to extract fees from players by making them contingent offers that modify the payoffs of their actions. This can result in a different Nash equilibrium with potentially lower payoffs for the players compared to the original game. For this scheme to work, it is required that all commitments be binding, meaning that once an offer is made, it cannot be revoked. Consequently, we analyse binding contracts as the commitment mechanism that enables game manipulation scenarios. The main focus of this study is to formulate the logic of this setting, expand its scope to encompass more intricate schemes, and analyse the behaviour of regret-minimizing agents in scenarios involving game manipulation.
Submitted 8 December, 2023; v1 submitted 17 November, 2023;
originally announced November 2023.
-
Memory Systems, the Epistemic Arrow of Time, and the Second Law
Authors:
David H. Wolpert,
Jens Kipper
Abstract:
The epistemic arrow of time is the fact that our knowledge of the past seems to be both of a different kind and more detailed than our knowledge of the future. Just like with the other arrows of time, it has often been speculated that the epistemic arrow arises due to the second law of thermodynamics.
In this paper we investigate the epistemic arrow of time, using a fully formal framework. We begin by defining a memory system as any physical system whose present state can provide information about the state of the external world at some time other than the present. We then identify two types of memory systems in our universe, along with an important special case of the first type, which we distinguish as a third type of memory system.
We show that two of these types of memory system are time-symmetric, able to provide knowledge about both the past and the future. However, the third type of memory system exploits the second law of thermodynamics in all of its instances we find in our universe. The result is that in our universe, this type of memory system only ever provides information about the past. Finally, we argue that human memory is of this third type, completing the argument. Our analysis is indebted to prior work in Wolpert 1992, but expands and improves upon this work in several respects.
Submitted 19 September, 2023;
originally announced September 2023.
-
Thermodynamics of computations with absolute irreversibility, unidirectional transitions, and stochastic computation times
Authors:
Gonzalo Manzano,
Gülce Kardeş,
Édgar Roldán,
David Wolpert
Abstract:
Developing a thermodynamic theory of computation is a challenging task at the interface of non-equilibrium thermodynamics and computer science. In particular, this task requires dealing with difficulties such as stochastic halting times, unidirectional (possibly deterministic) transitions, and restricted initial conditions, features common in real-world computers. Here, we present a framework which tackles all such difficulties by extending the martingale theory of non-equilibrium thermodynamics to generic non-stationary Markovian processes, including those with broken detailed balance and/or absolute irreversibility. We derive several universal fluctuation relations and second-law-like inequalities that provide both lower and upper bounds on the intrinsic dissipation (mismatch cost) associated with any periodic process -- in particular the periodic processes underlying all current digital computation. Crucially, these bounds apply even if the process has stochastic stopping times, as it does in many computational machines. We illustrate our results with exhaustive numerical simulations of deterministic finite automata (DFA) processing bit strings, one of the fundamental models of computation from theoretical computer science. We also provide universal equalities and inequalities for the acceptance probability of words of a given length by a deterministic finite automaton in terms of thermodynamic quantities, and outline connections between computer science and stochastic resetting. Our results, while motivated from the computational context, are applicable far more broadly.
Submitted 16 February, 2024; v1 submitted 11 July, 2023;
originally announced July 2023.
-
Stochastic thermodynamics of multiple co-evolving systems -- beyond multipartite processes
Authors:
Farita Tasnim,
David H. Wolpert
Abstract:
Many dynamical systems consist of multiple, co-evolving subsystems (degrees of freedom). These subsystems often depend upon each other in a way that restricts the overall system's dynamics. How does this network of dependencies affect the system's thermodynamics? Prior studies in the stochastic thermodynamics of multipartite processes (MPPs) have approached this question by restricting the system to allow only one subsystem to change state at a time. However, in many real systems, such as chemical reaction networks or electronic circuits, multiple subsystems must change state together. Therefore, studies of MPPs do not apply to such systems. Here, we investigate the thermodynamics of composite processes, in which subsets of subsystems are allowed to change state simultaneously. These subsets correspond to the subsystems that interact with a single mechanism (e.g., a thermal or chemical reservoir) that is coupled to the system. An MPP is simply a (subcase of a) composite process in which all such subsets have cardinality one. We demonstrate the power of the composite systems framework to study the thermodynamics of multiple, co-evolving subsystems. In particular, we derive thermodynamic uncertainty relations for information flows in composite processes. We also derive strengthened speed limits for composite processes. Our results apply to a much broader class of dynamical systems than do results for MPPs, and could guide future studies of the thermodynamics of distributed computational systems.
Submitted 22 May, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
Entropy production in communication channels
Authors:
Farita Tasnim,
Nahuel Freitas,
David H. Wolpert
Abstract:
In many complex systems, whether biological or artificial, the thermodynamic costs of communication among their components are large. These systems also tend to split information transmitted between any two components across multiple channels. A common hypothesis is that such inverse multiplexing strategies reduce total thermodynamic costs. So far, however, there have been no physics-based results supporting this hypothesis. This gap existed partially because we have lacked a theoretical framework that addresses the interplay of thermodynamics and information in off-equilibrium systems. Here we present the first study that rigorously combines such a framework, stochastic thermodynamics, with Shannon information theory. We develop a minimal model that captures the fundamental features common to a wide variety of communication systems, and study the relationship between the entropy production of the communication process and the channel capacity, the canonical measure of the communication capability of a channel. In contrast to what is assumed in previous works not based on first principles, we show that the entropy production is not always a convex and monotonically increasing function of the channel capacity. However, those two properties are recovered for sufficiently high channel capacity. These results clarify when and how to split a single communication stream across multiple channels.
Submitted 8 February, 2024; v1 submitted 8 February, 2023;
originally announced February 2023.
-
Nonequilibrium thermodynamics of uncertain stochastic processes
Authors:
Jan Korbel,
David H. Wolpert
Abstract:
Stochastic thermodynamics is formulated under the assumption of perfect knowledge of all thermodynamic parameters. However, in any real-world experiment, there is non-zero uncertainty about the precise value of temperatures, chemical potentials, energy spectrum, etc. Here we investigate how this uncertainty modifies the theorems of stochastic thermodynamics. We consider two scenarios: in the first scenario (called \emph{effective}) we fix the (unknown, randomly generated) experimental apparatus and then repeatedly observe (stochastic) trajectories of the system for that fixed apparatus. In contrast, in the second scenario (called \emph{phenomenological}) the (unknown) apparatus is re-generated for each trajectory. We derive expressions for thermodynamic quantities in both scenarios. We also discuss the physical interpretation of effective (scenario) entropy production (EP), derive the effective mismatch cost, and provide a numerical analysis of the effective thermodynamics of a quantum dot implementing bit erasure with uncertain temperature. We then analyze the protocol for moving between two state distributions that maximizes effective work extraction. Next, we investigate the effective thermodynamic value of information, focusing on the case where there is a delay between the initialization of the system and the start of the protocol. Finally, we derive the detailed and integrated fluctuation theorems (FTs) for the phenomenological EP. In particular, we show how the phenomenological FTs account for the fact that the longer a trajectory runs, the more information it provides concerning the precise experimental apparatus, and therefore the less EP it generates.
Submitted 22 May, 2023; v1 submitted 11 October, 2022;
originally announced October 2022.
-
Stochastic Mathematical Systems
Authors:
David H. Wolpert,
David B. Kinney
Abstract:
We introduce a framework that can be used to model both mathematics and human reasoning about mathematics. This framework involves {stochastic mathematical systems} (SMSs), which are stochastic processes that generate pairs of questions and associated answers (with no explicit referents). We use the SMS framework to define normative conditions for mathematical reasoning, by defining a ``calibration'' relation between a pair of SMSs. The first SMS is the human reasoner, and the second is an ``oracle'' SMS that can be interpreted as deciding whether the question-answer pairs of the reasoner SMS are valid. To ground thinking, we understand the answers to questions given by this oracle to be the answers that would be given by an SMS representing the entire mathematical community in the infinite long run of the process of asking and answering questions. We then introduce a slight extension of SMSs to allow us to model both the physical universe and human reasoning about the physical universe. We then define a slightly different calibration relation appropriate for the case of scientific reasoning. In this case the first SMS represents a human scientist predicting the outcome of future experiments, while the second SMS represents the physical universe in which the scientist is embedded, with the question-answer pairs of that SMS being specifications of the experiments that will occur and the outcome of those experiments, respectively. Next we derive conditions justifying two important patterns of inference in both mathematical and scientific reasoning: i) the practice of increasing one's degree of belief in a claim as one observes increasingly many lines of evidence for that claim, and ii) abduction, the practice of inferring a claim's probability of being correct from its explanatory power with respect to some other claim that is already taken to hold for independent reasons.
Submitted 14 March, 2023; v1 submitted 1 September, 2022;
originally announced September 2022.
-
Thermodynamics of deterministic finite automata operating locally and periodically
Authors:
Thomas E. Ouldridge,
David H. Wolpert
Abstract:
Real-world computers have operational constraints that cause nonzero entropy production (EP). In particular, almost all real-world computers are ``periodic'', iteratively undergoing the same physical process; and ``local'', in that subsystems evolve whilst physically decoupled from the rest of the computer. These constraints are so universal because decomposing a complex computation into small, iterative calculations is what makes computers so powerful. We first derive the nonzero EP caused by the locality and periodicity constraints for deterministic finite automata (DFA), a foundational system of computer science theory. We then relate this minimal EP to the computational characteristics of the DFA. We thus divide the languages recognised by DFA into two classes: those that can be recognised with zero EP, and those that necessarily have non-zero EP. We also demonstrate the thermodynamic advantages of implementing a DFA with a physical process that is agnostic about the inputs that it processes.
Submitted 3 July, 2023; v1 submitted 14 August, 2022;
originally announced August 2022.
-
What can we know about that which we cannot even imagine?
Authors:
David H. Wolpert
Abstract:
In this essay I will consider a sequence of questions. The first questions concern the biological function of intelligence in general, and cognitive prostheses of human intelligence in particular. These will lead into questions concerning human language, perhaps the most important cognitive prosthesis humanity has ever developed. While it is traditional to rhapsodize about the cognitive power encapsulated in human language, I will emphasize how horribly limited human language is - and therefore how limited our cognitive abilities are, despite their being augmented with language. This will lead to questions of whether human mathematics, being ultimately formulated in terms of human language, is also deeply limited. I will then combine these questions to pose a partial, sort-of, sideways answer to the guiding concern of this essay: what can we ever discern about that which we cannot even conceive?
Submitted 17 March, 2024; v1 submitted 7 August, 2022;
originally announced August 2022.
-
Combining lower bounds on entropy production in complex systems with multiple interacting components
Authors:
David H. Wolpert
Abstract:
The past two decades have seen a revolution in statistical physics, generalizing it to apply to systems of arbitrary size, evolving while arbitrarily far from equilibrium. Many of these new results are based on analyzing the dynamics of the entropy of a system that is evolving according to a Markov process. These results comprise a sub-field called ``stochastic thermodynamics''. Some of the most powerful results in stochastic thermodynamics were traditionally concerned with single, monolithic systems, evolving by themselves, ignoring any internal structure of those systems. In this chapter I review how in complex systems, composed of many interacting constituent systems, it is possible to substantially strengthen many of these traditional results of stochastic thermodynamics. This is done by ``mixing and matching'' those traditional results, to each apply to only a subset of the interacting systems, thereby producing a more powerful result at the level of the aggregate, complex system.
Submitted 4 August, 2022;
originally announced August 2022.
-
Inclusive Thermodynamics of Computational Machines
Authors:
Gülce Kardeş,
David Wolpert
Abstract:
We introduce a framework designed to analyze the thermodynamics of an abstractly defined logical computer like a deterministic finite automaton (DFA) or a Turing machine, without specifying any extraneous parameters (like rate matrices, Hamiltonians, etc.) of a physical process that implements the computer. Earlier investigations of how to do this were based on the continuous-time Markov chain (CTMC) formulation of stochastic thermodynamics. These investigations either assumed that there was exactly zero irreversible entropy production (EP) generated by the physical system implementing the computation, or allowed the EP to be nonzero but only considered the mismatch cost component of the EP. In addition, they only applied to a single type of computer. Our framework neither requires that EP equal zero nor restricts attention to the mismatch cost component of EP, and is designed to apply to all types of computational machines. In contrast to earlier investigations using the CTMC-based formulation, our framework is based on the inclusive Hamiltonian formulation, in which the combination of the system of interest and the baths evolve in a Hamiltonian (or unitary) dynamics. Here, we use our framework to derive an integral fluctuation theorem for computers, in which the expectation value is strictly less than 1. We also derive an exchange fluctuation theorem, and a mismatch cost formula involving first-passage times. We analyze the EP generated by a DFA, a Markov information source, and a noisy communication channel. In particular, we use the Myhill-Nerode theorem of computer science to prove that out of all DFAs which recognize the same language, the minimal complexity DFA is the one with minimal EP for all dynamics and at all iterations.
Submitted 27 December, 2022; v1 submitted 2 June, 2022;
originally announced June 2022.
-
The Past as a Stochastic Process
Authors:
David H. Wolpert,
Michael H. Price,
Stefani A. Crabtree,
Timothy A. Kohler,
Jurgen Jost,
James Evans,
Peter F. Stadler,
Hajime Shimao,
Manfred D. Laubichler
Abstract:
Historical processes manifest remarkable diversity. Nevertheless, scholars have long attempted to identify patterns and categorize historical actors and influences with some success. A stochastic process framework provides a structured approach for the analysis of large historical datasets that allows for detection of sometimes surprising patterns, identification of relevant causal actors both endogenous and exogenous to the process, and comparison between different historical cases. The combination of data, analytical tools and the organizing theoretical framework of stochastic processes complements traditional narrative approaches in history and archaeology.
Submitted 10 December, 2021;
originally announced December 2021.
-
Thermodynamic speed limits for co-evolving systems
Authors:
Farita Tasnim,
David H. Wolpert
Abstract:
Previously derived "global" thermodynamic speed limit theorems state that increasing the maximum speed with which a system can evolve between two given probability distributions over its states requires the system to produce more entropy in its evolution. However, these theorems ignore that many systems are not monolithic, but instead comprise multiple subsystems that interact according to an (often sparse) network. Indeed, most naturally-occurring and human-engineered systems of increasing complexity can be decomposed into sets of co-evolving subsystems, where there exist a priori constraints on the dynamics of each subsystem, restricting which other subsystems can affect its dynamics. Here we derive three new speed limit theorems (SLTs) that account for the thermodynamic effects of such constraints. Our first new speed limit strengthens the global speed limit. While our other two SLTs do not have this guarantee, in some situations they are even stronger than our first speed limit. Our results establish that a stochastically evolving system will, on average, produce more entropy in evolving between two distributions within a given time simply due to its comprising multiple, co-evolving subsystems. We illustrate our results with numerical calculations involving a model of two cells sensing and storing information about their environment.
Submitted 4 December, 2021; v1 submitted 26 July, 2021;
originally announced July 2021.
-
Entropy production given constraints on the energy functions
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
We consider the problem of driving a finite-state classical system from some initial distribution $p$ to some final distribution $p'$ with vanishing entropy production (EP), under the constraint that the driving protocols can only use some limited set of energy functions $\mathcal{E}$. Assuming no other constraints on the driving protocol, we derive a simple condition that guarantees that such a transformation can be carried out, which is stated in terms of the smallest probabilities in $\{p,p'\}$ and a graph-theoretic property defined in terms of $\mathcal{E}$. Our results imply that a surprisingly small amount of control over the energy function is sufficient (in particular, any transformation $p\to p'$ can be carried out as soon as one can control some one-dimensional parameter of the energy function, e.g., the energy of a single state). We also derive a lower bound on the EP under more general constraints on the transition rates, which is formulated in terms of a convex optimization problem.
Submitted 21 September, 2021; v1 submitted 11 May, 2021;
originally announced May 2021.
-
The Implications of the No-Free-Lunch Theorems for Meta-induction
Authors:
David H. Wolpert
Abstract:
The important recent book by G. Schurz appreciates that the no-free-lunch theorems (NFL) have major implications for the problem of (meta) induction. Here I review the NFL theorems, emphasizing that they do not only concern the case where there is a uniform prior -- they prove that there are "as many priors" (loosely speaking) for which any induction algorithm $A$ out-generalizes some induction algorithm $B$ as vice-versa. Importantly though, in addition to the NFL theorems, there are many free lunch theorems. In particular, the NFL theorems can only be used to compare the marginal expected performance of an induction algorithm $A$ with the marginal expected performance of an induction algorithm $B$. There is a rich set of free lunches which instead concern the statistical correlations among the generalization errors of induction algorithms. As I describe, the meta-induction algorithms that Schurz advocates as a "solution to Hume's problem" are just an example of such a free lunch based on correlations among the generalization errors of induction algorithms. I end by pointing out that the prior that Schurz advocates, which is uniform over bit frequencies rather than bit patterns, is contradicted by thousands of experiments in statistical physics and by the great success of the maximum entropy procedure in inductive inference.
Submitted 27 July, 2022; v1 submitted 22 March, 2021;
originally announced March 2021.
-
Thermodynamics of nonequilibrium systems with uncertain parameters
Authors:
Jan Korbel,
David H. Wolpert
Abstract:
In the real world, one almost never knows the parameters of a thermodynamic process to infinite precision. Reflecting this, here we investigate how to extend stochastic thermodynamics to systems with uncertain parameters, including uncertain number of heat baths / particle reservoirs, uncertainty in the precise values of temperatures / chemical potentials of those reservoirs, uncertainty in the energy spectrum, uncertainty in the control protocol, etc. We formalize such uncertainty with an (arbitrary) probability measure over all transition rate matrices satisfying local detailed balance. This lets us define the effective thermodynamic quantities by averaging over all LDB-obeying rate matrices. We show that the resultant effective entropy violates the second law of thermodynamics. In contrast to the effective entropy though, the expected stochastic entropy, defined as the ensemble average of the effective trajectory-level entropy, satisfies the second law. We then explicitly calculate the second-order correction to the second law for the case of one heat bath with uncertain temperature. We also derive the detailed fluctuation theorem for expected effective trajectory entropy production for this case, and derive a lower bound for the associated expected work. Next, to ground these formal considerations with experimentally testable bounds on allowed energetics, we derive a bound on the maximal work that can be extracted from systems with arbitrarily uncertain temperature. We end by extending previous work on "thermodynamic value of information", to allow for uncertainty in the time-evolution of the rate matrix.
Submitted 16 March, 2021;
originally announced March 2021.
-
Dependence of integrated, instantaneous, and fluctuating entropy production on the initial state in quantum and classical processes
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
We consider the additional entropy production (EP) incurred by a fixed quantum or classical process on some initial state $ρ$, above the minimum EP incurred by the same process on any initial state. We show that this additional EP, which we term the "mismatch cost of $ρ$", has a universal information-theoretic form: it is given by the contraction of the relative entropy between $ρ$ and the least-dissipative initial state $\varphi$ over time. We derive versions of this result for integrated EP incurred over the course of a process, for trajectory-level fluctuating EP, and for instantaneous EP rate. We also show that mismatch cost for fluctuating EP obeys an integral fluctuation theorem. Our results demonstrate a fundamental relationship between "thermodynamic irreversibility" (generation of EP) and "logical irreversibility" (inability to know the initial state corresponding to a given final state). We use this relationship to derive quantitative bounds on the thermodynamics of quantum error correction and to propose a thermodynamically-operationalized measure of the logical irreversibility of a quantum channel. Our results hold for both finite and infinite dimensional systems, and generalize beyond EP to many other thermodynamic costs, including nonadiabatic EP, free energy loss, and entropy gain.
Submitted 5 June, 2022; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Thermodynamic Uncertainty Relations for Multipartite Processes
Authors:
Gülce Kardeş,
David H. Wolpert
Abstract:
The thermodynamic uncertainty relations (TURs) provide lower bounds on the entropy production (EP) of a system in terms of the statistical precision of an arbitrary current in that system. All conventional TURs derived so far have concerned a single physical system, differing from one another in what properties they require the system to have. However, many physical scenarios of interest involve multiple interacting systems, e.g. organelles within a biological cell. Here we show how to extend the conventional TURs to those scenarios. A common feature of these extended versions of the TURs is that they bound the global EP, jointly generated by the set of interacting systems, in terms of a weighted sum of the precisions of the local currents generated within those systems -- plus an information-theoretic correction term. Importantly, these extended TURs can bound the global EP even when the global system does not meet any of the requirements of the conventional TURs. After deriving these extended TURs we use them to obtain bounds that do not involve the global EP, but instead relate the local EPs of the individual systems and the statistical coupling among the currents generated within those systems. We derive such bounds for both scalar-valued and vector-valued currents within each system. We illustrate our results with numerical experiments.
Submitted 18 March, 2021; v1 submitted 5 January, 2021;
originally announced January 2021.
-
Noisy Deductive Reasoning: How Humans Construct Math, and How Math Constructs Universes
Authors:
David H. Wolpert,
David Kinney
Abstract:
We present a computational model of mathematical reasoning according to which mathematics is a fundamentally stochastic process. That is, on our model, whether or not a given formula is deemed a theorem in some axiomatic system is not a matter of certainty, but is instead governed by a probability distribution. We then show that this framework gives a compelling account of several aspects of mathematical practice. These include: 1) the way in which mathematicians generate research programs, 2) the applicability of Bayesian models of mathematical heuristics, 3) the role of abductive reasoning in mathematics, 4) the way in which multiple proofs of a proposition can strengthen our degree of belief in that proposition, and 5) the nature of the hypothesis that there are multiple formal systems that are isomorphic to physically possible universes. Thus, by embracing a model of mathematics as not perfectly predictable, we generate a new and fruitful perspective on the epistemology and practice of mathematics.
Submitted 28 October, 2020;
originally announced December 2020.
-
Stochastic thermodynamics and fluctuation theorems for non-linear systems
Authors:
Jan Korbel,
David H. Wolpert
Abstract:
We extend stochastic thermodynamics by relaxing the two assumptions that the Markovian dynamics must be linear and that the equilibrium distribution must be a Boltzmann distribution. We show that if we require the second law to hold when those assumptions are relaxed, then it cannot be formulated in terms of Shannon entropy. However, thermodynamic consistency is salvaged if we reformulate the second law in terms of generalized entropy; our first result is an equation relating the precise form of the non-linear master equation to the precise associated generalized entropy which results in thermodynamic consistency. We then build on this result to extend the usual trajectory-level definitions of thermodynamic quantities so that they remain appropriate even when the two assumptions are relaxed. We end by using these trajectory-level definitions to derive extended versions of the Crooks fluctuation theorem and Jarzynski equality which apply when the two assumptions are relaxed.
Submitted 13 November, 2020;
originally announced November 2020.
-
Work, entropy production, and thermodynamics of information under protocol constraints
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
In many real-world situations, there are constraints on the ways in which a physical system can be manipulated. We investigate the entropy production (EP) and extractable work involved in bringing a system from some initial distribution $p$ to some final distribution $p'$, given that the set of master equations available to the driving protocol obeys some constraints. We first derive general bounds on EP and extractable work, as well as a decomposition of the nonequilibrium free energy into an "accessible free energy" (which can be extracted as work, given a set of constraints) and an "inaccessible free energy" (which must be dissipated as EP). In a similar vein, we consider the thermodynamics of information in the presence of constraints, and decompose the information acquired in a measurement into "accessible" and "inaccessible" components. This decomposition allows us to consider the thermodynamic efficiency of different measurements of the same system, given a set of constraints. We use our framework to analyze protocols subject to symmetry, modularity, and coarse-grained constraints, and consider various examples including the Szilard box, the 2D Ising model, and a multi-particle flashing ratchet.
Submitted 19 October, 2021; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Strengthened second law for multi-dimensional systems coupled to multiple thermodynamic reservoirs
Authors:
David H. Wolpert
Abstract:
The second law of thermodynamics can be formulated as a restriction on the evolution of the entropy of any system undergoing Markovian dynamics. Here I show that this form of the second law is strengthened for multi-dimensional, complex systems, coupled to multiple thermodynamic reservoirs, if we have a set of a priori constraints restricting how the dynamics of each coordinate can depend on the other coordinates. As an example, this strengthened second law (SSL) applies to complex systems composed of multiple physically separated, co-evolving subsystems, each identified as a coordinate of the overall system. In this example, the constraints concern how the dynamics of some subsystems are allowed to depend on the states of the other subsystems. Importantly, the SSL applies to such complex systems even if some of their subsystems can change state simultaneously, which is prohibited in a multipartite process. The SSL also strengthens previously derived bounds on how much work can be extracted from a system using feedback control, if the system is multi-dimensional. Importantly, the SSL does not require local detailed balance. So it potentially applies to complex systems ranging from interacting economic agents to co-evolving biological species.
Submitted 13 February, 2024; v1 submitted 21 July, 2020;
originally announced July 2020.
-
What is important about the No Free Lunch theorems?
Authors:
David H. Wolpert
Abstract:
The No Free Lunch theorems prove that under a uniform distribution over induction problems (search problems or learning problems), all induction algorithms perform equally. As I discuss in this chapter, the importance of the theorems arises by using them to analyze scenarios involving non-uniform distributions, and to compare different algorithms, without any assumption about the distribution over problems at all. In particular, the theorems prove that anti-cross-validation (choosing among a set of candidate algorithms based on which has worst out-of-sample behavior) performs as well as cross-validation, unless one makes an assumption -- which has never been formalized -- about how the distribution over induction problems, on the one hand, is related to the set of algorithms one is choosing among using (anti-)cross validation, on the other. In addition, they establish strong caveats concerning the significance of the many results in the literature which establish the strength of a particular algorithm without assuming a particular distribution. They also motivate a "dictionary" between supervised learning and blackbox optimization, which allows one to "translate" techniques from supervised learning into the domain of blackbox optimization, thereby strengthening blackbox optimization algorithms. In addition to these topics, I also briefly discuss their implications for philosophy of science.
Submitted 21 July, 2020;
originally announced July 2020.
-
Fluctuation theorems for multiple co-evolving systems
Authors:
David H. Wolpert
Abstract:
All previously derived thermodynamic fluctuation theorems (FTs) that concern multiple co-evolving systems have required that each system can only change its state during an associated pre-fixed, limited set of time intervals. However, in many real-world cases the times when systems change their states are randomly determined, e.g., in almost all biological examples of co-evolving systems. Such randomness in the timing drastically modifies the thermodynamics. Here I derive FTs that apply whether or not the timing is random. These FTs provide new versions of the second law, and of all conventional thermodynamic uncertainty relations (TURs). These new results are often stronger than the conventional versions, which ignore how an overall system may decompose into a set of co-evolving systems. In addition, the new TURs often bound entropy production (EP) of the overall system even if none of the criteria for a conventional TUR (e.g., being a non-equilibrium steady state) hold for that overall system.
Submitted 14 April, 2021; v1 submitted 24 March, 2020;
originally announced March 2020.
-
Minimal entropy production due to constraints on rate matrix dependencies in multipartite processes
Authors:
David H Wolpert
Abstract:
I consider multipartite processes in which there are constraints on each subsystem's rate matrix, restricting which other subsystems can directly affect its dynamics. I derive a strictly nonzero lower bound on the minimal achievable entropy production rate of the process in terms of these constraints on the rate matrices of its subsystems. The bound is based on constructing counterfactual rate matrices, in which some subsystems are held fixed while the others are allowed to evolve. This bound is related to the "learning rate" of stationary bipartite systems, and more generally to the "information flow" in bipartite systems.
Submitted 13 May, 2020; v1 submitted 7 January, 2020;
originally announced January 2020.
-
Thermodynamic costs of Turing Machines
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
Turing Machines (TMs) are the canonical model of computation in computer science and physics. We combine techniques from algorithmic information theory and stochastic thermodynamics to analyze the thermodynamic costs of TMs. We consider two different ways of realizing a given TM with a physical process. The first realization is designed to be thermodynamically reversible when fed with random input bits. The second realization is designed to generate less heat, up to an additive constant, than any realization that is computable (i.e., consistent with the physical Church-Turing thesis). We consider three different thermodynamic costs: the heat generated when the TM is run on each input (which we refer to as the "heat function"), the minimum heat generated when a TM is run with an input that results in some desired output (which we refer to as the "thermodynamic complexity" of the output, in analogy to the Kolmogorov complexity), and the expected heat on the input distribution that minimizes entropy production. For universal TMs, we show for both realizations that the thermodynamic complexity of any desired output is bounded by a constant (unlike the conventional Kolmogorov complexity), while the expected amount of generated heat is infinite. We also show that any computable realization faces a fundamental tradeoff between heat generation, the Kolmogorov complexity of its heat function, and the Kolmogorov complexity of its input-output map. We demonstrate this tradeoff by analyzing the thermodynamics of erasing a long string.
Submitted 20 August, 2020; v1 submitted 10 December, 2019;
originally announced December 2019.
-
Uncertainty relations and fluctuation theorems for Bayes nets
Authors:
David H. Wolpert
Abstract:
Recent research has considered the stochastic thermodynamics of multiple interacting systems, representing the overall system as a Bayes net. I derive fluctuation theorems governing the entropy production (EP) of arbitrary sets of the systems in such a Bayes net. I also derive "conditional" fluctuation theorems, governing the distribution of EP in one set of systems conditioned on the EP of a different set of systems. I then derive thermodynamic uncertainty relations relating the EP of the overall system to the precisions of probability currents within the individual systems.
Submitted 1 June, 2020; v1 submitted 6 November, 2019;
originally announced November 2019.
-
Thermodynamic Computing
Authors:
Tom Conte,
Erik DeBenedictis,
Natesh Ganesh,
Todd Hylton,
John Paul Strachan,
R. Stanley Williams,
Alexander Alemi,
Lee Altenberg,
Gavin Crooks,
James Crutchfield,
Lidia del Rio,
Josh Deutsch,
Michael DeWeese,
Khari Douglas,
Massimiliano Esposito,
Michael Frank,
Robert Fry,
Peter Harsha,
Mark Hill,
Christopher Kello,
Jeff Krichmar,
Suhas Kumar,
Shih-Chii Liu,
Seth Lloyd,
Matteo Marsili,
et al. (14 additional authors not shown)
Abstract:
The hardware and software foundations laid in the first half of the 20th Century enabled the computing technologies that have transformed the world, but these foundations are now under siege. The current computing paradigm, which is the foundation of much of the current standards of living that we now enjoy, faces fundamental limitations that are evident from several perspectives. In terms of hardware, devices have become so small that we are struggling to eliminate the effects of thermodynamic fluctuations, which are unavoidable at the nanometer scale. In terms of software, our ability to imagine and program effective computational abstractions and implementations is clearly challenged in complex domains. In terms of systems, currently five percent of the power generated in the US is used to run computing systems - this astonishing figure is neither ecologically sustainable nor economically scalable. Economically, the cost of building next-generation semiconductor fabrication plants has soared past $10 billion. All of these difficulties - device scaling, software complexity, adaptability, energy consumption, and fabrication economics - indicate that the current computing paradigm has matured and that continued improvements along this path will be limited. If technological progress is to continue and corresponding social and economic benefits are to continue to accrue, computing must become much more capable, energy efficient, and affordable. We propose that progress in computing can continue under a united, physically grounded, computational paradigm centered on thermodynamics. Herein we propose a research agenda to extend these thermodynamic foundations into complex, non-equilibrium, self-organizing systems and apply them holistically to future computing systems that will harness nature's innate computational capacity. We call this type of computing "Thermodynamic Computing" or TC.
Submitted 14 November, 2019; v1 submitted 5 November, 2019;
originally announced November 2019.
-
Stochastic thermodynamics of computation
Authors:
David H. Wolpert
Abstract:
One of the major resource requirements of computers - ranging from biological cells to human brains to high-performance (engineered) computers - is the energy used to run them. Those costs of performing a computation have long been a focus of research in physics, going back to the early work of Landauer. One of the most prominent aspects of computers is that they are inherently nonequilibrium systems. However, the early research was done when nonequilibrium statistical physics was in its infancy, which meant the work was formulated in terms of equilibrium statistical physics. Since then there have been major breakthroughs in nonequilibrium statistical physics, which are allowing us to investigate the myriad aspects of the relationship between statistical physics and computation, extending well beyond the issue of how much work is required to erase a bit. In this paper I review some of this recent work on the "stochastic thermodynamics of computation". After reviewing the salient parts of information theory, computer science theory, and stochastic thermodynamics, I summarize what has been learned about the entropic costs of performing a broad range of computations, extending from bit erasure to loop-free circuits to logically reversible circuits to information ratchets to Turing machines. These results reveal new, challenging engineering problems for how to design computers to have minimal thermodynamic costs. They also allow us to start to combine computer science theory and stochastic thermodynamics at a foundational level, thereby expanding both.
Submitted 16 February, 2023; v1 submitted 14 May, 2019;
originally announced May 2019.
-
Overview of Information Theory, Computer Science Theory, and Stochastic Thermodynamics for Thermodynamics of Computation
Authors:
David H. Wolpert
Abstract:
I give a quick overview of some of the theoretical background necessary for using modern non-equilibrium statistical physics to investigate the thermodynamics of computation. I first present some of the necessary concepts from information theory, and then introduce some of the most important types of computational machine considered in computer science theory.
After this I present a central result from modern non-equilibrium statistical physics: an exact expression for the entropy flow out of a system undergoing a given dynamics with a given initial distribution over states. This central expression is crucial for analyzing how the total entropy flow out of a computer depends on its global structure, since that global structure determines the initial distributions into all of the computer's subsystems, and therefore (via the central expression) the entropy flows generated by all of those subsystems. I illustrate these results by analyzing some of the subtleties concerning the benefits that are sometimes claimed for implementing an irreversible computation with a reversible circuit constructed out of Fredkin gates.
Submitted 18 June, 2019; v1 submitted 30 December, 2018;
originally announced January 2019.
-
Semantic information, autonomous agency, and nonequilibrium statistical physics
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
Shannon information theory provides various measures of so-called "syntactic information", which reflect the amount of statistical correlation between systems. In contrast, the concept of "semantic information" refers to those correlations which carry significance or "meaning" for a given system. Semantic information plays an important role in many fields, including biology, cognitive science, and philosophy, and there has been a long-standing interest in formulating a broadly applicable and formal theory of semantic information. In this paper we introduce such a theory. We define semantic information as the syntactic information that a physical system has about its environment which is causally necessary for the system to maintain its own existence. "Causal necessity" is defined in terms of counter-factual interventions which scramble correlations between the system and its environment, while "maintaining existence" is defined in terms of the system's ability to keep itself in a low entropy state. We also use recent results in nonequilibrium statistical physics to analyze semantic information from a thermodynamic point of view. Our framework is grounded in the intrinsic dynamics of a system coupled to an environment, and is applicable to any physical system, living or otherwise. It leads to formal definitions of several concepts that have been intuitively understood to be related to semantic information, including "value of information", "semantic content", and "agency".
Submitted 7 November, 2018; v1 submitted 20 June, 2018;
originally announced June 2018.
-
Thermodynamics of computing with circuits
Authors:
David Hilton Wolpert,
Artemy Kolchinsky
Abstract:
Digital computers implement computations using circuits, as do many naturally occurring systems (e.g., gene regulatory networks). The topology of any such circuit restricts which variables may be physically coupled during the operation of a circuit. We investigate how such restrictions on the physical coupling affect the thermodynamic costs of running the circuit. To do this we first calculate the minimal additional entropy production that arises when we run a given gate in a circuit. We then build on this calculation, to analyze how the thermodynamic costs of implementing a computation with a full circuit, comprising multiple connected gates, depend on the topology of that circuit. This analysis provides a rich new set of optimization problems that must be addressed by any designer of a circuit, if they wish to minimize thermodynamic costs.
Submitted 20 July, 2023; v1 submitted 11 June, 2018;
originally announced June 2018.
-
Upgrading from Gaussian Processes to Student's-T Processes
Authors:
Brendan D. Tracey,
David H. Wolpert
Abstract:
Gaussian process priors are commonly used in aerospace design for performing Bayesian optimization. Nonetheless, Gaussian processes suffer from two significant drawbacks: outliers are a priori assumed unlikely, and the posterior variance conditioned on observed data depends only on the locations of those data, not the associated sample values. Student's-T processes are a generalization of Gaussian processes, founded on the Student's-T distribution instead of the Gaussian distribution. Student's-T processes maintain the primary advantages of Gaussian processes (kernel function, analytic update rule) with additional benefits beyond Gaussian processes. The Student's-T distribution has higher kurtosis than a Gaussian distribution and so outliers are much more likely, and the posterior variance increases or decreases depending on the variance of observed data sample values. Here, we describe Student's-T processes, and discuss their advantages in the context of aerospace optimization. We show how to construct a Student's-T process using a kernel function and how to update the process given new samples. We provide a clear derivation of optimization-relevant quantities such as expected improvement, and contrast with the related computations for Gaussian processes. Finally, we compare the performance of Student's-T processes against Gaussian processes on canonical test problems in Bayesian optimization, and apply the Student's-T process to the optimization of an aerostructural design problem.
Submitted 18 January, 2018;
originally announced January 2018.
-
Constraints on physical reality arising from a formalization of knowledge
Authors:
David Wolpert
Abstract:
There are (at least) four ways that an agent can acquire information concerning the state of the universe: via observation, control, prediction, or via retrodiction, i.e., memory. Each of these four ways of acquiring information seems to rely on a different kind of physical device (resp., an observation device, a control device, etc.). However it turns out that certain mathematical structure is common to those four types of device. Any device that possesses a certain subset of that structure is known as an "inference device" (ID).
Here I review some of the properties of IDs, including their relation with Turing machines, and (more loosely) quantum mechanics. I also review the bounds of the joint abilities of any set of IDs to know facts about the physical universe that contains them. These bounds constrain the possible properties of any universe that contains agents who can acquire information concerning that universe.
I then extend this previous work on IDs, by adding to the definition of IDs some of the other mathematical structure that is common to the four ways of acquiring information about the universe but is not captured in the (minimal) definition of IDs. I discuss these extensions of IDs in the context of epistemic logic (especially possible worlds formalisms like Kripke structures and Aumann structures). In particular, I show that these extensions of IDs are not subject to the problem of logical omniscience that plagues many previously studied forms of epistemic logic.
Submitted 27 June, 2018; v1 submitted 9 November, 2017;
originally announced November 2017.
-
Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes
Authors:
Kunal Menda,
Yi-Chun Chen,
Justin Grana,
James W. Bono,
Brendan D. Tracey,
Mykel J. Kochenderfer,
David Wolpert
Abstract:
The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems. Since macro-actions last for stochastic durations, multiple agents executing decentralized policies in cooperative environments must act asynchronously. We present an algorithm that modifies generalized advantage estimation for temporally extended actions, allowing a state-of-the-art policy optimization algorithm to optimize policies in Dec-POMDPs in which agents act asynchronously. We show that our algorithm is capable of learning optimal policies in two cooperative domains, one involving real-time bus holding control and one involving wildfire fighting with unmanned aircraft. Our algorithm works by framing problems as "event-driven decision processes," which are scenarios in which the sequence and timing of actions and events are random and governed by an underlying stochastic process. In addition to optimizing policies with continuous state and action spaces, our algorithm also facilitates the use of event-driven simulators, which do not require time to be discretized into time-steps. We demonstrate the benefit of using event-driven simulation in the context of multiple agents taking asynchronous actions. We show that fixed time-step simulation risks obfuscating the sequence in which closely separated events occur, adversely affecting the policies learned. In addition, we show that arbitrarily shrinking the time-step scales poorly with the number of agents.
Submitted 29 May, 2019; v1 submitted 19 September, 2017;
originally announced September 2017.
-
Number of hidden states needed to physically implement a given conditional distribution
Authors:
Jeremy A. Owen,
Artemy Kolchinsky,
David H. Wolpert
Abstract:
We consider the problem of how to construct a physical process over a finite state space $X$ that applies some desired conditional distribution $P$ to initial states to produce final states. This problem arises often in the thermodynamics of computation and nonequilibrium statistical physics more generally (e.g., when designing processes to implement some desired computation, feedback controller, or Maxwell demon). It was previously known that some conditional distributions cannot be implemented using any master equation that involves just the states in $X$. However, here we show that any conditional distribution $P$ can in fact be implemented---if additional "hidden" states not in $X$ are available. Moreover, we show that it is always possible to implement $P$ in a thermodynamically reversible manner. We then investigate a novel cost of the physical resources needed to implement a given distribution $P$: the minimal number of hidden states needed to do so. We calculate this cost exactly for the special case where $P$ represents a single-valued function, and provide an upper bound for the general case, in terms of the nonnegative rank of $P$. These results show that having access to one extra binary degree of freedom, thus doubling the total number of states, is sufficient to implement any $P$ with a master equation in a thermodynamically reversible way, if there are no constraints on the allowed form of the master equation. (Such constraints can greatly increase the minimal needed number of hidden states.) Our results also imply that for certain $P$ that can be implemented without hidden states, having hidden states permits an implementation that generates less heat.
Submitted 13 October, 2019; v1 submitted 3 September, 2017;
originally announced September 2017.
-
A space-time tradeoff for implementing a function with master equation dynamics
Authors:
David H. Wolpert,
Artemy Kolchinsky,
Jeremy A. Owen
Abstract:
Master equations are commonly used to model the dynamics of physical systems, including systems that implement single-valued functions like a computer's update step. However, many such functions cannot be implemented by any master equation, even approximately, which raises the question of how they can occur in the real world. Here we show how any function over some "visible" states can be implemented with master equation dynamics--if the dynamics exploits additional, "hidden" states at intermediate times. We also show that any master equation implementing a function can be decomposed into a sequence of "hidden" timesteps, demarcated by changes in what state-to-state transitions have nonzero probability. In many real-world situations there is a cost both for more hidden states and for more hidden timesteps. Accordingly, we derive a "space-time" tradeoff between the number of hidden states and the number of hidden timesteps needed to implement any given function.
Submitted 21 April, 2019; v1 submitted 28 August, 2017;
originally announced August 2017.
-
The thermodynamic efficiency of computations made in cells across the range of life
Authors:
Christopher P. Kempes,
David Wolpert,
Zachary Cohen,
Juan Pérez-Mercader
Abstract:
Biological organisms must perform computation as they grow, reproduce, and evolve. Moreover, ever since Landauer's bound was proposed it has been known that all computation has some thermodynamic cost -- and that the same computation can be achieved with greater or smaller thermodynamic cost depending on how it is implemented. Accordingly an important issue concerning the evolution of life is assessing the thermodynamic efficiency of the computations performed by organisms. This issue is interesting both from the perspective of how close life has come to maximally efficient computation (presumably under the pressure of natural selection), and from the practical perspective of what efficiencies we might hope that engineered biological computers might achieve, especially in comparison with current computational systems. Here we show that the computational efficiency of translation, defined as free energy expended per amino acid operation, outperforms the best supercomputers by several orders of magnitude, and is only about an order of magnitude worse than the Landauer bound. However this efficiency depends strongly on the size and architecture of the cell in question. In particular, we show that the useful efficiency of an amino acid operation, defined as the bulk energy per amino acid polymerization, decreases for increasing bacterial size and converges to the polymerization cost of the ribosome. This cost of the largest bacteria does not change in cells as we progress through the major evolutionary shifts to both single and multicellular eukaryotes. However, the rates of total computation per unit mass are nonmonotonic in bacteria with increasing cell size, and also change across different biological architectures including the shift from unicellular to multicellular eukaryotes.
Submitted 15 June, 2017;
originally announced June 2017.
-
When is a bit worth much more than kT ln2?
Authors:
Can Gokler,
Artemy Kolchinsky,
Zi-Wen Liu,
Iman Marvian,
Peter Shor,
Oles Shtanko,
Kevin Thompson,
David Wolpert,
Seth Lloyd
Abstract:
Physical processes that obtain, process, and erase information involve tradeoffs between information and energy. The fundamental energetic value of a bit of information exchanged with a reservoir at temperature T is kT ln2. This paper investigates the situation in which information is missing about just what physical process is about to take place. The fundamental energetic value of such information can be far greater than kT ln2 per bit.
Submitted 26 May, 2017;
originally announced May 2017.
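For a sense of scale, the baseline value kT ln2 quoted in the abstract can be evaluated numerically. The following is a minimal sketch, not taken from the paper: the reservoir temperature of 300 K and the function name `landauer_energy_per_bit` are illustrative assumptions.

```python
import math

# Boltzmann constant in J/K (exact value in the 2019 SI definition).
BOLTZMANN_K = 1.380649e-23


def landauer_energy_per_bit(temperature_kelvin):
    """Return kT ln 2, in joules, for a reservoir at the given temperature."""
    return BOLTZMANN_K * temperature_kelvin * math.log(2)


# Assumed illustrative temperature (roughly room temperature), not from the paper.
room_temperature = 300.0
energy = landauer_energy_per_bit(room_temperature)
print(f"kT ln 2 at {room_temperature} K = {energy:.3e} J")
```

At this assumed temperature the baseline is on the order of 10^-21 J per bit; the abstract's claim is that information about which process will occur can be worth far more than this.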
-
Nonlinear Information Bottleneck
Authors:
Artemy Kolchinsky,
Brendan D. Tracey,
David H. Wolpert
Abstract:
Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$. IB works by encoding $X$ in a compressed "bottleneck" random variable $M$ from which $Y$ can be accurately decoded. However, finding the optimal bottleneck variable involves a difficult optimization problem, which until recently has been considered for only two limited cases: discrete $X$ and $Y$ with small state spaces, and continuous $X$ and $Y$ with a Gaussian joint distribution (in which case optimal encoding and decoding maps are linear). We propose a method for performing IB on arbitrarily-distributed discrete and/or continuous $X$ and $Y$, while allowing for nonlinear encoding and decoding maps. Our approach relies on a novel non-parametric upper bound for mutual information. We describe how to implement our method using neural networks. We then show that it achieves better performance than the recently-proposed "variational IB" method on several real-world datasets.
Submitted 30 November, 2019; v1 submitted 5 May, 2017;
originally announced May 2017.
-
Maximizing free energy gain
Authors:
Artemy Kolchinsky,
Iman Marvian,
Can Gokler,
Zi-Wen Liu,
Peter Shor,
Oles Shtanko,
Kevin Thompson,
David Wolpert,
Seth Lloyd
Abstract:
Free energy is energy that is available to do work. Maximizing the free energy gain and the gain in work that can be extracted from a system is important for a wide variety of physical and technological processes, from energy harvesting processes such as photosynthesis to energy storage systems such as fuels and batteries. This paper extends recent results from non-equilibrium thermodynamics and quantum resource theory to derive closed-form solutions for the maximum possible gain in free energy and extractable work that can be obtained by varying the initial states of classical and quantum stochastic processes. Simple formulae allow the comparison of the free energy increase for the optimal procedure with that for a sub-optimal procedure. The problem of finding the optimal free-energy harvesting procedure is shown to be convex and solvable via gradient descent.
Submitted 28 April, 2017;
originally announced May 2017.
-
Modeling Social Organizations as Communication Networks
Authors:
David Wolpert,
Justin Grana,
Brendan Tracey,
Tim Kohler,
Artemy Kolchinsky
Abstract:
We identify the "organization" of a human social group as the communication network(s) within that group. We then introduce three theoretical approaches to analyzing what determines the structures of human organizations. All three approaches adopt a group-selection perspective, so that the group's network structure is (approximately) optimal, given the information-processing limitations of agents within the social group, and the exogenous welfare function of the overall group. In the first approach we use a new sub-field of telecommunications theory called network coding, and focus on a welfare function that involves the ability of the organization to convey information among the agents. In the second approach we focus on a scenario where agents within the organization must allocate their future communication resources when the state of the future environment is uncertain. We show how this formulation can be solved with a linear program. In the third approach, we introduce an information synthesis problem in which agents within an organization receive information from various sources and must decide how to transform such information and transmit the results to other agents in the organization. We propose leveraging the computational power of neural networks to solve such problems. These three approaches formalize and synthesize work in fields including anthropology, archeology, economics and psychology that deal with organization structure, theory of the firm, span of control and cognitive limits on communication.
Submitted 14 February, 2017;
originally announced February 2017.
-
Coarse-graining and the Blackwell order
Authors:
Johannes Rauh,
Pradeep Kr. Banerjee,
Eckehard Olbrich,
Jürgen Jost,
Nils Bertschinger,
David Wolpert
Abstract:
Suppose we have a pair of information channels, $κ_{1},κ_{2}$, with a common input. The Blackwell order is a partial order over channels that compares $κ_{1}$ and $κ_{2}$ by the maximal expected utility an agent can obtain when decisions are based on the channel outputs. Equivalently, $κ_{1}$ is said to be Blackwell-inferior to $κ_{2}$ if and only if $κ_{1}$ can be constructed by garbling the output of $κ_{2}$. A related partial order stipulates that $κ_{2}$ is more capable than $κ_{1}$ if the mutual information between the input and output is larger for $κ_{2}$ than for $κ_{1}$ for any distribution over inputs. A Blackwell-inferior channel is necessarily less capable. However, examples are known where $κ_{1}$ is less capable than $κ_{2}$ but not Blackwell-inferior. We show that this may even happen when $κ_{1}$ is constructed by coarse-graining the inputs of $κ_{2}$. Such a coarse-graining is a special kind of "pre-garbling" of the channel inputs. This example directly establishes that the expected value of the shared utility function for the coarse-grained channel is larger than it is for the non-coarse-grained channel. This contradicts the intuition that coarse-graining can only destroy information and lead to inferior channels. We also discuss our results in the context of information decompositions.
Submitted 10 November, 2017; v1 submitted 26 January, 2017;
originally announced January 2017.
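The garbling relation described in the abstract can be illustrated with a small numerical sketch. It builds a channel $κ_{1}$ by post-processing the output of $κ_{2}$ with a stochastic matrix, then checks that $κ_{1}$ never carries more mutual information than $κ_{2}$ over a grid of input distributions (the "less capable" relation). The specific matrices and the helper names `mutual_information` and `garble` are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's coarse-graining counterexample.

```python
import math


def mutual_information(p_x, channel):
    """I(X;Y) in bits, where channel[x][y] = P(y | x) and p_x is the input law."""
    n_out = len(channel[0])
    p_y = [sum(p_x[x] * channel[x][y] for x in range(len(p_x)))
           for y in range(n_out)]
    total = 0.0
    for x, px in enumerate(p_x):
        for y in range(n_out):
            joint = px * channel[x][y]
            if joint > 0:
                total += joint * math.log2(channel[x][y] / p_y[y])
    return total


def garble(channel, garbling):
    """Post-process a channel's output with garbling[y][z] = P(z | y)."""
    n_z = len(garbling[0])
    return [[sum(row[y] * garbling[y][z] for y in range(len(row)))
             for z in range(n_z)]
            for row in channel]


# Hypothetical example channels (rows are inputs, columns are outputs).
kappa_2 = [[0.9, 0.1], [0.2, 0.8]]
noise = [[0.7, 0.3], [0.3, 0.7]]   # the garbling applied to kappa_2's output
kappa_1 = garble(kappa_2, noise)   # Blackwell-inferior to kappa_2 by construction

# Compare mutual information for several input distributions.
for k in range(1, 10):
    p_x = [k / 10, 1 - k / 10]
    i_1 = mutual_information(p_x, kappa_1)
    i_2 = mutual_information(p_x, kappa_2)
    print(f"P(x=0)={p_x[0]:.1f}  I(kappa_1)={i_1:.4f}  I(kappa_2)={i_2:.4f}")
```

The inequality observed here is the data-processing inequality; the abstract's point is that the converse implication fails, so a less capable channel need not be a garbling of the more capable one.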
-
A quantitative definition of organismality and its application to lichen
Authors:
Eric Libby,
Joshua Grochow,
Simon DeDeo,
David Wolpert
Abstract:
The organism is a fundamental concept in biology. However there is no universally accepted, formal, and yet broadly applicable definition of what an organism is. Here we introduce a candidate definition. We adopt the view that the "organism" is a functional concept, used by scientists to address particular questions concerning the future state of a biological system, rather than something wholly defined by that system. In this approach organisms are a coarse-graining of a fine-grained dynamical model of a biological system. Crucially, the coarse-graining of the system into organisms is chosen so that their dynamics can be used by scientists to make accurate predictions of those features of the biological system that interest them, and do so with minimal computational burden. To illustrate our framework we apply it to a dynamic model of lichen symbiosis---a system where either the lichen or its constituent fungi and algae could reasonably be considered "organisms." We find that the best choice for what organisms are in this scenario is complex mixtures of many entities that do not resemble standard notions of organisms. When we restrict our allowed coarse-grainings to more traditional types of organisms, we find that ecological conditions, such as niche competition and predation pressure, play a significant role in determining the best choice for organisms.
Submitted 30 November, 2016;
originally announced December 2016.
-
A Likelihood Ratio Detector for Identifying Within-Perimeter Computer Network Attacks
Authors:
Justin Grana,
David Wolpert,
Joshua Neil,
Tanmoy Bhattacharya,
Russel Bent
Abstract:
The rapid detection of attackers within firewalls of enterprise computer networks is of paramount importance. Anomaly detectors address this problem by quantifying deviations from baseline statistical models of normal network behavior and signaling an intrusion when the observed data deviates significantly from the baseline model. However, many anomaly detectors do not take into account plausible attacker behavior. As a result, anomaly detectors are prone to a large number of false positives due to unusual but benign activity. This paper first introduces a stochastic model of attacker behavior which is motivated by real world attacker traversal. Then, we develop a likelihood ratio detector that compares the probability of observed network behavior under normal conditions against the case when an attacker has possibly compromised a subset of hosts within the network. Since the likelihood ratio detector requires integrating over the time each host becomes compromised, we illustrate how to use Monte Carlo methods to compute the requisite integral. We then present Receiver Operating Characteristic (ROC) curves for various network parameterizations that show for any rate of true positives, the rate of false positives for the likelihood ratio detector is no higher than that of a simple anomaly detector and is often lower. We conclude by demonstrating the superiority of the proposed likelihood ratio detector when the network topologies and parameterizations are extracted from real-world networks.
Submitted 1 September, 2016;
originally announced September 2016.
-
Dependence of dissipation on the initial distribution over states
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
We analyze how the amount of work dissipated by a fixed nonequilibrium process depends on the initial distribution over states. Specifically, we compare the amount of dissipation when the process is used with some specified initial distribution to the minimal amount of dissipation possible for any initial distribution. We show that the difference between those two amounts of dissipation is given by a simple information-theoretic function that depends only on the initial and final state distributions. Crucially, this difference is independent of the details of the process relating those distributions. We then consider how dissipation depends on the initial distribution for a 'computer', i.e., a nonequilibrium process whose dynamics over coarse-grained macrostates implement some desired input-output map. We show that our results still apply when stated in terms of distributions over the computer's coarse-grained macrostates. This can be viewed as a novel thermodynamic cost of computation, reflecting changes in the distribution over inputs rather than the logical dynamics of the computation.
Submitted 22 August, 2017; v1 submitted 4 July, 2016;
originally announced July 2016.
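The abstract does not spell out the "simple information-theoretic function". The sketch below assumes it is the drop in Kullback-Leibler divergence between the actual and the dissipation-minimizing distributions from the start to the end of the process, measured in units of kT. Both that identification and every numerical value (the transition matrix, the optimal initial distribution `q0`, the actual initial distribution `r0`) are assumptions made for illustration, not data from the paper.

```python
import math


def kl_divergence(p, q):
    """D(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def push_forward(p, transition):
    """Final distribution, where transition[i][j] = P(final = j | initial = i)."""
    n_final = len(transition[0])
    return [sum(p[i] * transition[i][j] for i in range(len(p)))
            for j in range(n_final)]


def extra_dissipation_in_kT(r0, q0, transition):
    """Assumed form: D(r0 || q0) - D(r1 || q1), in units of kT."""
    r1 = push_forward(r0, transition)
    q1 = push_forward(q0, transition)
    return kl_divergence(r0, q0) - kl_divergence(r1, q1)


# Hypothetical fixed process over three states.
transition = [[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]]
q0 = [0.3, 0.3, 0.4]   # assumed dissipation-minimizing initial distribution
r0 = [0.7, 0.2, 0.1]   # assumed actual initial distribution

extra = extra_dissipation_in_kT(r0, q0, transition)
print(f"extra dissipation = {extra:.4f} kT")
```

Under this assumed form the quantity depends on the process only through the initial and final distributions, and it is nonnegative because KL divergence cannot increase under a fixed stochastic map.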