-
Implications of computer science theory for the simulation hypothesis
Authors:
David H. Wolpert
Abstract:
The simulation hypothesis has recently excited renewed interest, especially in the physics and philosophy communities. However, the hypothesis specifically concerns \textit{computers} that simulate physical universes, which means that to formally investigate it we need to couple computer science theory with physics. Here I couple those fields with the physical Church-Turing thesis. I then exploit that coupling to investigate some of the computer science theory aspects of the simulation hypothesis. In particular, I use Kleene's second recursion theorem to prove that it is mathematically possible for us to be a simulation that is being run on a computer - by us. In such a self-simulation, there would be two identical instances of us; the question of which of those is ``really us'' is meaningless. I also show how Rice's theorem provides some interesting impossibility results concerning simulation and self-simulation; briefly describe the philosophical implications of fully homomorphic encryption for (self-)simulation; and briefly investigate the graphical structure of universes simulating universes simulating universes ..., among other issues. I end by describing some of the possible avenues for future research.
Submitted 22 June, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Entropy production in communication channels
Authors:
Farita Tasnim,
Nahuel Freitas,
David H. Wolpert
Abstract:
In many complex systems, whether biological or artificial, the thermodynamic costs of communication among their components are large. These systems also tend to split information transmitted between any two components across multiple channels. A common hypothesis is that such inverse multiplexing strategies reduce total thermodynamic costs. So far, however, there have been no physics-based results supporting this hypothesis. This gap existed partially because we have lacked a theoretical framework that addresses the interplay of thermodynamics and information in off-equilibrium systems. Here we present the first study that rigorously combines such a framework, stochastic thermodynamics, with Shannon information theory. We develop a minimal model that captures the fundamental features common to a wide variety of communication systems, and study the relationship between the entropy production of the communication process and the channel capacity, the canonical measure of the communication capability of a channel. In contrast to what is assumed in previous works not based on first principles, we show that the entropy production is not always a convex and monotonically increasing function of the channel capacity. However, those two properties are recovered for sufficiently high channel capacity. These results clarify when and how to split a single communication stream across multiple channels.
Submitted 8 February, 2024; v1 submitted 8 February, 2023;
originally announced February 2023.
-
What can we know about that which we cannot even imagine?
Authors:
David H. Wolpert
Abstract:
In this essay I will consider a sequence of questions. The first questions concern the biological function of intelligence in general, and cognitive prostheses of human intelligence in particular. These will lead into questions concerning human language, perhaps the most important cognitive prosthesis humanity has ever developed. While it is traditional to rhapsodize about the cognitive power encapsulated in human language, I will emphasize how horribly limited human language is - and therefore how limited our cognitive abilities are, despite their being augmented with language. This will lead to questions of whether human mathematics, being ultimately formulated in terms of human language, is also deeply limited. I will then combine these questions to pose a partial, sort-of, sideways answer to the guiding concern of this essay: what can we ever discern about that which we cannot even conceive?
Submitted 17 March, 2024; v1 submitted 7 August, 2022;
originally announced August 2022.
-
The Past as a Stochastic Process
Authors:
David H. Wolpert,
Michael H. Price,
Stefani A. Crabtree,
Timothy A. Kohler,
Jurgen Jost,
James Evans,
Peter F. Stadler,
Hajime Shimao,
Manfred D. Laubichler
Abstract:
Historical processes manifest remarkable diversity. Nevertheless, scholars have long attempted to identify patterns and categorize historical actors and influences with some success. A stochastic process framework provides a structured approach for the analysis of large historical datasets that allows for detection of sometimes surprising patterns, identification of relevant causal actors both endogenous and exogenous to the process, and comparison between different historical cases. The combination of data, analytical tools and the organizing theoretical framework of stochastic processes complements traditional narrative approaches in history and archaeology.
Submitted 10 December, 2021;
originally announced December 2021.
-
The Implications of the No-Free-Lunch Theorems for Meta-induction
Authors:
David H. Wolpert
Abstract:
The important recent book by G. Schurz appreciates that the no-free-lunch theorems (NFL) have major implications for the problem of (meta) induction. Here I review the NFL theorems, emphasizing that they do not only concern the case where there is a uniform prior -- they prove that there are "as many priors" (loosely speaking) for which any induction algorithm $A$ out-generalizes some induction algorithm $B$ as vice-versa. Importantly though, in addition to the NFL theorems, there are many {free lunch} theorems. In particular, the NFL theorems can only be used to compare the {marginal} expected performance of an induction algorithm $A$ with the marginal expected performance of an induction algorithm $B$. There is a rich set of free lunches which instead concern the statistical correlations among the generalization errors of induction algorithms. As I describe, the meta-induction algorithms that Schurz advocates as a "solution to Hume's problem" are just an example of such a free lunch based on correlations among the generalization errors of induction algorithms. I end by pointing out that the prior that Schurz advocates, which is uniform over bit frequencies rather than bit patterns, is contradicted by thousands of experiments in statistical physics and by the great success of the maximum entropy procedure in inductive inference.
Submitted 27 July, 2022; v1 submitted 22 March, 2021;
originally announced March 2021.
-
Dependence of integrated, instantaneous, and fluctuating entropy production on the initial state in quantum and classical processes
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
We consider the additional entropy production (EP) incurred by a fixed quantum or classical process on some initial state $ρ$, above the minimum EP incurred by the same process on any initial state. We show that this additional EP, which we term the "mismatch cost of $ρ$", has a universal information-theoretic form: it is given by the contraction of the relative entropy between $ρ$ and the least-dissipative initial state $\varphi$ over time. We derive versions of this result for integrated EP incurred over the course of a process, for trajectory-level fluctuating EP, and for instantaneous EP rate. We also show that mismatch cost for fluctuating EP obeys an integral fluctuation theorem. Our results demonstrate a fundamental relationship between "thermodynamic irreversibility" (generation of EP) and "logical irreversibility" (inability to know the initial state corresponding to a given final state). We use this relationship to derive quantitative bounds on the thermodynamics of quantum error correction and to propose a thermodynamically-operationalized measure of the logical irreversibility of a quantum channel. Our results hold for both finite and infinite dimensional systems, and generalize beyond EP to many other thermodynamic costs, including nonadiabatic EP, free energy loss, and entropy gain.
Submitted 5 June, 2022; v1 submitted 9 March, 2021;
originally announced March 2021.
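For a classical process over a finite state space, the relative-entropy contraction described in the abstract above can be computed directly. The sketch below is an illustration under stated assumptions, not the paper's code: the transition matrix `T`, the distributions `p` and `q`, and all function names are hypothetical, and `q` is simply declared to be the least-dissipative initial distribution rather than derived from a physical model.

```python
import math


def kl(p, q):
    """Relative entropy D(p || q) in nats for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def evolve(T, p):
    """Apply a column-stochastic matrix T to p: p'[i] = sum_j T[i][j] * p[j]."""
    return [sum(T[i][j] * p[j] for j in range(len(p))) for i in range(len(T))]


def mismatch_cost(T, p, q):
    """Drop in D(p || q) between the start and the end of the process."""
    return kl(p, q) - kl(evolve(T, p), evolve(T, q))


# Hypothetical two-state example; column j holds the transition
# probabilities out of state j, so each column sums to 1.
T = [[0.9, 0.2],
     [0.1, 0.8]]
q = [0.5, 0.5]  # assumed least-dissipative initial distribution
p = [0.9, 0.1]  # actual initial distribution

if __name__ == "__main__":
    print(mismatch_cost(T, p, q))
```

By the data-processing inequality the computed quantity is never negative, and it vanishes when `p` equals `q`.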
-
Noisy Deductive Reasoning: How Humans Construct Math, and How Math Constructs Universes
Authors:
David H. Wolpert,
David Kinney
Abstract:
We present a computational model of mathematical reasoning according to which mathematics is a fundamentally stochastic process. That is, on our model, whether or not a given formula is deemed a theorem in some axiomatic system is not a matter of certainty, but is instead governed by a probability distribution. We then show that this framework gives a compelling account of several aspects of mathematical practice. These include: 1) the way in which mathematicians generate research programs, 2) the applicability of Bayesian models of mathematical heuristics, 3) the role of abductive reasoning in mathematics, 4) the way in which multiple proofs of a proposition can strengthen our degree of belief in that proposition, and 5) the nature of the hypothesis that there are multiple formal systems that are isomorphic to physically possible universes. Thus, by embracing a model of mathematics as not perfectly predictable, we generate a new and fruitful perspective on the epistemology and practice of mathematics.
Submitted 28 October, 2020;
originally announced December 2020.
-
Work, entropy production, and thermodynamics of information under protocol constraints
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
In many real-world situations, there are constraints on the ways in which a physical system can be manipulated. We investigate the entropy production (EP) and extractable work involved in bringing a system from some initial distribution $p$ to some final distribution $p'$, given that the set of master equations available to the driving protocol obeys some constraints. We first derive general bounds on EP and extractable work, as well as a decomposition of the nonequilibrium free energy into an "accessible free energy" (which can be extracted as work, given a set of constraints) and an "inaccessible free energy" (which must be dissipated as EP). In a similar vein, we consider the thermodynamics of information in the presence of constraints, and decompose the information acquired in a measurement into "accessible" and "inaccessible" components. This decomposition allows us to consider the thermodynamic efficiency of different measurements of the same system, given a set of constraints. We use our framework to analyze protocols subject to symmetry, modularity, and coarse-grained constraints, and consider various examples including the Szilard box, the 2D Ising model, and a multi-particle flashing ratchet.
Submitted 19 October, 2021; v1 submitted 24 August, 2020;
originally announced August 2020.
-
What is important about the No Free Lunch theorems?
Authors:
David H. Wolpert
Abstract:
The No Free Lunch theorems prove that under a uniform distribution over induction problems (search problems or learning problems), all induction algorithms perform equally. As I discuss in this chapter, the importance of the theorems arises by using them to analyze scenarios involving {non-uniform} distributions, and to compare different algorithms, without any assumption about the distribution over problems at all. In particular, the theorems prove that {anti}-cross-validation (choosing among a set of candidate algorithms based on which has {worst} out-of-sample behavior) performs as well as cross-validation, unless one makes an assumption -- which has never been formalized -- about how the distribution over induction problems, on the one hand, is related to the set of algorithms one is choosing among using (anti-)cross validation, on the other. In addition, they establish strong caveats concerning the significance of the many results in the literature which establish the strength of a particular algorithm without assuming a particular distribution. They also motivate a ``dictionary'' between supervised learning and blackbox optimization, which allows one to ``translate'' techniques from supervised learning into the domain of blackbox optimization, thereby strengthening blackbox optimization algorithms. In addition to these topics, I also briefly discuss their implications for philosophy of science.
Submitted 21 July, 2020;
originally announced July 2020.
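The opening claim of the abstract above can be checked by brute force on a toy problem. The sketch below is an illustration only and is not taken from the chapter: the three-point input space, the fixed train/test split, and the two learners (`majority`, `anti_majority`) are all assumptions chosen for brevity. It averages off-training-set error over every Boolean target function, which corresponds to a uniform distribution over problems.

```python
from itertools import product

# Hypothetical toy setup: inputs 0 and 1 form the training set,
# input 2 is the single held-out (off-training-set) point.
TRAIN_X = (0, 1)
TEST_X = 2


def majority(train_labels):
    """Predict the most common training label (ties go to 0)."""
    return 1 if sum(train_labels) * 2 > len(train_labels) else 0


def anti_majority(train_labels):
    """Predict the opposite of whatever the majority learner predicts."""
    return 1 - majority(train_labels)


def mean_off_training_set_error(learner):
    """Average 0/1 error on TEST_X over all 2**3 target functions."""
    targets = list(product((0, 1), repeat=3))  # each tuple is one function f
    errors = 0
    for f in targets:
        prediction = learner([f[x] for x in TRAIN_X])
        errors += int(prediction != f[TEST_X])
    return errors / len(targets)


if __name__ == "__main__":
    print(mean_off_training_set_error(majority))
    print(mean_off_training_set_error(anti_majority))
```

Both learners average the same error in this setup, because for every pattern of training labels the held-out label is 0 in exactly half of the target functions and 1 in the other half.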
-
Minimal entropy production due to constraints on rate matrix dependencies in multipartite processes
Authors:
David H Wolpert
Abstract:
I consider multipartite processes in which there are constraints on each subsystem's rate matrix, restricting which other subsystems can directly affect its dynamics. I derive a strictly nonzero lower bound on the minimal achievable entropy production rate of the process in terms of these constraints on the rate matrices of its subsystems. The bound is based on constructing counterfactual rate matrices, in which some subsystems are held fixed while the others are allowed to evolve. This bound is related to the "learning rate" of stationary bipartite systems, and more generally to the "information flow" in bipartite systems.
Submitted 13 May, 2020; v1 submitted 7 January, 2020;
originally announced January 2020.
-
Thermodynamic costs of Turing Machines
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
Turing Machines (TMs) are the canonical model of computation in computer science and physics. We combine techniques from algorithmic information theory and stochastic thermodynamics to analyze the thermodynamic costs of TMs. We consider two different ways of realizing a given TM with a physical process. The first realization is designed to be thermodynamically reversible when fed with random input bits. The second realization is designed to generate less heat, up to an additive constant, than any realization that is computable (i.e., consistent with the physical Church-Turing thesis). We consider three different thermodynamic costs: the heat generated when the TM is run on each input (which we refer to as the "heat function"), the minimum heat generated when a TM is run with an input that results in some desired output (which we refer to as the "thermodynamic complexity" of the output, in analogy to the Kolmogorov complexity), and the expected heat on the input distribution that minimizes entropy production. For universal TMs, we show for both realizations that the thermodynamic complexity of any desired output is bounded by a constant (unlike the conventional Kolmogorov complexity), while the expected amount of generated heat is infinite. We also show that any computable realization faces a fundamental tradeoff between heat generation, the Kolmogorov complexity of its heat function, and the Kolmogorov complexity of its input-output map. We demonstrate this tradeoff by analyzing the thermodynamics of erasing a long string.
Submitted 20 August, 2020; v1 submitted 10 December, 2019;
originally announced December 2019.
-
Uncertainty relations and fluctuation theorems for Bayes nets
Authors:
David H. Wolpert
Abstract:
Recent research has considered the stochastic thermodynamics of multiple interacting systems, representing the overall system as a Bayes net. I derive fluctuation theorems governing the entropy production (EP) of arbitrary sets of the systems in such a Bayes net. I also derive ``conditional'' fluctuation theorems, governing the distribution of EP in one set of systems conditioned on the EP of a different set of systems. I then derive thermodynamic uncertainty relations relating the EP of the overall system to the precisions of probability currents within the individual systems.
Submitted 1 June, 2020; v1 submitted 6 November, 2019;
originally announced November 2019.
-
Stochastic thermodynamics of computation
Authors:
David H. Wolpert
Abstract:
One of the major resource requirements of computers - ranging from biological cells to human brains to high-performance (engineered) computers - is the energy used to run them. Those costs of performing a computation have long been a focus of research in physics, going back to the early work of Landauer. One of the most prominent aspects of computers is that they are inherently nonequilibrium systems. However, the early research was done when nonequilibrium statistical physics was in its infancy, which meant the work was formulated in terms of equilibrium statistical physics. Since then there have been major breakthroughs in nonequilibrium statistical physics, which are allowing us to investigate the myriad aspects of the relationship between statistical physics and computation, extending well beyond the issue of how much work is required to erase a bit. In this paper I review some of this recent work on the `stochastic thermodynamics of computation'. After reviewing the salient parts of information theory, computer science theory, and stochastic thermodynamics, I summarize what has been learned about the entropic costs of performing a broad range of computations, extending from bit erasure to loop-free circuits to logically reversible circuits to information ratchets to Turing machines. These results reveal new, challenging engineering problems for how to design computers to have minimal thermodynamic costs. They also allow us to start to combine computer science theory and stochastic thermodynamics at a foundational level, thereby expanding both.
Submitted 16 February, 2023; v1 submitted 14 May, 2019;
originally announced May 2019.
-
Thermodynamics of computing with circuits
Authors:
David Hilton Wolpert,
Artemy Kolchinsky
Abstract:
Digital computers implement computations using circuits, as do many naturally occurring systems (e.g., gene regulatory networks). The topology of any such circuit restricts which variables may be physically coupled during the operation of a circuit. We investigate how such restrictions on the physical coupling affect the thermodynamic costs of running the circuit. To do this we first calculate the minimal additional entropy production that arises when we run a given gate in a circuit. We then build on this calculation, to analyze how the thermodynamic costs of implementing a computation with a full circuit, comprising multiple connected gates, depend on the topology of that circuit. This analysis provides a rich new set of optimization problems that must be addressed by any designer of a circuit, if they wish to minimize thermodynamic costs.
Submitted 20 July, 2023; v1 submitted 11 June, 2018;
originally announced June 2018.
-
Number of hidden states needed to physically implement a given conditional distribution
Authors:
Jeremy A. Owen,
Artemy Kolchinsky,
David H. Wolpert
Abstract:
We consider the problem of how to construct a physical process over a finite state space $X$ that applies some desired conditional distribution $P$ to initial states to produce final states. This problem arises often in the thermodynamics of computation and nonequilibrium statistical physics more generally (e.g., when designing processes to implement some desired computation, feedback controller, or Maxwell demon). It was previously known that some conditional distributions cannot be implemented using any master equation that involves just the states in $X$. However, here we show that any conditional distribution $P$ can in fact be implemented---if additional "hidden" states not in $X$ are available. Moreover, we show that it is always possible to implement $P$ in a thermodynamically reversible manner. We then investigate a novel cost of the physical resources needed to implement a given distribution $P$: the minimal number of hidden states needed to do so. We calculate this cost exactly for the special case where $P$ represents a single-valued function, and provide an upper bound for the general case, in terms of the nonnegative rank of $P$. These results show that having access to one extra binary degree of freedom, thus doubling the total number of states, is sufficient to implement any $P$ with a master equation in a thermodynamically reversible way, if there are no constraints on the allowed form of the master equation. (Such constraints can greatly increase the minimal needed number of hidden states.) Our results also imply that for certain $P$ that can be implemented without hidden states, having hidden states permits an implementation that generates less heat.
Submitted 13 October, 2019; v1 submitted 3 September, 2017;
originally announced September 2017.
-
A space-time tradeoff for implementing a function with master equation dynamics
Authors:
David H. Wolpert,
Artemy Kolchinsky,
Jeremy A. Owen
Abstract:
Master equations are commonly used to model the dynamics of physical systems, including systems that implement single-valued functions like a computer's update step. However, many such functions cannot be implemented by any master equation, even approximately, which raises the question of how they can occur in the real world. Here we show how any function over some "visible" states can be implemented with master equation dynamics--if the dynamics exploits additional, "hidden" states at intermediate times. We also show that any master equation implementing a function can be decomposed into a sequence of "hidden" timesteps, demarcated by changes in what state-to-state transitions have nonzero probability. In many real-world situations there is a cost both for more hidden states and for more hidden timesteps. Accordingly, we derive a "space-time" tradeoff between the number of hidden states and the number of hidden timesteps needed to implement any given function.
Submitted 21 April, 2019; v1 submitted 28 August, 2017;
originally announced August 2017.
-
Nonlinear Information Bottleneck
Authors:
Artemy Kolchinsky,
Brendan D. Tracey,
David H. Wolpert
Abstract:
Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$. IB works by encoding $X$ in a compressed "bottleneck" random variable $M$ from which $Y$ can be accurately decoded. However, finding the optimal bottleneck variable involves a difficult optimization problem, which until recently has been considered for only two limited cases: discrete $X$ and $Y$ with small state spaces, and continuous $X$ and $Y$ with a Gaussian joint distribution (in which case optimal encoding and decoding maps are linear). We propose a method for performing IB on arbitrarily-distributed discrete and/or continuous $X$ and $Y$, while allowing for nonlinear encoding and decoding maps. Our approach relies on a novel non-parametric upper bound for mutual information. We describe how to implement our method using neural networks. We then show that it achieves better performance than the recently-proposed "variational IB" method on several real-world datasets.
Submitted 30 November, 2019; v1 submitted 5 May, 2017;
originally announced May 2017.
-
Dependence of dissipation on the initial distribution over states
Authors:
Artemy Kolchinsky,
David H. Wolpert
Abstract:
We analyze how the amount of work dissipated by a fixed nonequilibrium process depends on the initial distribution over states. Specifically, we compare the amount of dissipation when the process is used with some specified initial distribution to the minimal amount of dissipation possible for any initial distribution. We show that the difference between those two amounts of dissipation is given by a simple information-theoretic function that depends only on the initial and final state distributions. Crucially, this difference is independent of the details of the process relating those distributions. We then consider how dissipation depends on the initial distribution for a 'computer', i.e., a nonequilibrium process whose dynamics over coarse-grained macrostates implement some desired input-output map. We show that our results still apply when stated in terms of distributions over the computer's coarse-grained macrostates. This can be viewed as a novel thermodynamic cost of computation, reflecting changes in the distribution over inputs rather than the logical dynamics of the computation.
Submitted 22 August, 2017; v1 submitted 4 July, 2016;
originally announced July 2016.
-
Optimal high-level descriptions of dynamical systems
Authors:
David H. Wolpert,
Joshua A. Grochow,
Eric Libby,
Simon DeDeo
Abstract:
To analyze high-dimensional systems, many fields in science and engineering rely on high-level descriptions, sometimes called "macrostates," "coarse-grainings," or "effective theories". Examples of such descriptions include the thermodynamic properties of a large collection of point particles undergoing reversible dynamics, the variables in a macroeconomic model describing the individuals that participate in an economy, and the summary state of a cell composed of a large set of biochemical networks.
Often these high-level descriptions are constructed without considering the ultimate reason for needing them in the first place. Here, we formalize and quantify one such purpose: the need to predict observables of interest concerning the high-dimensional system with as high accuracy as possible, while minimizing the computational cost of doing so. The resulting State Space Compression (SSC) framework provides a guide for how to solve for the optimal high-level description of a given dynamical system, rather than constructing it based on human intuition alone.
In this preliminary report, we introduce SSC, and illustrate it with several information-theoretic quantifications of "accuracy", all with different implications for the optimal compression. We also discuss some other possible applications of SSC beyond the goal of accurate prediction. These include SSC as a measure of the complexity of a dynamical system, and as a way to quantify information flow between the scales of a system.
Submitted 3 June, 2015; v1 submitted 25 September, 2014;
originally announced September 2014.
-
Predicting the behavior of interacting humans by fusing data from multiple sources
Authors:
Erik J. Schlicht,
Ritchie Lee,
David H. Wolpert,
Mykel J. Kochenderfer,
Brendan Tracey
Abstract:
Multi-fidelity methods combine inexpensive low-fidelity simulations with costly but high-fidelity simulations to produce an accurate model of a system of interest at minimal cost. They have proven useful in modeling physical systems and have been applied to engineering problems such as wing-design optimization. During human-in-the-loop experimentation, it has become increasingly common to use online platforms, like Mechanical Turk, to run low-fidelity experiments to gather human performance data in an efficient manner. One concern with these experiments is that the results obtained from the online environment generalize poorly to the actual domain of interest. To address this limitation, we extend traditional multi-fidelity approaches to allow us to combine fewer data points from high-fidelity human-in-the-loop experiments with plentiful but less accurate data from low-fidelity experiments to produce accurate models of how humans interact. We present both model-based and model-free methods, and summarize the predictive performance of each method under different conditions.
Submitted 9 August, 2014;
originally announced August 2014.
-
Value of information in noncooperative games
Authors:
Nils Bertschinger,
David H. Wolpert,
Eckehard Olbrich,
Juergen Jost
Abstract:
In some games, additional information hurts a player, e.g., in games with first-mover advantage, the second-mover is hurt by seeing the first-mover's move. What properties of a game determine whether it has such negative "value of information" for a particular player? Can a game have negative value of information for all players? To answer such questions, we generalize the definition of marginal utility of a good to define the marginal utility of a parameter vector specifying a game. So rather than analyze the global structure of the relationship between a game's parameter vector and player behavior, as in previous work, we focus on the local structure of that relationship. This allows us to prove that generically, every game can have negative marginal value of information, unless one imposes a priori constraints on allowed changes to the game's parameter vector. We demonstrate these and related results numerically, and discuss their implications.
Submitted 30 January, 2015; v1 submitted 27 December, 2013;
originally announced January 2014.
-
Counter-Factual Reinforcement Learning: How to Model Decision-Makers That Anticipate The Future
Authors:
Ritchie Lee,
David H. Wolpert,
James Bono,
Scott Backhaus,
Russell Bent,
Brendan Tracey
Abstract:
This paper introduces a novel framework for modeling interacting humans in a multi-stage game. This "iterated semi network-form game" framework has the following desirable characteristics: (1) Bounded rational players, (2) strategic players (i.e., players account for one another's reward functions when predicting one another's behavior), and (3) computational tractability even on real-world systems. We achieve these benefits by combining concepts from game theory and reinforcement learning. To be precise, we extend the bounded rational "level-K reasoning" model to apply to games over multiple stages. Our extension allows the decomposition of the overall modeling problem into a series of smaller ones, each of which can be solved by standard reinforcement learning algorithms. We call this hybrid approach "level-K reinforcement learning". We investigate these ideas in a cyber battle scenario over a smart power grid and discuss the relationship between the behavior predicted by our model and what one might expect of real human defenders and attackers.
Submitted 3 July, 2012;
originally announced July 2012.
-
Predicting the behavior of interacting humans by fusing data from multiple sources
Authors:
Erik J. Schlicht,
Ritchie Lee,
David H. Wolpert,
Mykel J. Kochenderfer,
Brendan Tracey
Abstract:
Multi-fidelity methods combine inexpensive low-fidelity simulations with costly but high-fidelity simulations to produce an accurate model of a system of interest at minimal cost. They have proven useful in modeling physical systems and have been applied to engineering problems such as wing-design optimization. During human-in-the-loop experimentation, it has become increasingly common to use online platforms, like Mechanical Turk, to run low-fidelity experiments to gather human performance data in an efficient manner. One concern with these experiments is that the results obtained from the online environment generalize poorly to the actual domain of interest. To address this limitation, we extend traditional multi-fidelity approaches to allow us to combine fewer data points from high-fidelity human-in-the-loop experiments with plentiful but less accurate data from low-fidelity experiments to produce accurate models of how humans interact. We present both model-based and model-free methods, and summarize the predictive performance of each method under different conditions.
Submitted 26 June, 2012;
originally announced June 2012.
-
Collective Intelligence, Data Routing and Braess' Paradox
Authors:
K. Tumer,
D. H. Wolpert
Abstract:
We consider the problem of designing the utility functions of the utility-maximizing agents in a multi-agent system so that they work synergistically to maximize a global utility. The particular problem domain we explore is the control of network routing by placing agents on all the routers in the network. Conventional approaches to this task have the agents all use the Ideal Shortest Path routing Algorithm (ISPA). We demonstrate that in many cases, due to the side-effects of one agent's actions on another agent's performance, having agents use ISPA's is suboptimal as far as global aggregate cost is concerned, even when they are only used to route infinitesimally small amounts of traffic. The utility functions of the individual agents are not "aligned" with the global utility, intuitively speaking. As a particular example of this we present an instance of Braess' paradox in which adding new links to a network whose agents all use the ISPA results in a decrease in overall throughput. We also demonstrate that load-balancing, in which the agents' decisions are collectively made to optimize the global cost incurred by all traffic currently being routed, is suboptimal as far as global cost averaged across time is concerned. This is also due to 'side-effects', in this case of current routing decision on future traffic. The mathematics of Collective Intelligence (COIN) is concerned precisely with the issue of avoiding such deleterious side-effects in multi-agent systems, both over time and space. We present key concepts from that mathematics and use them to derive an algorithm whose ideal version should have better performance than that of having all agents use the ISPA, even in the infinitesimal limit.
We present experiments verifying this, and also showing that a machine-learning-based version of this COIN algorithm in which costs are only imprecisely estimated via empirical means (a version potentially applicable in the real world) also outperforms the ISPA, despite having access to less information than does the ISPA. In particular, this COIN algorithm almost always avoids Braess' paradox.
Submitted 9 June, 2011;
originally announced June 2011.
-
Game theoretic modeling of pilot behavior during mid-air encounters
Authors:
Ritchie Lee,
David H. Wolpert
Abstract:
We show how to combine Bayes nets and game theory to predict the behavior of hybrid systems involving both humans and automated components. We call this novel framework "Semi Network-Form Games," and illustrate it by predicting aircraft pilot behavior in potential near mid-air collisions. At present, at the beginning of such potential collisions, a collision avoidance system in the aircraft cockpit advises the pilots what to do to avoid the collision. However, studies of mid-air encounters have found wide variability in pilot responses to avoidance system advisories. In particular, pilots rarely perfectly execute the recommended maneuvers, despite the fact that the collision avoidance system's effectiveness relies on their doing so. Rather, pilots decide their actions based on all information available to them (advisory, instrument readings, visual observations). We show how to build this aspect into a semi network-form game model of the encounter and then present computational simulations of the resultant model.
Submitted 11 April, 2011; v1 submitted 26 March, 2011;
originally announced March 2011.
-
Hysteresis effects of changing parameters of noncooperative games
Authors:
David H. Wolpert,
Michael Harre,
Eckehard Olbrich,
Nils Bertschinger,
Juergen Jost
Abstract:
We adapt the method used by Jaynes to derive the equilibria of statistical physics to instead derive equilibria of bounded rational game theory. We analyze the dependence of these equilibria on the parameters of the underlying game, focusing on hysteresis effects. In particular, we show that by gradually imposing individual-specific tax rates on the players of the game, and then gradually removing those taxes, the players move from a poor equilibrium to one that is better for all of them.
Submitted 27 October, 2010;
originally announced October 2010.
-
What does Newcomb's paradox teach us?
Authors:
David H. Wolpert,
Gregory Benford
Abstract:
In Newcomb's paradox you choose to receive either the contents of a particular closed box, or the contents of both that closed box and another one. Before you choose, a prediction algorithm deduces your choice, and fills the two boxes based on that deduction. Newcomb's paradox is that game theory appears to provide two conflicting recommendations for what choice you should make in this scenario. We analyze Newcomb's paradox using a recent extension of game theory in which the players set conditional probability distributions in a Bayes net. We show that the two game theory recommendations in Newcomb's scenario have different presumptions for what Bayes net relates your choice and the algorithm's prediction. We resolve the paradox by proving that these two Bayes nets are incompatible. We also show that the accuracy of the algorithm's prediction, the focus of much previous work, is irrelevant. In addition we show that Newcomb's scenario only provides a contradiction between game theory's expected utility and dominance principles if one is sloppy in specifying the underlying Bayes net. We also show that Newcomb's paradox is time-reversal invariant; both the paradox and its resolution are unchanged if the algorithm makes its `prediction' after you make your choice rather than before.
Submitted 5 March, 2010;
originally announced March 2010.
-
Physical limits of inference
Authors:
David H. Wolpert
Abstract:
I show that physical devices that perform observation, prediction, or recollection share an underlying mathematical structure. I call devices with that structure "inference devices". I present a set of existence and impossibility results concerning inference devices. These results hold independent of the precise physical laws governing our universe. In a limited sense, the impossibility results establish that Laplace was wrong to claim that even in a classical, non-chaotic universe the future can be unerringly predicted, given sufficient knowledge of the present. Alternatively, these impossibility results can be viewed as a non-quantum mechanical "uncertainty principle". Next I explore the close connections between the mathematics of inference devices and of Turing Machines. In particular, the impossibility results for inference devices are similar to the Halting theorem for TM's. Furthermore, one can define an analog of Universal TM's (UTM's) for inference devices. I call those analogs "strong inference devices". I use strong inference devices to define the "inference complexity" of an inference task, which is the analog of the Kolmogorov complexity of computing a string. However no universe can contain more than one strong inference device. So whereas the Kolmogorov complexity of a string is arbitrary up to specification of the UTM, there is no such arbitrariness in the inference complexity of an inference task. I end by discussing the philosophical implications of these results, e.g., for whether the universe "is" a computer.
Submitted 23 October, 2008; v1 submitted 9 August, 2007;
originally announced August 2007.
-
Parametric Learning and Monte Carlo Optimization
Authors:
David H. Wolpert,
Dev G. Rajnarayan
Abstract:
This paper uncovers and explores the close relationship between Monte Carlo Optimization of a parametrized integral (MCO), Parametric machine-Learning (PL), and `blackbox' or `oracle'-based optimization (BO). We make four contributions. First, we prove that MCO is mathematically identical to a broad class of PL problems. This identity potentially provides a new application domain for all broadly applicable PL techniques: MCO. Second, we introduce immediate sampling, a new version of the Probability Collectives (PC) algorithm for blackbox optimization. Immediate sampling transforms the original BO problem into an MCO problem. Accordingly, by combining these first two contributions, we can apply all PL techniques to BO. In our third contribution we validate this way of improving BO by demonstrating that cross-validation and bagging improve immediate sampling. Finally, conventional MC and MCO procedures ignore the relationship between the sample point locations and the associated values of the integrand; only the values of the integrand at those locations are considered. We demonstrate that one can exploit the sample location information using PL techniques, for example by forming a fit of the sample locations to the associated values of the integrand. This provides an additional way to apply PL techniques to improve MCO.
Submitted 10 April, 2007;
originally announced April 2007.
-
A Predictive Theory of Games
Authors:
David H. Wolpert
Abstract:
Conventional noncooperative game theory hypothesizes that the joint strategy of a set of players in a game must satisfy an "equilibrium concept". All other joint strategies are considered impossible; the only issue is what equilibrium concept is "correct". This hypothesis violates the desiderata underlying probability theory. Indeed, probability theory renders moot the problem of what equilibrium concept is correct - every joint strategy can arise with non-zero probability. Rather than a first-principles derivation of an equilibrium concept, game theory requires a first-principles derivation of a distribution over joint (mixed) strategies. This paper shows how information theory can provide such a distribution over joint strategies. If a scientist external to the game wants to distill such a distribution to a point prediction, that prediction should be set by decision theory, using their (!) loss function. So the predicted joint strategy - the "equilibrium concept" - varies with the external scientist's loss function. It is shown here that in many games, having a probability distribution with support restricted to Nash equilibria - as stipulated by conventional game theory - is impossible. It is also shown how to: i) Derive an information-theoretic quantification of a player's degree of rationality; ii) Derive bounded rationality as a cost of computation; iii) Elaborate the close formal relationship between game theory and statistical physics; iv) Use this relationship to extend game theory to allow stochastically varying numbers of players.
Submitted 7 December, 2005;
originally announced December 2005.
-
Metrics for more than two points at once
Authors:
David H. Wolpert
Abstract:
The conventional definition of a topological metric over a space specifies properties that must be obeyed by any measure of "how separated" two points in that space are. Here it is shown how to extend that definition, and in particular the triangle inequality, to concern arbitrary numbers of points. Such a measure of how separated the points within a collection are can be bootstrapped, to measure "how separated" from each other are two (or more) collections. The measure presented here also allows fractional membership of an element in a collection. This means it directly concerns measures of "how spread out" a probability distribution over a space is. When such a measure is bootstrapped to compare two collections, it allows us to measure how separated two probability distributions are, or more generally, how separated a distribution of distributions is.
Submitted 15 April, 2004;
originally announced April 2004.
-
Distributed Control by Lagrangian Steepest Descent
Authors:
David H. Wolpert,
Stefan Bieniawski
Abstract:
Often adaptive, distributed control can be viewed as an iterated game between independent players. The coupling between the players' mixed strategies, arising as the system evolves from one instant to the next, is determined by the system designer. Information theory tells us that the most likely joint strategy of the players, given a value of the expectation of the overall control objective function, is the minimizer of a Lagrangian function of the joint strategy. So the goal of the system designer is to speed evolution of the joint strategy to that Lagrangian minimizing point, lower the expected value of the control objective function, and repeat. Here we elaborate the theory of algorithms that do this using local descent procedures, and that thereby achieve efficient, adaptive, distributed control.
Submitted 9 March, 2004;
originally announced March 2004.
-
Information Theory - The Bridge Connecting Bounded Rational Game Theory and Statistical Physics
Authors:
David H. Wolpert
Abstract:
A long-running difficulty with conventional game theory has been how to modify it to accommodate the bounded rationality of all real-world players. A recurring issue in statistical physics is how best to approximate joint probability distributions with decoupled (and therefore far more tractable) distributions. This paper shows that the same information theoretic mathematical structure, known as Product Distribution (PD) theory, addresses both issues. In this, PD theory not only provides a principled formulation of bounded rationality and a set of new types of mean field theory in statistical physics. It also shows that those topics are fundamentally one and the same.
Submitted 19 February, 2004;
originally announced February 2004.
-
Product Distribution Field Theory
Authors:
David H. Wolpert
Abstract:
This paper presents a novel way to approximate a distribution governing a system of coupled particles with a product of independent distributions. The approach is an extension of mean field theory that allows the independent distributions to live in a different space from the system, and thereby capture statistical dependencies in that system. It also allows different Hamiltonians for each independent distribution, to facilitate Monte Carlo estimation of those distributions. The approach leads to a novel energy-minimization algorithm in which each coordinate Monte Carlo estimates an associated spectrum, and then independently sets its state by sampling a Boltzmann distribution across that spectrum. It can also be used for high-dimensional numerical integration, (constrained) combinatorial optimization, and adaptive distributed control. This approach also provides a simple, physics-based derivation of the powerful approximate energy-minimization algorithms semi-formally derived in \cite{wowh00, wotu02c, wolp03a}. In addition it suggests many improvements to those algorithms, and motivates a new (bounded rationality) game theory equilibrium concept.
Submitted 24 July, 2003;
originally announced July 2003.
-
On the computational capabilities of physical systems part II: relationship with conventional computer science
Authors:
David H. Wolpert
Abstract:
In the first of this pair of papers, it was proven that no physical computer can correctly carry out all computational tasks that can be posed to it. The generality of this result follows from its use of a novel definition of computation, ``physical computation''. This second paper of the pair elaborates the mathematical structure and impossibility results associated with physical computation. Analogues of Chomsky hierarchy results concerning universal Turing Machines and the Halting theorem are derived, as are results concerning the (im)possibility of certain kinds of error-correcting codes. In addition, an analogue of algorithmic information complexity, ``prediction complexity'', is elaborated. A task-independent bound is derived on how much the prediction complexity of a computational task can differ for two different universal physical computers used to solve that task, a bound similar to the ``encoding'' bound governing how much the algorithmic information complexity of a Turing machine calculation can differ for two universal Turing machines. Finally, it is proven that either the Hamiltonian of our universe proscribes a certain type of computation, or prediction complexity is unique (unlike algorithmic information complexity).
Submitted 22 May, 2000;
originally announced May 2000.
-
On the computational capabilities of physical systems part I: the impossibility of infallible computation
Authors:
David H. Wolpert
Abstract:
In this first of two papers, strong limits on the accuracy of physical computation are established. First it is proven that there cannot be a physical computer C to which one can pose any and all computational tasks concerning the physical universe. Next it is proven that no physical computer C can correctly carry out any computational task in the subset of such tasks that can be posed to C. As a particular example, this means that there cannot be a physical computer that can, for any physical system external to that computer, take the specification of that external system's state as input and then correctly predict its future state before that future state actually occurs. The results also mean that there cannot exist an infallible, general-purpose observation apparatus, and that there cannot be an infallible, general-purpose control apparatus. These results do not rely on systems that are infinite, and/or non-classical, and/or obey chaotic dynamics. They also hold even if one uses an infinitely fast, infinitely dense computer, with computational powers greater than that of a Turing Machine.
Submitted 22 May, 2000;
originally announced May 2000.
-
Avoiding Braess' Paradox through Collective Intelligence
Authors:
Kagan Tumer,
David H. Wolpert
Abstract:
In an Ideal Shortest Path Algorithm (ISPA), at each moment each router in a network sends all of its traffic down the path that will incur the lowest cost to that traffic. In the limit of an infinitesimally small amount of traffic for a particular router, its routing that traffic via an ISPA is optimal, as far as cost incurred by that traffic is concerned. We demonstrate though that in many cases, due to the side-effects of one router's actions on another router's performance, having routers use ISPA's is suboptimal as far as global aggregate cost is concerned, even when only used to route infinitesimally small amounts of traffic. As a particular example of this we present an instance of Braess' paradox for ISPA's, in which adding new links to a network decreases overall throughput. We also demonstrate that load-balancing, in which the routing decisions are made to optimize the global cost incurred by all traffic currently being routed, is suboptimal as far as global cost averaged across time is concerned. This is also due to "side-effects", in this case of current routing decision on future traffic.
The theory of COllective INtelligence (COIN) is concerned precisely with the issue of avoiding such deleterious side-effects. We present key concepts from that theory and use them to derive an idealized algorithm whose performance is better than that of the ISPA, even in the infinitesimal limit. We present experiments verifying this, and also showing that a machine-learning-based version of this COIN algorithm in which costs are only imprecisely estimated (a version potentially applicable in the real world) also outperforms the ISPA, despite having access to less information than does the ISPA. In particular, this COIN algorithm avoids Braess' paradox.
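To make the paradox concrete, here is a minimal sketch of the classic textbook Braess network, not the network or cost functions used in the paper. All names (`path_costs`, `is_equilibrium`, `average_cost`), the 4000-driver load, and the link costs are assumptions chosen for illustration. Two links have a load-dependent cost (load / 100) and two have a fixed cost of 45; the optional shortcut from A to B costs nothing.

```python
def path_costs(flows, with_shortcut):
    """Per-driver cost of each available path, given drivers per path.

    Hypothetical network: S->A and B->E cost load/100; A->E and S->B cost 45;
    the optional shortcut A->B costs 0.
    """
    upper = flows.get("S-A-E", 0)
    lower = flows.get("S-B-E", 0)
    zigzag = flows.get("S-A-B-E", 0) if with_shortcut else 0
    load_sa = upper + zigzag  # traffic on the load-dependent link S->A
    load_be = lower + zigzag  # traffic on the load-dependent link B->E
    costs = {
        "S-A-E": load_sa / 100 + 45,
        "S-B-E": 45 + load_be / 100,
    }
    if with_shortcut:
        costs["S-A-B-E"] = load_sa / 100 + load_be / 100
    return costs


def is_equilibrium(flows, with_shortcut):
    """True if no driver can lower its own cost by switching paths."""
    costs = path_costs(flows, with_shortcut)
    cheapest = min(costs.values())
    return all(costs[p] <= cheapest + 1e-9 for p, n in flows.items() if n > 0)


def average_cost(flows, with_shortcut):
    """Mean per-driver cost over all traffic."""
    costs = path_costs(flows, with_shortcut)
    total = sum(flows.values())
    return sum(costs[p] * n for p, n in flows.items()) / total


# Without the shortcut, an even split is a selfish-routing equilibrium.
split = {"S-A-E": 2000, "S-B-E": 2000}
cost_before = average_cost(split, with_shortcut=False)

# With the shortcut, the even split stops being an equilibrium, and
# everyone taking the shortcut path is one.
all_zigzag = {"S-A-B-E": 4000}
cost_after = average_cost(all_zigzag, with_shortcut=True)

print(cost_before, cost_after)
```

In this sketch every driver greedily picks its own cheapest path, which is the analogue of each router using an ISPA. Adding the free link raises the equilibrium cost for everyone, even though no individual driver can do better by switching back.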
Submitted 20 December, 1999;
originally announced December 1999.
-
Adaptivity in Agent-Based Routing for Data Networks
Authors:
David H. Wolpert,
Sergey Kirshner,
Chris J. Merz,
Kagan Tumer
Abstract:
Adaptivity, both of the individual agents and of the interaction structure among the agents, seems indispensable for scaling up multi-agent systems (MAS's) in noisy environments. One important consideration in designing adaptive agents is choosing their action spaces to be as amenable as possible to machine learning techniques, especially to reinforcement learning (RL) techniques. One important way to have the interaction structure connecting agents itself be adaptive is to have the intentions and/or actions of the agents be in the input spaces of the other agents, much as in Stackelberg games. We consider both kinds of adaptivity in the design of a MAS to control network packet routing.
We demonstrate on the OPNET event-driven network simulator the perhaps surprising fact that simply changing the action space of the agents to be better suited to RL can result in very large improvements in their potential performance: at their best settings, our learning-amenable router agents achieve throughputs up to three and one half times better than that of the standard Bellman-Ford routing algorithm, even when the Bellman-Ford protocol traffic is maintained. We then demonstrate that much of that potential improvement can be realized by having the agents learn their settings when the agent interaction structure is itself adaptive.
Submitted 20 December, 1999;
originally announced December 1999.
-
An Introduction to Collective Intelligence
Authors:
David H. Wolpert,
Kagan Tumer
Abstract:
This paper surveys the emerging science of how to design a ``COllective INtelligence'' (COIN). A COIN is a large multi-agent system where:
(i) There is little to no centralized communication or control; and
(ii) There is a provided world utility function that rates the possible histories of the full system.
In particular, we are interested in COINs in which each agent runs a reinforcement learning (RL) algorithm. Rather than use a conventional modeling approach (e.g., model the system dynamics, and hand-tune agents to cooperate), we aim to solve the COIN design problem implicitly, via the ``adaptive'' character of the RL algorithms of each of the agents. This approach introduces an entirely new, profound design problem: Assuming the RL algorithms are able to achieve high rewards, what reward functions for the individual agents will, when pursued by those agents, result in high world utility? In other words, what reward functions will best ensure that we do not have phenomena like the tragedy of the commons, Braess's paradox, or the liquidity trap?
Although still very young, research specifically concentrating on the COIN design problem has already resulted in successes in artificial domains, in particular in packet-routing, the leader-follower problem, and in variants of Arthur's El Farol bar problem. It is expected that as it matures and draws upon other disciplines related to COINs, this research will greatly expand the range of tasks addressable by human engineers. Moreover, in addition to drawing on them, such a fully developed science of COIN design may provide much insight into other already established scientific fields, such as economics, game theory, and population biology.
Submitted 17 August, 1999;
originally announced August 1999.
-
Collective Intelligence for Control of Distributed Dynamical Systems
Authors:
David H. Wolpert,
Kevin R. Wheeler,
Kagan Tumer
Abstract:
We consider the El Farol bar problem, also known as the minority game (W. B. Arthur, ``The American Economic Review'', 84(2): 406--411 (1994), D. Challet and Y.C. Zhang, ``Physica A'', 256:514 (1998)). We view it as an instance of the general problem of how to configure the nodal elements of a distributed dynamical system so that they do not ``work at cross purposes'', in that their collective dynamics avoids frustration and thereby achieves a provided global goal. We summarize a mathematical theory for such configuration applicable when (as in the bar problem) the global goal can be expressed as minimizing a global energy function and the nodes can be expressed as minimizers of local free energy functions. We show that a system designed with that theory performs nearly optimally for the bar problem.
Submitted 17 August, 1999;
originally announced August 1999.
-
General Principles of Learning-Based Multi-Agent Systems
Authors:
David H. Wolpert,
Kevin R. Wheeler,
Kagan Tumer
Abstract:
We consider the problem of how to design large decentralized multi-agent systems (MAS's) in an automated fashion, with little or no hand-tuning. Our approach has each agent run a reinforcement learning algorithm. This converts the problem into one of how to automatically set/update the reward functions for each of the agents so that the global goal is achieved. In particular we do not want the agents to ``work at cross-purposes'' as far as the global goal is concerned. We use the term artificial COllective INtelligence (COIN) to refer to systems that embody solutions to this problem. In this paper we present a summary of a mathematical framework for COINs. We then investigate the real-world applicability of the core concepts of that framework via two computer experiments: we show that our COINs perform near optimally in a difficult variant of Arthur's bar problem (and in particular avoid the tragedy of the commons for that problem), and we also illustrate optimal performance for our COINs in the leader-follower problem.
Submitted 10 May, 1999;
originally announced May 1999.
-
Using Collective Intelligence to Route Internet Traffic
Authors:
David H. Wolpert,
Kagan Tumer,
Jeremy Frank
Abstract:
A COllective INtelligence (COIN) is a set of interacting reinforcement learning (RL) algorithms designed in an automated fashion so that their collective behavior optimizes a global utility function. We summarize the theory of COINs, then present experiments using that theory to design COINs to control internet traffic routing. These experiments indicate that COINs outperform all previously investigated RL-based, shortest path routing algorithms.
Submitted 10 May, 1999;
originally announced May 1999.