-
Emergent Ferromagnetism at LaFeO3/SrTiO3 Interface Arising from Strain-induced Spin-State Transition
Authors:
Menglin Zhu,
Joseph Lanier,
Sevim Polat Genlik,
Jose G. Flores,
Victor da Cruz Pinha Barbosa,
Mohit Randeria,
Patrick M. Woodward,
Maryam Ghazisaeidi,
Fengyuan Yang,
Jinwoo Hwang
Abstract:
Creating new interfacial magnetic states with desired functionalities is attractive for fundamental studies and spintronics applications. The emergence of interfacial magnetic phases demands the fabrication of pristine interfaces and the characterization and understanding of atomic structure as well as electronic, magnetic, and orbital degrees of freedom at the interface. Here, we report a novel interfacial insulating ferromagnetic order in antiferromagnetic LaFeO3 grown on SrTiO3, characterized by a combination of electron microscopy and spectroscopy, magnetometry, and density functional theory. The epitaxial strain drives a spin-state disproportionation in the interfacial layer of LaFeO3, which leads to a checkerboard arrangement of low- and high-spin Fe3+ ions inside smaller and larger FeO6 octahedra, respectively. Ferromagnetism at the interface arises from superexchange interactions between the low- and high-spin Fe3+. The detailed understanding of creation of emergent magnetism illustrates the potential of designing and controlling orbital degrees of freedom at the interface to realize novel phases and functionalities for future spin-electronic applications.
Submitted 21 May, 2024;
originally announced May 2024.
-
Goodness-of-fit and utility estimation: what's possible and what's not
Authors:
Joshua Lanier,
John K. -H. Quah
Abstract:
A goodness-of-fit index measures the consistency of consumption data with a given model of utility-maximization. We show that for the class of well-behaved (i.e., continuous and increasing) utility functions there is no goodness-of-fit index that is continuous and accurate, where the latter means that a perfect score is obtained if and only if a dataset can be rationalized by a well-behaved utility function. While many standard goodness-of-fit indices are inaccurate we show that these indices are (in a sense we make precise) essentially accurate. Goodness-of-fit indices are typically generated by loss functions and we find that standard loss functions usually do not yield a best-fitting utility function when they are minimized. Nonetheless, welfare comparisons can be made by working out a robust preference relation from the data.
Submitted 14 May, 2024;
originally announced May 2024.
-
Revealed preference and revealed preference cycles: a survey
Authors:
Paweł Dziewulski,
Joshua Lanier,
John K. -H. Quah
Abstract:
Afriat's Theorem (1967) states that a dataset can be thought of as being generated by a consumer maximizing a continuous and increasing utility function if and only if it is free of revealed preference cycles containing a strict relation. The latter property is often known by its acronym, GARP (for generalized axiom of revealed preference). This paper surveys extensions and applications of Afriat's seminal result. We focus on those results where the consistency of a dataset with the maximization of a utility function satisfying some property can be characterized by a suitably modified version of GARP.
Submitted 14 May, 2024;
originally announced May 2024.
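The GARP condition described in the abstract above can be checked mechanically. The following is a minimal Python sketch, not code from the surveyed paper: the function name `satisfies_garp`, the tolerance `tol`, and the data layout (one price vector and one chosen bundle per observation) are illustrative assumptions. It builds the direct revealed preference relation, takes its transitive closure, and reports whether any cycle contains a strict relation.

```python
from itertools import product


def satisfies_garp(prices, bundles, tol=1e-12):
    """Return True if the observations contain no revealed preference
    cycle with a strict relation (hypothetical helper, for illustration)."""
    n = len(bundles)

    def dot(p, x):
        return sum(pi * xi for pi, xi in zip(p, x))

    # cost[t][s]: what bundle s would have cost at the prices of observation t.
    cost = [[dot(prices[t], bundles[s]) for s in range(n)] for t in range(n)]

    # weak[t][s]: bundle s was affordable when bundle t was chosen.
    weak = [[cost[t][t] >= cost[t][s] - tol for s in range(n)] for t in range(n)]
    # strict[t][s]: bundle s was strictly cheaper than the chosen bundle t.
    strict = [[cost[t][t] > cost[t][s] + tol for s in range(n)] for t in range(n)]

    # Warshall's algorithm: transitive closure of the weak relation.
    # product() varies its last index fastest, so k is the outermost loop.
    reach = [row[:] for row in weak]
    for k, i, j in product(range(n), repeat=3):
        if reach[i][k] and reach[k][j]:
            reach[i][j] = True

    # GARP fails if t is (indirectly) revealed preferred to s while
    # s is strictly directly revealed preferred to t.
    return not any(
        reach[t][s] and strict[s][t] for t in range(n) for s in range(n)
    )
```

For example, with prices `[(1, 2), (2, 1)]` and bundles `[(1, 2), (2, 1)]`, each chosen bundle costs more than the other bundle at its own prices, so the two observations form a strict cycle and the check fails; swapping the two bundles removes the cycle.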
-
Money Pumps and Bounded Rationality
Authors:
Joshua Lanier,
Matthew Polisson,
John K. -H. Quah
Abstract:
The standard criterion of rationality in economics is the maximization of a utility function that is stable across multiple observations of an agent's choice behavior. In this paper, we discuss two notions of the money pump that characterize two corresponding notions of utility-maximization. We explain the senses in which the amount of money that can be pumped from a consumer is a useful measure of the consumer's departure from utility-maximization.
Submitted 7 April, 2024;
originally announced April 2024.
-
Real-time Animation Generation and Control on Rigged Models via Large Language Models
Authors:
Han Huang,
Fernanda De La Torre,
Cathy Mengying Fang,
Andrzej Banburski-Fahey,
Judith Amores,
Jaron Lanier
Abstract:
We introduce a novel method for real-time animation control and generation on rigged models using natural language input. First, we embed a large language model (LLM) in Unity to output structured texts that can be parsed into diverse and realistic animations. Second, we illustrate LLM's potential to enable flexible state transition between existing animations. We showcase the robustness of our approach through qualitative results on various rigged models and motions.
Submitted 15 February, 2024; v1 submitted 26 October, 2023;
originally announced October 2023.
-
LLMR: Real-time Prompting of Interactive Worlds using Large Language Models
Authors:
Fernanda De La Torre,
Cathy Mengying Fang,
Han Huang,
Andrzej Banburski-Fahey,
Judith Amores Fernandez,
Jaron Lanier
Abstract:
We present Large Language Model for Mixed Reality (LLMR), a framework for the real-time creation and modification of interactive Mixed Reality experiences using LLMs. LLMR leverages novel strategies to tackle difficult cases where ideal training data is scarce, or where the design goal requires the synthesis of internal dynamics, intuitive analysis, or advanced interactivity. Our framework relies on text interaction and the Unity game engine. By incorporating techniques for scene understanding, task planning, self-debugging, and memory management, LLMR outperforms the standard GPT-4 by 4x in average error rate. We demonstrate LLMR's cross-platform interoperability with several example worlds, and evaluate it on a variety of creation and modification tasks to show that it can produce and edit diverse objects, tools, and scenes. Finally, we conducted a usability study (N=11) with a diverse set that revealed participants had positive experiences with the system and would use it again.
Submitted 22 March, 2024; v1 submitted 21 September, 2023;
originally announced September 2023.
-
Homeomorphism groups of 2-manifolds with the virtual Rokhlin property
Authors:
Justin Lanier,
Nicholas G. Vlamis
Abstract:
We introduce and motivate the definition of the virtual Rokhlin property for topological groups. We then classify the 2-manifolds whose homeomorphism groups have the virtual Rokhlin property. We also establish the analogous result for mapping class groups of 2-manifolds.
Submitted 23 June, 2024; v1 submitted 24 August, 2023;
originally announced August 2023.
-
Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors
Authors:
Kolby Nottingham,
Yasaman Razeghi,
Kyungmin Kim,
JB Lanier,
Pierre Baldi,
Roy Fox,
Sameer Singh
Abstract:
Large language models (LLMs) are being applied as actors for sequential decision making tasks in domains such as robotics and games, utilizing their general world knowledge and planning abilities. However, previous work does little to explore what environment state information is provided to LLM actors via language. Exhaustively describing high-dimensional states can impair performance and raise inference costs for LLM actors. Previous LLM actors avoid the issue by relying on hand-engineered, task-specific protocols to determine which features to communicate about a state and which to leave out. In this work, we propose Brief Language INputs for DEcision-making Responses (BLINDER), a method for automatically selecting concise state descriptions by learning a value function for task-conditioned state descriptions. We evaluate BLINDER on the challenging video game NetHack and a robotic manipulation task. Our method improves task success rate, reduces input size and compute costs, and generalizes between LLM actors.
Submitted 21 July, 2023;
originally announced July 2023.
-
Normal generators of Torelli groups
Authors:
Lei Chen,
Justin Lanier
Abstract:
Johnson showed that the genus 1 bounding pair maps generate the Torelli group of a surface when its genus is at least 3. We show that this generalizes: apart from straightforward exceptions, the bounding pair maps of any fixed genus also generate the Torelli group. We give some applications of this result to normal subgroups of mapping class groups acting on curve complexes.
Submitted 22 February, 2023;
originally announced February 2023.
-
Strong On-Chip Microwave Photon-Magnon Coupling Using Ultra-low Damping Epitaxial Y3Fe5O12 Films at 2 Kelvin
Authors:
Side Guo,
Daniel Russell,
Joseph Lanier,
Haotian Da,
P. Chris Hammel,
Fengyuan Yang
Abstract:
Y3Fe5O12 is arguably the best magnetic material for magnonic quantum information science (QIS) because of its extremely low damping. We report ultralow damping at 2 K in epitaxial Y3Fe5O12 thin films grown on a diamagnetic Y3Sc2Ga3O12 substrate that contains no rare-earth elements. Using these ultralow damping YIG films, we demonstrate for the first time strong coupling between magnons in patterned YIG thin films and microwave photons in a superconducting Nb resonator. This result paves the road towards scalable hybrid quantum systems that integrate superconducting microwave resonators, YIG film magnon conduits, and superconducting qubits into on-chip QIS devices.
Submitted 3 December, 2022;
originally announced December 2022.
-
Steps towards prompt-based creation of virtual worlds
Authors:
Jasmine Roberts,
Andrzej Banburski-Fahey,
Jaron Lanier
Abstract:
Large language models trained for code generation can be applied to speaking virtual worlds into existence (creating virtual worlds). In this work we show that prompt-based methods can both accelerate in-VR level editing, as well as can become part of gameplay rather than just part of game development. As an example, we present Codex VR Pong which shows non-deterministic game mechanics using generative processes to not only create static content but also non-trivial interactions between 3D objects. This demonstration naturally leads to an integral discussion on how one would evaluate and benchmark experiences created by generative models - as there are no qualitative or quantitative metrics that apply in these scenarios. We conclude by discussing impending challenges of AI-assisted co-creation in VR.
Submitted 10 November, 2022;
originally announced November 2022.
-
Twisting cubic rabbits
Authors:
Justin Lanier,
Rebecca R. Winarski
Abstract:
We solve an infinite family of twisted polynomial problems that are cubic generalizations of Hubbard's twisted rabbit problem. We show how the result of twisting by a power of a certain Dehn twist depends on the 9-adic expansion of the power. For the cubic rabbit with three post-critical points, we also give an algorithmic solution to the twisting problem for the full pure mapping class group.
Submitted 13 September, 2022;
originally announced September 2022.
-
Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments
Authors:
JB Lanier,
Stephen McAleer,
Pierre Baldi,
Roy Fox
Abstract:
Robust reinforcement learning (RL) considers the problem of learning policies that perform well in the worst case among a set of possible environment parameter values. In real-world environments, choosing the set of possible values for robust RL can be a difficult task. When that set is specified too narrowly, the agent will be left vulnerable to reasonable parameter values unaccounted for. When specified too broadly, the agent will be too cautious. In this paper, we propose Feasible Adversarial Robust RL (FARR), a novel problem formulation and objective for automatically determining the set of environment parameter values over which to be robust. FARR implicitly defines the set of feasible parameter values as those on which an agent could achieve a benchmark reward given enough training resources. By formulating this problem as a two-player zero-sum game, optimizing the FARR objective jointly produces an adversarial distribution over parameter values with feasible support and a policy robust over this feasible parameter set. We demonstrate that approximate Nash equilibria for this objective can be found using a variation of the PSRO algorithm. Furthermore, we show that an optimal agent trained with FARR is more robust to feasible adversarial parameter selection than with existing minimax, domain-randomization, and regret objectives in a parameterized gridworld and three MuJoCo control environments.
Submitted 3 October, 2022; v1 submitted 19 July, 2022;
originally announced July 2022.
-
Self-Play PSRO: Toward Optimal Populations in Two-Player Zero-Sum Games
Authors:
Stephen McAleer,
JB Lanier,
Kevin Wang,
Pierre Baldi,
Roy Fox,
Tuomas Sandholm
Abstract:
In competitive two-agent environments, deep reinforcement learning (RL) methods based on the Double Oracle (DO) algorithm, such as Policy Space Response Oracles (PSRO) and Anytime PSRO (APSRO), iteratively add RL best response policies to a population. Eventually, an optimal mixture of these population policies will approximate a Nash equilibrium. However, these methods might need to add all deterministic policies before converging. In this work, we introduce Self-Play PSRO (SP-PSRO), a method that adds an approximately optimal stochastic policy to the population in each iteration. Instead of adding only deterministic best responses to the opponent's least exploitable population mixture, SP-PSRO also learns an approximately optimal stochastic policy and adds it to the population as well. As a result, SP-PSRO empirically tends to converge much faster than APSRO and in many games converges in just a few iterations.
Submitted 13 July, 2022;
originally announced July 2022.
-
Non-local Field Theory from Matrix Models
Authors:
Andrzej Banburski,
Jaron Lanier,
Vasudev Shyam,
Lee Smolin,
Yigit Yargic
Abstract:
We show that a class of matrix theories can be understood as an extension of quantum field theory which has non-local interactions. This reformulation is based on the Wigner-Weyl transformation, and the interactions take the form of Moyal product on a doubled geometry. We recover local dynamics on the spacetime as a low-energy limit. This framework opens up the possibility for studying novel high-energy phenomena, including the unification of gauge and geometric symmetries in a gauge theory.
Submitted 27 June, 2022;
originally announced June 2022.
-
Anytime PSRO for Two-Player Zero-Sum Games
Authors:
Stephen McAleer,
Kevin Wang,
John Lanier,
Marc Lanctot,
Pierre Baldi,
Tuomas Sandholm,
Roy Fox
Abstract:
Policy space response oracles (PSRO) is a multi-agent reinforcement learning algorithm that has achieved state-of-the-art performance in very large two-player zero-sum games. PSRO is based on the tabular double oracle (DO) method, an algorithm that is guaranteed to converge to a Nash equilibrium, but may increase exploitability from one iteration to the next. We propose anytime double oracle (ADO), a tabular double oracle algorithm for 2-player zero-sum games that is guaranteed to converge to a Nash equilibrium while decreasing exploitability from one iteration to the next. Unlike DO, in which the restricted distribution is based on the restricted game formed by each player's strategy sets, ADO finds the restricted distribution for each player that minimizes its exploitability against any policy in the full, unrestricted game. We also propose a method of finding this restricted distribution via a no-regret algorithm updated against best responses, called RM-BR DO. Finally, we propose anytime PSRO (APSRO), a version of ADO that calculates best responses via reinforcement learning. In experiments on Leduc poker and random normal form games, we show that our methods achieve far lower exploitability than DO and PSRO and decrease exploitability monotonically.
Submitted 28 January, 2022; v1 submitted 19 January, 2022;
originally announced January 2022.
-
A Cubic Matrix Action for the Standard Model and Beyond
Authors:
Yigit Yargic,
Jaron Lanier,
Lee Smolin,
Dave Wecker
Abstract:
We propose a new framework for matrix theories that are equivalent to field theories on a toroidal spacetime. The correspondence is accomplished via infinite Toeplitz matrices whose entries match the field degrees of freedom on an energy-momentum lattice, thereby replacing the background geometry with matrix indices. These matrix theories can then be embedded into the purely cubic action of a single matrix and combined into a common universality class. We reconstruct the Standard Model action in this framework and discuss its extensions within the same class.
Submitted 11 January, 2022;
originally announced January 2022.
-
Constraining mapping class group homomorphisms using finite subgroups
Authors:
Lei Chen,
Justin Lanier
Abstract:
We classify homomorphisms from mapping class groups by using finite subgroups. First, we give a new proof of a result of Aramayona--Souto that homomorphisms between mapping class groups of closed surfaces are trivial for a range of genera. Second, we show that only finitely many mapping class groups of closed surfaces have non-trivial homomorphisms into $\text{Homeo}(\mathbb{S}^n)$ for any $n$. We also prove that every homomorphism from $\text{Mod}(S_g)$ to $\text{Homeo}(\mathbb{S}^2)$ or $\text{Homeo}(\mathbb{S}^3)$ is trivial if $g\ge 3$, extending a result of Franks--Handel.
Submitted 14 December, 2021;
originally announced December 2021.
-
Curve graphs of surfaces with finite-invariance index 1
Authors:
Justin Lanier,
Marissa Loving
Abstract:
In this note we make progress toward a conjecture of Durham--Fanoni--Vlamis, showing that every infinite-type surface with finite-invariance index 1 and no nondisplaceable compact subsurfaces fails to have a good curve graph, that is, a connected graph where vertices represent homotopy classes of essential simple closed curves and where the natural mapping class group action has infinite diameter orbits. Our arguments use tools developed by Mann--Rafi in their study of the coarse geometry of big mapping class groups.
Submitted 14 September, 2021; v1 submitted 28 June, 2021;
originally announced June 2021.
-
Improving Social Welfare While Preserving Autonomy via a Pareto Mediator
Authors:
Stephen McAleer,
John Lanier,
Michael Dennis,
Pierre Baldi,
Roy Fox
Abstract:
Machine learning algorithms often make decisions on behalf of agents with varied and sometimes conflicting interests. In domains where agents can choose to take their own action or delegate their action to a central mediator, an open question is how mediators should take actions on behalf of delegating agents. The main existing approach uses delegating agents to punish non-delegating agents in an attempt to get all agents to delegate, which tends to be costly for all. We introduce a Pareto Mediator which aims to improve outcomes for delegating agents without making any of them worse off. Our experiments in random normal form games, a restaurant recommendation game, and a reinforcement learning sequential social dilemma show that the Pareto Mediator greatly increases social welfare. Also, even when the Pareto Mediator is based on an incorrect model of agent utility, performance gracefully degrades to the pre-intervention level, due to the individual autonomy preserved by the voluntary mediator.
Submitted 7 June, 2021;
originally announced June 2021.
-
Mapping class groups with the Rokhlin property
Authors:
Justin Lanier,
Nicholas G. Vlamis
Abstract:
We classify the connected orientable 2-manifolds whose mapping class groups have a dense conjugacy class. We also show that the mapping class group of a connected orientable 2-manifold has a comeager conjugacy class if and only if the mapping class group is trivial.
Submitted 19 July, 2022; v1 submitted 24 May, 2021;
originally announced May 2021.
-
The Autodidactic Universe
Authors:
Stephon Alexander,
William J. Cunningham,
Jaron Lanier,
Lee Smolin,
Stefan Stanojevic,
Michael W. Toomey,
Dave Wecker
Abstract:
We present an approach to cosmology in which the Universe learns its own physical laws. It does so by exploring a landscape of possible laws, which we express as a certain class of matrix models. We discover maps that put each of these matrix models in correspondence with both a gauge/gravity theory and a mathematical model of a learning machine, such as a deep recurrent, cyclic neural network. This establishes a correspondence between each solution of the physical theory and a run of a neural network. This correspondence is not an equivalence, partly because gauge theories emerge from $N \rightarrow \infty $ limits of the matrix models, whereas the same limits of the neural networks used here are not well-defined. We discuss in detail what it means to say that learning takes place in autodidactic systems, where there is no supervision. We propose that if the neural network model can be said to learn without supervision, the same can be said for the corresponding physical theory. We consider other protocols for autodidactic physical systems, such as optimization of graph variety, subset-replication using self-attention and look-ahead, geometrogenesis guided by reinforcement learning, structural learning using renormalization group techniques, and extensions. These protocols together provide a number of directions in which to explore the origin of physical laws based on putting machine learning architectures in correspondence with physical theories.
Submitted 2 September, 2021; v1 submitted 28 March, 2021;
originally announced April 2021.
-
XDO: A Double Oracle Algorithm for Extensive-Form Games
Authors:
Stephen McAleer,
John Lanier,
Kevin Wang,
Pierre Baldi,
Roy Fox
Abstract:
Policy Space Response Oracles (PSRO) is a reinforcement learning (RL) algorithm for two-player zero-sum games that has been empirically shown to find approximate Nash equilibria in large games. Although PSRO is guaranteed to converge to an approximate Nash equilibrium and can handle continuous actions, it may take an exponential number of iterations as the number of information states (infostates) grows. We propose Extensive-Form Double Oracle (XDO), an extensive-form double oracle algorithm for two-player zero-sum games that is guaranteed to converge to an approximate Nash equilibrium linearly in the number of infostates. Unlike PSRO, which mixes best responses at the root of the game, XDO mixes best responses at every infostate. We also introduce Neural XDO (NXDO), where the best response is learned through deep RL. In tabular experiments on Leduc poker, we find that XDO achieves an approximate Nash equilibrium in a number of iterations an order of magnitude smaller than PSRO. Experiments on a modified Leduc poker game and Oshi-Zumo show that tabular XDO achieves a lower exploitability than CFR with the same amount of computation. We also find that NXDO outperforms PSRO and NFSP on a sequential multidimensional continuous-action game. NXDO is the first deep RL method that can find an approximate Nash equilibrium in high-dimensional continuous-action sequential games. Experiment code is available at https://github.com/indylab/nxdo.
Submitted 28 January, 2022; v1 submitted 10 March, 2021;
originally announced March 2021.
-
Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games
Authors:
Stephen McAleer,
John Lanier,
Roy Fox,
Pierre Baldi
Abstract:
Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game theory that is guaranteed to converge to an approximate Nash equilibrium. However, PSRO requires training a reinforcement learning policy at each iteration, making it too slow for large games. We show through counterexamples and experiments that DCH and Rectified PSRO, two existing approaches to scaling up PSRO, fail to converge even in small games. We introduce Pipeline PSRO (P2SRO), the first scalable general method for finding approximate Nash equilibria in large zero-sum imperfect-information games. P2SRO is able to parallelize PSRO with convergence guarantees by maintaining a hierarchical pipeline of reinforcement learning workers, each training against the policies generated by lower levels in the hierarchy. We show that unlike existing methods, P2SRO converges to an approximate Nash equilibrium, and does so faster as the number of parallel workers increases, across a variety of imperfect information games. We also introduce an open-source environment for Barrage Stratego, a variant of Stratego with an approximate game tree complexity of $10^{50}$. P2SRO is able to achieve state-of-the-art performance on Barrage Stratego and beats all existing bots. Experiment code is available at https://github.com/JBLanier/pipeline-psro.
Submitted 18 February, 2021; v1 submitted 15 June, 2020;
originally announced June 2020.
-
ColosseumRL: A Framework for Multiagent Reinforcement Learning in $N$-Player Games
Authors:
Alexander Shmakov,
John Lanier,
Stephen McAleer,
Rohan Achar,
Cristina Lopes,
Pierre Baldi
Abstract:
Much of recent success in multiagent reinforcement learning has been in two-player zero-sum games. In these games, algorithms such as fictitious self-play and minimax tree search can converge to an approximate Nash equilibrium. While playing a Nash equilibrium strategy in a two-player zero-sum game is optimal, in an $n$-player general sum game, it becomes a much less informative solution concept. Despite the lack of a satisfying solution concept, $n$-player games form the vast majority of real-world multiagent situations. In this paper we present a new framework for research in reinforcement learning in $n$-player games. We hope that by analyzing behavior learned by agents in these environments the community can better understand this important research area and move toward meaningful solution concepts and research directions. The implementation and additional information about this framework can be found at https://colosseumrl.igb.uci.edu/.
Submitted 9 December, 2019;
originally announced December 2019.
-
OffWorld Gym: open-access physical robotics environment for real-world reinforcement learning benchmark and research
Authors:
Ashish Kumar,
Toby Buckley,
John B. Lanier,
Qiaozhi Wang,
Alicia Kavelaars,
Ilya Kuzovkin
Abstract:
Success stories of applied machine learning can be traced back to the datasets and environments that were put forward as challenges for the community. The challenge that the community sets as a benchmark is usually the challenge that the community eventually solves. The ultimate challenge of reinforcement learning research is to train real agents to operate in the real environment, but until now there has not been a common real-world RL benchmark. In this work, we present a prototype real-world environment from OffWorld Gym -- a collection of real-world environments for reinforcement learning in robotics with free public remote access. Close integration into existing ecosystem allows the community to start using OffWorld Gym without any prior experience in robotics and takes away the burden of managing a physical robotics system, abstracting it under a familiar API. We introduce a navigation task, where a robot has to reach a visual beacon on an uneven terrain using only the camera input and provide baseline results in both the real environment and the simulated replica. To start training, visit https://gym.offworld.ai
Submitted 14 December, 2020; v1 submitted 18 October, 2019;
originally announced October 2019.
-
Recognizing topological polynomials by lifting trees
Authors:
James Belk,
Justin Lanier,
Dan Margalit,
Rebecca R. Winarski
Abstract:
We give a simple algorithm that determines whether a given post-critically finite topological polynomial is Thurston equivalent to a polynomial. If it is, the algorithm produces the Hubbard tree; otherwise, the algorithm produces the canonical obstruction. Our approach is rooted in geometric group theory, using iteration on a simplicial complex of trees, and building on work of Nekrashevych. As one application of our methods, we resolve the polynomial case of Pilgrim's finite global attractor conjecture. We also give a new solution to Hubbard's twisted rabbit problem, and we state and solve several generalizations of Hubbard's problem where the number of post-critical points is arbitrarily large.
Submitted 24 November, 2021; v1 submitted 18 June, 2019;
originally announced June 2019.
-
Curiosity-Driven Multi-Criteria Hindsight Experience Replay
Authors:
John B. Lanier,
Stephen McAleer,
Pierre Baldi
Abstract:
Dealing with sparse rewards is a longstanding challenge in reinforcement learning. The recent use of hindsight methods has achieved success on a variety of sparse-reward tasks, but they fail on complex tasks such as stacking multiple blocks with a robot arm in simulation. Curiosity-driven exploration using the prediction error of a learned dynamics model as an intrinsic reward has been shown to be effective for exploring a number of sparse-reward environments. We present a method that combines hindsight with curiosity-driven exploration and curriculum learning in order to solve the challenging sparse-reward block stacking task. We are the first to stack more than two blocks using only sparse reward without human demonstrations.
Submitted 9 June, 2019;
originally announced June 2019.
-
Centers of subgroups of big mapping class groups and the Tits alternative
Authors:
Justin Lanier,
Marissa Loving
Abstract:
In this note we show that many subgroups of mapping class groups of infinite-type surfaces without boundary have trivial centers, including all normal subgroups. Using similar techniques, we show that every nontrivial normal subgroup of a big mapping class group contains a nonabelian free group. In contrast, we show that no big mapping class group satisfies the strong Tits alternative enjoyed by finite-type mapping class groups. We also give examples of big mapping class groups that fail to satisfy even the classical Tits alternative and give a proof that every countable group appears as a subgroup of some big mapping class group.
Submitted 22 April, 2019;
originally announced April 2019.
-
Adding a point to configurations in closed balls
Authors:
Lei Chen,
Nir Gadish,
Justin Lanier
Abstract:
We answer the question of when a new point can be added in a continuous way to configurations of $n$ distinct points in a closed ball of arbitrary dimension. We show that this is possible given an ordered configuration of $n$ points if and only if $n \neq 1$. On the other hand, when the points are not ordered and the dimension of the ball is at least 2, a point can be added continuously if and only if $n = 2$. These results generalize the Brouwer fixed-point theorem, which gives the negative answer when $n=1$. We also show that when $n=2$, there is a unique solution to both the ordered and unordered versions of the problem up to homotopy.
Submitted 7 May, 2019; v1 submitted 18 September, 2018;
originally announced September 2018.
-
How to hear the shape of a billiard table
Authors:
Aaron Calderon,
Solly Coles,
Diana Davis,
Justin Lanier,
Andre Oliveira
Abstract:
The bounce spectrum of a polygonal billiard table is the collection of all bi-infinite sequences of edge labels corresponding to billiard trajectories on the table. We give methods for reconstructing from the bounce spectrum of a polygonal billiard table both the cyclic ordering of its edge labels and the sizes of its angles. We also show that it is impossible to reconstruct the exact shape of a polygonal billiard table from any finite collection of finite words from its bounce spectrum.
Submitted 25 June, 2018;
originally announced June 2018.
-
Normal generators for mapping class groups are abundant
Authors:
Justin Lanier,
Dan Margalit
Abstract:
We provide a simple criterion for an element of the mapping class group of a closed surface to have normal closure equal to the whole mapping class group. We apply this to show that every nontrivial periodic mapping class that is not a hyperelliptic involution is a normal generator for the mapping class group when the genus is at least 3. We also give many examples of pseudo-Anosov normal generators, answering a question of D. D. Long. In fact we show that every pseudo-Anosov mapping class with stretch factor less than $\sqrt{2}$ is a normal generator. Even more, we give pseudo-Anosov normal generators with arbitrarily large stretch factors and arbitrarily large translation lengths on the curve graph, disproving a conjecture of Ivanov.
Submitted 1 June, 2020; v1 submitted 9 May, 2018;
originally announced May 2018.
-
Generating mapping class groups with elements of fixed finite order
Authors:
Justin Lanier
Abstract:
We show that for any $k$ at least $6$ and $g$ sufficiently large, the mapping class group of a surface of genus $g$ can be generated by three elements of order $k$. We also show that this can be done with four elements of order $5$. We additionally prove similar results for some permutation groups, linear groups, and automorphism groups of free groups.
Submitted 12 October, 2017;
originally announced October 2017.