Search | arXiv e-print repository

Evolution through Large Models

Authors: Joel Lehman, Jonathan Gordon, Shawn Jain, Kamal Ndousse, Cathy Yeh, Kenneth O. Stanley

Abstract: This paper pursues the insight that large language models (LLMs) trained to generate code can vastly improve the effectiveness of mutation operators applied to programs in genetic programming (GP). Because such LLMs benefit from training data that includes sequential changes and modifications, they can approximate likely changes that humans would make. To highlight the breadth of implications of s… ▽ More This paper pursues the insight that large language models (LLMs) trained to generate code can vastly improve the effectiveness of mutation operators applied to programs in genetic programming (GP). Because such LLMs benefit from training data that includes sequential changes and modifications, they can approximate likely changes that humans would make. To highlight the breadth of implications of such evolution through large models (ELM), in the main experiment ELM combined with MAP-Elites generates hundreds of thousands of functional examples of Python programs that output working ambulating robots in the Sodarace domain, which the original LLM had never seen in pre-training. These examples then help to bootstrap training a new conditional language model that can output the right walker for a particular terrain. The ability to bootstrap new models that can output appropriate artifacts for a given context in a domain where zero training data was previously available carries implications for open-endedness, deep learning, and reinforcement learning. These implications are explored here in depth in the hope of inspiring new directions of research now opened up by ELM. △ Less

Submitted 17 June, 2022; originally announced June 2022.

arXiv:2106.12070 [pdf, other]

Towards Consistent Predictive Confidence through Fitted Ensembles

Authors: Navid Kardan, Ankit Sharma, Kenneth O. Stanley

Abstract: Deep neural networks are behind many of the recent successes in machine learning applications. However, these models can produce overconfident decisions while encountering out-of-distribution (OOD) examples or making a wrong prediction. This inconsistent predictive confidence limits the integration of independently-trained learning models into a larger system. This paper introduces separable conce… ▽ More Deep neural networks are behind many of the recent successes in machine learning applications. However, these models can produce overconfident decisions while encountering out-of-distribution (OOD) examples or making a wrong prediction. This inconsistent predictive confidence limits the integration of independently-trained learning models into a larger system. This paper introduces separable concept learning framework to realistically measure the performance of classifiers in presence of OOD examples. In this setup, several instances of a classifier are trained on different parts of a partition of the set of classes. Later, the performance of the combination of these models is evaluated on a separate test set. Unlike current OOD detection techniques, this framework does not require auxiliary OOD datasets and does not separate classification from detection performance. Furthermore, we present a new strong baseline for more consistent predictive confidence in deep models, called fitted ensembles, where overconfident predictions are rectified by transformed versions of the original classification task. Fitted ensembles can naturally detect OOD examples without requiring auxiliary data by observing contradicting predictions among its components. Experiments on MNIST, SVHN, CIFAR-10/100, and ImageNet show fitted ensemble significantly outperform conventional ensembles on OOD examples and are possible to scale. △ Less

Submitted 22 June, 2021; originally announced June 2021.

Comments: IJCNN 2021

arXiv:2005.13092 [pdf, other]

Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search

Authors: Aditya Rawal, Joel Lehman, Felipe Petroski Such, Jeff Clune, Kenneth O. Stanley

Abstract: Neural Architecture Search (NAS) explores a large space of architectural motifs -- a compute-intensive process that often involves ground-truth evaluation of each motif by instantiating it within a large network, and training and evaluating the network with thousands of domain-specific data samples. Inspired by how biological motifs such as cells are sometimes extracted from their natural environm… ▽ More Neural Architecture Search (NAS) explores a large space of architectural motifs -- a compute-intensive process that often involves ground-truth evaluation of each motif by instantiating it within a large network, and training and evaluating the network with thousands of domain-specific data samples. Inspired by how biological motifs such as cells are sometimes extracted from their natural environment and studied in an artificial Petri dish setting, this paper proposes the Synthetic Petri Dish model for evaluating architectural motifs. In the Synthetic Petri Dish, architectural motifs are instantiated in very small networks and evaluated using very few learned synthetic data samples (to effectively approximate performance in the full problem). The relative performance of motifs in the Synthetic Petri Dish can substitute for their ground-truth performance, thus accelerating the most expensive step of NAS. Unlike other neural network-based prediction models that parse the structure of the motif to estimate its performance, the Synthetic Petri Dish predicts motif performance by training the actual motif in an artificial setting, thus deriving predictions from its true intrinsic properties. Experiments in this paper demonstrate that the Synthetic Petri Dish can therefore predict the performance of new motifs with significantly higher accuracy, especially when insufficient ground truth data is available. Our hope is that this work can inspire a new research direction in studying the performance of extracted components of models in an alternative controlled setting. △ Less

Submitted 26 May, 2020; originally announced May 2020.

arXiv:2004.12919 [pdf, other]

doi 10.1038/s41586-020-03157-9

First return, then explore

Authors: Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Abstract: The promise of reinforcement learning is to solve complex sequential decision problems autonomously by specifying a high-level reward function only. However, reinforcement learning algorithms struggle when, as is often the case, simple and intuitive rewards provide sparse and deceptive feedback. Avoiding these pitfalls requires thoroughly exploring the environment, but creating algorithms that can… ▽ More The promise of reinforcement learning is to solve complex sequential decision problems autonomously by specifying a high-level reward function only. However, reinforcement learning algorithms struggle when, as is often the case, simple and intuitive rewards provide sparse and deceptive feedback. Avoiding these pitfalls requires thoroughly exploring the environment, but creating algorithms that can do so remains one of the central challenges of the field. We hypothesise that the main impediment to effective exploration originates from algorithms forgetting how to reach previously visited states ("detachment") and from failing to first return to a state before exploring from it ("derailment"). We introduce Go-Explore, a family of algorithms that addresses these two challenges directly through the simple principles of explicitly remembering promising states and first returning to such states before intentionally exploring. Go-Explore solves all heretofore unsolved Atari games and surpasses the state of the art on all hard-exploration games, with orders of magnitude improvements on the grand challenges Montezuma's Revenge and Pitfall. We also demonstrate the practical potential of Go-Explore on a sparse-reward pick-and-place robotics task. Additionally, we show that adding a goal-conditioned policy can further improve Go-Explore's exploration efficiency and enable it to handle stochasticity throughout training. The substantial performance gains from Go-Explore suggest that the simple principles of remembering states, returning to them, and exploring from them are a powerful and general approach to exploration, an insight that may prove critical to the creation of truly intelligent learning agents. △ Less

Submitted 16 September, 2021; v1 submitted 27 April, 2020; originally announced April 2020.

Comments: 47 pages, 14 figures, 4 tables; reorganized sections and modified SI text extensively; added reference to the published version, changed title to published title; added reference to published unformatted pdf

Journal ref: Nature 590, 580-586 (2021)

arXiv:2003.11164 [pdf, other]

Fiber: A Platform for Efficient Development and Distributed Training for Reinforcement Learning and Population-Based Methods

Authors: Jiale Zhi, Rui Wang, Jeff Clune, Kenneth O. Stanley

Abstract: Recent advances in machine learning are consistently enabled by increasing amounts of computation. Reinforcement learning (RL) and population-based methods in particular pose unique challenges for efficiency and flexibility to the underlying distributed computing frameworks. These challenges include frequent interaction with simulations, the need for dynamic scaling, and the need for a user interf… ▽ More Recent advances in machine learning are consistently enabled by increasing amounts of computation. Reinforcement learning (RL) and population-based methods in particular pose unique challenges for efficiency and flexibility to the underlying distributed computing frameworks. These challenges include frequent interaction with simulations, the need for dynamic scaling, and the need for a user interface with low adoption cost and consistency across different backends. In this paper we address these challenges while still retaining development efficiency and flexibility for both research and practical applications by introducing Fiber, a scalable distributed computing framework for RL and population-based methods. Fiber aims to significantly expand the accessibility of large-scale parallel computation to users of otherwise complicated RL and population-based approaches without the need to for specialized computational expertise. △ Less

Submitted 24 March, 2020; originally announced March 2020.

arXiv:2003.08536 [pdf, other]

Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions

Authors: Rui Wang, Joel Lehman, Aditya Rawal, Jiale Zhi, Yulun Li, Jeff Clune, Kenneth O. Stanley

Abstract: Creating open-ended algorithms, which generate their own never-ending stream of novel and appropriately challenging learning opportunities, could help to automate and accelerate progress in machine learning. A recent step in this direction is the Paired Open-Ended Trailblazer (POET), an algorithm that generates and solves its own challenges, and allows solutions to goal-switch between challenges t… ▽ More Creating open-ended algorithms, which generate their own never-ending stream of novel and appropriately challenging learning opportunities, could help to automate and accelerate progress in machine learning. A recent step in this direction is the Paired Open-Ended Trailblazer (POET), an algorithm that generates and solves its own challenges, and allows solutions to goal-switch between challenges to avoid local optima. However, the original POET was unable to demonstrate its full creative potential because of limitations of the algorithm itself and because of external issues including a limited problem space and lack of a universal progress measure. Importantly, both limitations pose impediments not only for POET, but for the pursuit of open-endedness in general. Here we introduce and empirically validate two new innovations to the original algorithm, as well as two external innovations designed to help elucidate its full potential. Together, these four advances enable the most open-ended algorithmic demonstration to date. The algorithmic innovations are (1) a domain-general measure of how meaningfully novel new challenges are, enabling the system to potentially create and solve interesting challenges endlessly, and (2) an efficient heuristic for determining when agents should goal-switch from one problem to another (hel** open-ended search better scale). Outside the algorithm itself, to enable a more definitive demonstration of open-endedness, we introduce (3) a novel, more flexible way to encode environmental challenges, and (4) a generic measure of the extent to which a system continues to exhibit open-ended innovation. Enhanced POET produces a diverse range of sophisticated behaviors that solve a wide range of environmental challenges, many of which cannot be solved through other means. △ Less

Submitted 13 April, 2020; v1 submitted 18 March, 2020; originally announced March 2020.

Comments: 23 pages, 14 figures

arXiv:2002.10585 [pdf, other]

Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity

Authors: Thomas Miconi, Aditya Rawal, Jeff Clune, Kenneth O. Stanley

Abstract: The impressive lifelong learning in animal brains is primarily enabled by plastic changes in synaptic connectivity. Importantly, these changes are not passive, but are actively controlled by neuromodulation, which is itself under the control of the brain. The resulting self-modifying abilities of the brain play an important role in learning and adaptation, and are a major basis for biological rein… ▽ More The impressive lifelong learning in animal brains is primarily enabled by plastic changes in synaptic connectivity. Importantly, these changes are not passive, but are actively controlled by neuromodulation, which is itself under the control of the brain. The resulting self-modifying abilities of the brain play an important role in learning and adaptation, and are a major basis for biological reinforcement learning. Here we show for the first time that artificial neural networks with such neuromodulated plasticity can be trained with gradient descent. Extending previous work on differentiable Hebbian plasticity, we propose a differentiable formulation for the neuromodulation of plasticity. We show that neuromodulated plasticity improves the performance of neural networks on both reinforcement learning and supervised learning tasks. In one task, neuromodulated plastic LSTMs with millions of parameters outperform standard LSTMs on a benchmark language modeling task (controlling for the number of parameters). We conclude that differentiable neuromodulation of plasticity offers a powerful new framework for training neural networks. △ Less

Submitted 24 February, 2020; originally announced February 2020.

Comments: Presented at the 7th International Conference on Learning Representations (ICLR 2019)

Journal ref: 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019

arXiv:2002.09571 [pdf, other]

Learning to Continually Learn

Authors: Shawn Beaulieu, Lapo Frati, Thomas Miconi, Joel Lehman, Kenneth O. Stanley, Jeff Clune, Nick Cheney

Abstract: Continual lifelong learning requires an agent or model to learn many sequentially ordered tasks, building on previous knowledge without catastrophically forgetting it. Much work has gone towards preventing the default tendency of machine learning models to catastrophically forget, yet virtually all such work involves manually-designed solutions to the problem. We instead advocate meta-learning a s… ▽ More Continual lifelong learning requires an agent or model to learn many sequentially ordered tasks, building on previous knowledge without catastrophically forgetting it. Much work has gone towards preventing the default tendency of machine learning models to catastrophically forget, yet virtually all such work involves manually-designed solutions to the problem. We instead advocate meta-learning a solution to catastrophic forgetting, allowing AI to learn to continually learn. Inspired by neuromodulatory processes in the brain, we propose A Neuromodulated Meta-Learning Algorithm (ANML). It differentiates through a sequential learning process to meta-learn an activation-gating function that enables context-dependent selective activation within a deep neural network. Specifically, a neuromodulatory (NM) neural network gates the forward pass of another (otherwise normal) neural network called the prediction learning network (PLN). The NM network also thus indirectly controls selective plasticity (i.e. the backward pass of) the PLN. ANML enables continual learning without catastrophic forgetting at scale: it produces state-of-the-art continual learning performance, sequentially learning as many as 600 classes (over 9,000 SGD updates). △ Less

Submitted 3 March, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

arXiv:2001.01683 [pdf, other]

Deep Innovation Protection: Confronting the Credit Assignment Problem in Training Heterogeneous Neural Architectures

Authors: Sebastian Risi, Kenneth O. Stanley

Abstract: Deep reinforcement learning approaches have shown impressive results in a variety of different domains, however, more complex heterogeneous architectures such as world models require the different neural components to be trained separately instead of end-to-end. While a simple genetic algorithm recently showed end-to-end training is possible, it failed to solve a more complex 3D task. This paper p… ▽ More Deep reinforcement learning approaches have shown impressive results in a variety of different domains, however, more complex heterogeneous architectures such as world models require the different neural components to be trained separately instead of end-to-end. While a simple genetic algorithm recently showed end-to-end training is possible, it failed to solve a more complex 3D task. This paper presents a method called Deep Innovation Protection (DIP) that addresses the credit assignment problem in training complex heterogenous neural network models end-to-end for such environments. The main idea behind the approach is to employ multiobjective optimization to temporally reduce the selection pressure on specific components in multi-component network, allowing other components to adapt. We investigate the emergent representations of these evolved networks, which learn to predict properties important for the survival of the agent, without the need for a specific forward-prediction loss. △ Less

Submitted 23 February, 2021; v1 submitted 29 December, 2019; originally announced January 2020.

arXiv:1912.07768 [pdf, other]

Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data

Authors: Felipe Petroski Such, Aditya Rawal, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Abstract: This paper investigates the intriguing question of whether we can create learning algorithms that automatically generate training data, learning environments, and curricula in order to help AI agents rapidly learn. We show that such algorithms are possible via Generative Teaching Networks (GTNs), a general approach that is, in theory, applicable to supervised, unsupervised, and reinforcement learn… ▽ More This paper investigates the intriguing question of whether we can create learning algorithms that automatically generate training data, learning environments, and curricula in order to help AI agents rapidly learn. We show that such algorithms are possible via Generative Teaching Networks (GTNs), a general approach that is, in theory, applicable to supervised, unsupervised, and reinforcement learning, although our experiments only focus on the supervised case. GTNs are deep neural networks that generate data and/or training environments that a learner (e.g. a freshly initialized neural network) trains on for a few SGD steps before being tested on a target task. We then differentiate through the entire learning process via meta-gradients to update the GTN parameters to improve performance on the target task. GTNs have the beneficial property that they can theoretically generate any type of data or training environment, making their potential impact large. This paper introduces GTNs, discusses their potential, and showcases that they can substantially accelerate learning. We also demonstrate a practical and exciting application of GTNs: accelerating the evaluation of candidate architectures for neural architecture search (NAS), which is rate-limited by such evaluations, enabling massive speed-ups in NAS. GTN-NAS improves the NAS state of the art, finding higher performing architectures when controlling for the search proposal mechanism. GTN-NAS also is competitive with the overall state of the art approaches, which achieve top performance while using orders of magnitude less computation than typical NAS methods. Speculating forward, GTNs may represent a first step toward the ambitious goal of algorithms that generate their own training data and, in doing so, open a variety of interesting new research questions and directions. △ Less

Submitted 16 December, 2019; originally announced December 2019.

arXiv:1909.04430 [pdf]

doi 10.1162/artl_a_00291

An Overview of Open-Ended Evolution: Editorial Introduction to the Open-Ended Evolution II Special Issue

Authors: Norman Packard, Mark A. Bedau, Alastair Channon, Takashi Ikegami, Steen Rasmussen, Kenneth O. Stanley, Tim Taylor

Abstract: Nature's spectacular inventiveness, reflected in the enormous diversity of form and function displayed by the biosphere, is a feature of life that distinguishes living most strongly from nonliving. It is, therefore, not surprising that this aspect of life should become a central focus of artificial life. We have known since Darwin that the diversity is produced dynamically, through the process of… ▽ More Nature's spectacular inventiveness, reflected in the enormous diversity of form and function displayed by the biosphere, is a feature of life that distinguishes living most strongly from nonliving. It is, therefore, not surprising that this aspect of life should become a central focus of artificial life. We have known since Darwin that the diversity is produced dynamically, through the process of evolution; this has led life's creative productivity to be called Open-Ended Evolution (OEE) in the field. This article introduces the second of two special issues on current research in OEE and provides an overview of the contents of both special issues. Most of the work was presented at a workshop on open-ended evolution that was held as a part of the 2018 Conference on Artificial Life in Tokyo, and much of it had antecedents in two previous workshops on open-ended evolution at artificial life conferences in Cancun and York. We present a simplified categorization of OEE and summarize progress in the field as represented by the articles in this special issue. △ Less

Submitted 10 September, 2019; originally announced September 2019.

Comments: This article is published in the Artificial Life journal (https://www.mitpressjournals.org/loi/artl) and is copyright (c) 2019 Massachusetts Institute of Technology. It it posted on arXiv.org after the publication embargo period in accordance with MIT Press Journals' author posting guidelines (https://www.mitpressjournals.org/for_authors#authorposting)

Journal ref: Artificial Life, 25(2), pp. 93-103, 2019

arXiv:1907.06077 [pdf, other]

Evolvability ES: Scalable and Direct Optimization of Evolvability

Authors: Alexander Gajewski, Jeff Clune, Kenneth O. Stanley, Joel Lehman

Abstract: Designing evolutionary algorithms capable of uncovering highly evolvable representations is an open challenge; such evolvability is important because it accelerates evolution and enables fast adaptation to changing circumstances. This paper introduces evolvability ES, an evolutionary algorithm designed to explicitly and efficiently optimize for evolvability, i.e. the ability to further adapt. The… ▽ More Designing evolutionary algorithms capable of uncovering highly evolvable representations is an open challenge; such evolvability is important because it accelerates evolution and enables fast adaptation to changing circumstances. This paper introduces evolvability ES, an evolutionary algorithm designed to explicitly and efficiently optimize for evolvability, i.e. the ability to further adapt. The insight is that it is possible to derive a novel objective in the spirit of natural evolution strategies that maximizes the diversity of behaviors exhibited when an individual is subject to random mutations, and that efficiently scales with computation. Experiments in 2-D and 3-D locomotion tasks highlight the potential of evolvability ES to generate solutions with tens of thousands of parameters that can quickly be adapted to solve different tasks and that can productively seed further evolution. We further highlight a connection between evolvability and a recent and popular gradient-based meta-learning algorithm called MAML; results show that evolvability ES can perform competitively with MAML and that it discovers solutions with distinct properties. The conclusion is that evolvability ES opens up novel research directions for studying and exploiting the potential of evolvable representations for deep neural networks. △ Less

Submitted 13 July, 2019; originally announced July 2019.

Comments: Published in GECCO 2019

arXiv:1906.08857 [pdf, other]

Deep Neuroevolution of Recurrent and Discrete World Models

Authors: Sebastian Risi, Kenneth O. Stanley

Abstract: Neural architectures inspired by our own human cognitive system, such as the recently introduced world models, have been shown to outperform traditional deep reinforcement learning (RL) methods in a variety of different domains. Instead of the relatively simple architectures employed in most RL experiments, world models rely on multiple different neural components that are responsible for visual i… ▽ More Neural architectures inspired by our own human cognitive system, such as the recently introduced world models, have been shown to outperform traditional deep reinforcement learning (RL) methods in a variety of different domains. Instead of the relatively simple architectures employed in most RL experiments, world models rely on multiple different neural components that are responsible for visual information processing, memory, and decision-making. However, so far the components of these models have to be trained separately and through a variety of specialized training methods. This paper demonstrates the surprising finding that models with the same precise parts can be instead efficiently trained end-to-end through a genetic algorithm (GA), reaching a comparable performance to the original world model by solving a challenging car racing task. An analysis of the evolved visual and memory system indicates that they include a similar effective representation to the system trained through gradient descent. Additionally, in contrast to gradient descent methods that struggle with discrete variables, GAs also work directly with such representations, opening up opportunities for classical planning in latent space. This paper adds additional evidence on the effectiveness of deep neuroevolution for tasks that require the intricate orchestration of multiple components in complex heterogeneous architectures. △ Less

Submitted 28 April, 2019; originally announced June 2019.

arXiv:1901.10995 [pdf, other]

Go-Explore: a New Approach for Hard-Exploration Problems

Authors: Adrien Ecoffet, Joost Huizinga, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Abstract: A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. To… ▽ More A grand challenge in reinforcement learning is intelligent exploration, especially when rewards are sparse or deceptive. Two Atari games serve as benchmarks for such hard-exploration domains: Montezuma's Revenge and Pitfall. On both games, current RL algorithms perform poorly, even those with intrinsic motivation, which is the dominant method to improve performance on hard-exploration domains. To address this shortfall, we introduce a new algorithm called Go-Explore. It exploits the following principles: (1) remember previously visited states, (2) first return to a promising state (without exploration), then explore from it, and (3) solve simulated environments through any available means (including by introducing determinism), then robustify via imitation learning. The combined effect of these principles is a dramatic performance improvement on hard-exploration problems. On Montezuma's Revenge, Go-Explore scores a mean of over 43k points, almost 4 times the previous state of the art. Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge. Its max performance of nearly 18 million surpasses the human world record, meeting even the strictest definition of "superhuman" performance. On Pitfall, Go-Explore with domain knowledge is the first algorithm to score above zero. Its mean score of almost 60k points exceeds expert human performance. Because Go-Explore produces high-performing demonstrations automatically and cheaply, it also outperforms imitation learning work where humans provide solution demonstrations. Go-Explore opens up many new research directions into improving it and weaving its insights into current RL algorithms. It may also enable progress on previously unsolvable hard-exploration problems in many domains, especially those that harness a simulator during training (e.g. robotics). △ Less

Submitted 26 February, 2021; v1 submitted 30 January, 2019; originally announced January 2019.

Comments: 37 pages, 14 figures; added references to Goyal et al. and Oh et al., updated reference to Colas et al; updated author emails; point readers to updated paper

arXiv:1901.01753 [pdf, other]

Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions

Authors: Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley

Abstract: While the history of machine learning so far largely encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. Such a process would in effect build its own diverse and expanding curricula, and the solutions to problems at vario… ▽ More While the history of machine learning so far largely encompasses a series of problems posed by researchers and algorithms that learn their solutions, an important question is whether the problems themselves can be generated by the algorithm at the same time as they are being solved. Such a process would in effect build its own diverse and expanding curricula, and the solutions to problems at various stages would become step** stones towards solving even more challenging problems later in the process. The Paired Open-Ended Trailblazer (POET) algorithm introduced in this paper does just that: it pairs the generation of environmental challenges and the optimization of agents to solve those challenges. It simultaneously explores many different paths through the space of possible problems and solutions and, critically, allows these step**-stone solutions to transfer between problems if better, catalyzing innovation. The term open-ended signifies the intriguing potential for algorithms like POET to continue to create novel and increasingly complex capabilities without bound. Our results show that POET produces a diverse range of sophisticated behaviors that solve a wide range of environmental challenges, many of which cannot be solved by direct optimization alone, or even through a direct-path curriculum-building control algorithm introduced to highlight the critical role of open-endedness in solving ambitious challenges. The ability to transfer solutions from one environment to another proves essential to unlocking the full potential of the system as a whole, demonstrating the unpredictable nature of fortuitous step** stones. We hope that POET will inspire a new push towards open-ended discovery across many domains, where algorithms like POET can blaze a trail through their interesting possible manifestations and solutions. △ Less

Submitted 20 February, 2019; v1 submitted 7 January, 2019; originally announced January 2019.

Comments: 28 pages, 9 figures

arXiv:1805.01141 [pdf, other]

VINE: An Open Source Interactive Data Visualization Tool for Neuroevolution

Authors: Rui Wang, Jeff Clune, Kenneth O. Stanley

Abstract: Recent advances in deep neuroevolution have demonstrated that evolutionary algorithms, such as evolution strategies (ES) and genetic algorithms (GA), can scale to train deep neural networks to solve difficult reinforcement learning (RL) problems. However, it remains a challenge to analyze and interpret the underlying process of neuroevolution in such high dimensions. To begin to address this chall… ▽ More Recent advances in deep neuroevolution have demonstrated that evolutionary algorithms, such as evolution strategies (ES) and genetic algorithms (GA), can scale to train deep neural networks to solve difficult reinforcement learning (RL) problems. However, it remains a challenge to analyze and interpret the underlying process of neuroevolution in such high dimensions. To begin to address this challenge, this paper presents an interactive data visualization tool called VINE (Visual Inspector for NeuroEvolution) aimed at hel** neuroevolution researchers and end-users better understand and explore this family of algorithms. VINE works seamlessly with a breadth of neuroevolution algorithms, including ES and GA, and addresses the difficulty of observing the underlying dynamics of the learning process through an interactive visualization of the evolving agent's behavior characterizations over generations. As neuroevolution scales to neural networks with millions or more connections, visualization tools like VINE that offer fresh insight into the underlying dynamics of evolution become increasingly valuable and important for inspiring new innovations and applications. △ Less

Submitted 3 May, 2018; originally announced May 2018.

arXiv:1804.02464 [pdf, other]

Differentiable plasticity: training plastic neural networks with backpropagation

Authors: Thomas Miconi, Jeff Clune, Kenneth O. Stanley

Abstract: How can we build agents that keep learning from experience, quickly and efficiently, after their initial training? Here we take inspiration from the main mechanism of learning in biological brains: synaptic plasticity, carefully tuned by evolution to produce efficient lifelong learning. We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions o… ▽ More How can we build agents that keep learning from experience, quickly and efficiently, after their initial training? Here we take inspiration from the main mechanism of learning in biological brains: synaptic plasticity, carefully tuned by evolution to produce efficient lifelong learning. We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections. First, recurrent plastic networks with more than two million parameters can be trained to memorize and reconstruct sets of novel, high-dimensional 1000+ pixels natural images not seen during training. Crucially, traditional non-plastic recurrent networks fail to solve this task. Furthermore, trained plastic networks can also solve generic meta-learning tasks such as the Omniglot task, with competitive results and little parameter overhead. Finally, in reinforcement learning settings, plastic networks outperform a non-plastic equivalent in a maze exploration task. We conclude that differentiable plasticity may provide a powerful novel approach to the learning-to-learn problem. △ Less

Submitted 31 July, 2018; v1 submitted 6 April, 2018; originally announced April 2018.

Comments: Presented at ICML 2018

Journal ref: Proceedings of the 35th International Conference on Machine Learning (ICML2018), Stockholm, Sweden, PMLR 80, 2018

arXiv:1803.03453 [pdf, other]

The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities

Authors: Joel Lehman, Jeff Clune, Dusan Misevic, Christoph Adami, Lee Altenberg, Julie Beaulieu, Peter J. Bentley, Samuel Bernard, Guillaume Beslon, David M. Bryson, Patryk Chrabaszcz, Nick Cheney, Antoine Cully, Stephane Doncieux, Fred C. Dyer, Kai Olav Ellefsen, Robert Feldt, Stephan Fischer, Stephanie Forrest, Antoine Frénoy, Christian Gagné, Leni Le Goff, Laura M. Grabowski, Babak Hodjat, Frank Hutter , et al. (28 additional authors not shown)

Abstract: Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution's creativity is not limited to nature. Indeed, many researchers in the field of digital evolution have observed their evolving algorithms and organisms su… ▽ More Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution's creativity is not limited to nature. Indeed, many researchers in the field of digital evolution have observed their evolving algorithms and organisms subverting their intentions, exposing unrecognized bugs in their code, producing unexpected adaptations, or exhibiting outcomes uncannily convergent with ones in nature. Such stories routinely reveal creativity by evolution in these digital worlds, but they rarely fit into the standard scientific narrative. Instead they are often treated as mere obstacles to be overcome, rather than results that warrant study in their own right. The stories themselves are traded among researchers through oral tradition, but that mode of information transmission is inefficient and prone to error and outright loss. Moreover, the fact that these stories tend to be shared only among practitioners means that many natural scientists do not realize how interesting and lifelike digital organisms are and how natural their evolution can be. To our knowledge, no collection of such anecdotes has been published before. This paper is the crowd-sourced product of researchers in the fields of artificial life and evolutionary computation who have provided first-hand accounts of such cases. It thus serves as a written, fact-checked collection of scientifically important and even entertaining stories. In doing so we also present here substantial evidence that the existence and importance of evolutionary surprises extends beyond the natural world, and may indeed be a universal property of all complex evolving systems. △ Less

Submitted 21 November, 2019; v1 submitted 9 March, 2018; originally announced March 2018.

arXiv:1712.06568 [pdf, other]

ES Is More Than Just a Traditional Finite-Difference Approximator

Authors: Joel Lehman, Jay Chen, Jeff Clune, Kenneth O. Stanley

Abstract: An evolution strategy (ES) variant based on a simplification of a natural evolution strategy recently attracted attention because it performs surprisingly well in challenging deep reinforcement learning domains. It searches for neural network parameters by generating perturbations to the current set of parameters, checking their performance, and moving in the aggregate direction of higher reward.… ▽ More An evolution strategy (ES) variant based on a simplification of a natural evolution strategy recently attracted attention because it performs surprisingly well in challenging deep reinforcement learning domains. It searches for neural network parameters by generating perturbations to the current set of parameters, checking their performance, and moving in the aggregate direction of higher reward. Because it resembles a traditional finite-difference approximation of the reward gradient, it can naturally be confused with one. However, this ES optimizes for a different gradient than just reward: It optimizes for the average reward of the entire population, thereby seeking parameters that are robust to perturbation. This difference can channel ES into distinct areas of the search space relative to gradient descent, and also consequently to networks with distinct properties. This unique robustness-seeking property, and its consequences for optimization, are demonstrated in several domains. They include humanoid locomotion, where networks from policy gradient-based reinforcement learning are significantly less robust to parameter perturbation than ES-based policies solving the same task. While the implications of such robustness and robustness-seeking remain open to further study, this work's main contribution is to highlight such differences and their potential importance. △ Less

Submitted 1 May, 2018; v1 submitted 18 December, 2017; originally announced December 2017.

arXiv:1712.06567 [pdf, other]

Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

Authors: Felipe Petroski Such, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Abstract: Deep artificial neural networks (DNNs) are typically trained via gradient-based learning algorithms, namely backpropagation. Evolution strategies (ES) can rival backprop-based algorithms such as Q-learning and policy gradients on challenging deep reinforcement learning (RL) problems. However, ES can be considered a gradient-based algorithm because it performs stochastic gradient descent via an ope… ▽ More Deep artificial neural networks (DNNs) are typically trained via gradient-based learning algorithms, namely backpropagation. Evolution strategies (ES) can rival backprop-based algorithms such as Q-learning and policy gradients on challenging deep reinforcement learning (RL) problems. However, ES can be considered a gradient-based algorithm because it performs stochastic gradient descent via an operation similar to a finite-difference approximation of the gradient. That raises the question of whether non-gradient-based evolutionary algorithms can work at DNN scales. Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid locomotion. The Deep GA successfully evolves networks with over four million free parameters, the largest neural networks ever evolved with a traditional evolutionary algorithm. These results (1) expand our sense of the scale at which GAs can operate, (2) suggest intriguingly that in some cases following the gradient is not the best choice for optimizing performance, and (3) make immediately available the multitude of neuroevolution techniques that improve performance. We demonstrate the latter by showing that combining DNNs with novelty search, which encourages exploration on tasks with deceptive or sparse reward functions, can solve a high-dimensional problem on which reward-maximizing algorithms (e.g.\ DQN, A3C, ES, and the GA) fail. Additionally, the Deep GA is faster than ES, A3C, and DQN (it can train Atari in ${\raise.17ex\hbox{$\scriptstyle\sim$}}$4 hours on one desktop or ${\raise.17ex\hbox{$\scriptstyle\sim$}}$1 hour distributed on 720 cores), and enables a state-of-the-art, up to 10,000-fold compact encoding technique. △ Less

Submitted 20 April, 2018; v1 submitted 18 December, 2017; originally announced December 2017.

arXiv:1712.06564 [pdf, other]

On the Relationship Between the OpenAI Evolution Strategy and Stochastic Gradient Descent

Authors: Xingwen Zhang, Jeff Clune, Kenneth O. Stanley

Abstract: Because stochastic gradient descent (SGD) has shown promise optimizing neural networks with millions of parameters and few if any alternatives are known to exist, it has moved to the heart of leading approaches to reinforcement learning (RL). For that reason, the recent result from OpenAI showing that a particular kind of evolution strategy (ES) can rival the performance of SGD-based deep RL metho… ▽ More Because stochastic gradient descent (SGD) has shown promise optimizing neural networks with millions of parameters and few if any alternatives are known to exist, it has moved to the heart of leading approaches to reinforcement learning (RL). For that reason, the recent result from OpenAI showing that a particular kind of evolution strategy (ES) can rival the performance of SGD-based deep RL methods with large neural networks provoked surprise. This result is difficult to interpret in part because of the lingering ambiguity on how ES actually relates to SGD. The aim of this paper is to significantly reduce this ambiguity through a series of MNIST-based experiments designed to uncover their relationship. As a simple supervised problem without domain noise (unlike in most RL), MNIST makes it possible (1) to measure the correlation between gradients computed by ES and SGD and (2) then to develop an SGD-based proxy that accurately predicts the performance of different ES population sizes. These innovations give a new level of insight into the real capabilities of ES, and lead also to some unconventional means for applying ES to supervised problems that shed further light on its differences from SGD. Incorporating these lessons, the paper concludes by demonstrating that ES can achieve 99% accuracy on MNIST, a number higher than any previously published result for any evolutionary method. While not by any means suggesting that ES should substitute for SGD in supervised learning, the suite of experiments herein enables more informed decisions on the application of ES within RL and other paradigms. △ Less

Submitted 18 December, 2017; originally announced December 2017.

arXiv:1712.06563 [pdf, other]

Safe Mutations for Deep and Recurrent Neural Networks through Output Gradients

Authors: Joel Lehman, Jay Chen, Jeff Clune, Kenneth O. Stanley

Abstract: While neuroevolution (evolving neural networks) has a successful track record across a variety of domains from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks. A central reason is that while random mutation generally works in low dimensions, a random perturbation of thousands or millions of weights is likely to break existing functionality, providing… ▽ More While neuroevolution (evolving neural networks) has a successful track record across a variety of domains from reinforcement learning to artificial life, it is rarely applied to large, deep neural networks. A central reason is that while random mutation generally works in low dimensions, a random perturbation of thousands or millions of weights is likely to break existing functionality, providing no learning signal even if some individual weight changes were beneficial. This paper proposes a solution by introducing a family of safe mutation (SM) operators that aim within the mutation operator itself to find a degree of change that does not alter network behavior too much, but still facilitates exploration. Importantly, these SM operators do not require any additional interactions with the environment. The most effective SM variant capitalizes on the intriguing opportunity to scale the degree of mutation of each individual weight according to the sensitivity of the network's outputs to that weight, which requires computing the gradient of outputs with respect to the weights (instead of the gradient of error, as in conventional deep learning). This safe mutation through gradients (SM-G) operator dramatically increases the ability of a simple genetic algorithm-based neuroevolution method to find solutions in high-dimensional domains that require deep and/or recurrent neural networks (which tend to be particularly brittle to mutation), including domains that require processing raw pixels. By improving our ability to evolve deep neural networks, this new safer approach to mutation expands the scope of domains amenable to neuroevolution. △ Less

Submitted 1 May, 2018; v1 submitted 18 December, 2017; originally announced December 2017.

arXiv:1712.06560 [pdf, other]

Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents

Authors: Edoardo Conti, Vashisht Madhavan, Felipe Petroski Such, Joel Lehman, Kenneth O. Stanley, Jeff Clune

Abstract: Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because they parallelize better. However, many RL problems require directed exploration because they have reward functions that are… ▽ More Evolution strategies (ES) are a family of black-box optimization algorithms able to train deep neural networks roughly as well as Q-learning and policy gradient methods on challenging deep reinforcement learning (RL) problems, but are much faster (e.g. hours vs. days) because they parallelize better. However, many RL problems require directed exploration because they have reward functions that are sparse or deceptive (i.e. contain local optima), and it is unknown how to encourage such exploration with ES. Here we show that algorithms that have been invented to promote directed exploration in small-scale evolved neural networks via populations of exploring agents, specifically novelty search (NS) and quality diversity (QD) algorithms, can be hybridized with ES to improve its performance on sparse or deceptive deep RL tasks, while retaining scalability. Our experiments confirm that the resultant new algorithms, NS-ES and two QD algorithms, NSR-ES and NSRA-ES, avoid local optima encountered by ES to achieve higher performance on Atari and simulated robots learning to walk around a deceptive trap. This paper thus introduces a family of fast, scalable algorithms for reinforcement learning that are capable of directed exploration. It also adds this new family of exploration algorithms to the RL toolbox and raises the interesting possibility that analogous algorithms with multiple simultaneous paths of exploration might also combine well with existing RL algorithms outside ES. △ Less

Submitted 29 October, 2018; v1 submitted 18 December, 2017; originally announced December 2017.

arXiv:1704.05143 [pdf, other]

doi 10.1162/artl_a_00263

The Emergence of Canalization and Evolvability in an Open-Ended, Interactive Evolutionary System

Authors: Joost Huizinga, Kenneth O. Stanley, Jeff Clune

Abstract: Natural evolution has produced a tremendous diversity of functional organisms. Many believe an essential component of this process was the evolution of evolvability, whereby evolution speeds up its ability to innovate by generating a more adaptive pool of offspring. One hypothesized mechanism for evolvability is developmental canalization, wherein certain dimensions of variation become more likely… ▽ More Natural evolution has produced a tremendous diversity of functional organisms. Many believe an essential component of this process was the evolution of evolvability, whereby evolution speeds up its ability to innovate by generating a more adaptive pool of offspring. One hypothesized mechanism for evolvability is developmental canalization, wherein certain dimensions of variation become more likely to be traversed and others are prevented from being explored (e.g. offspring tend to have similarly sized legs, and mutations affect the length of both legs, not each leg individually). While ubiquitous in nature, canalization almost never evolves in computational simulations of evolution. Not only does that deprive us of in silico models in which to study the evolution of evolvability, but it also raises the question of which conditions give rise to this form of evolvability. Answering this question would shed light on why such evolvability emerged naturally and could accelerate engineering efforts to harness evolution to solve important engineering challenges. In this paper we reveal a unique system in which canalization did emerge in computational evolution. We document that genomes entrench certain dimensions of variation that were frequently explored during their evolutionary history. The genetic representation of these organisms also evolved to be highly modular and hierarchical, and we show that these organizational properties correlate with increased fitness. Interestingly, the type of computational evolutionary experiment that produced this evolvability was very different from traditional digital evolution in that there was no objective, suggesting that open-ended, divergent evolutionary processes may be necessary for the evolution of evolvability. △ Less

Submitted 14 February, 2019; v1 submitted 17 April, 2017; originally announced April 2017.

Comments: SI can be found at: http://www.evolvingai.org/files/SI_0.zip

Journal ref: Artificial life, 24(3), pp.157-181 (2018)

arXiv:1703.10371 [pdf, other]

doi 10.1016/j.neunet.2018.07.013

Born to Learn: the Inspiration, Progress, and Future of Evolved Plastic Artificial Neural Networks

Authors: Andrea Soltoggio, Kenneth O. Stanley, Sebastian Risi

Abstract: Biological plastic neural networks are systems of extraordinary computational capabilities shaped by evolution, development, and lifetime learning. The interplay of these elements leads to the emergence of adaptive behavior and intelligence. Inspired by such intricate natural phenomena, Evolved Plastic Artificial Neural Networks (EPANNs) use simulated evolution in-silico to breed plastic neural ne… ▽ More Biological plastic neural networks are systems of extraordinary computational capabilities shaped by evolution, development, and lifetime learning. The interplay of these elements leads to the emergence of adaptive behavior and intelligence. Inspired by such intricate natural phenomena, Evolved Plastic Artificial Neural Networks (EPANNs) use simulated evolution in-silico to breed plastic neural networks with a large variety of dynamics, architectures, and plasticity rules: these artificial systems are composed of inputs, outputs, and plastic components that change in response to experiences in an environment. These systems may autonomously discover novel adaptive algorithms, and lead to hypotheses on the emergence of biological adaptation. EPANNs have seen considerable progress over the last two decades. Current scientific and technological advances in artificial neural networks are now setting the conditions for radically new approaches and results. In particular, the limitations of hand-designed networks could be overcome by more flexible and innovative solutions. This paper brings together a variety of inspiring ideas that define the field of EPANNs. The main methods and results are reviewed. Finally, new opportunities and developments are presented. △ Less

Submitted 8 August, 2018; v1 submitted 30 March, 2017; originally announced March 2017.

Journal ref: Neural Networks, 2018

arXiv:1609.02226 [pdf, other]

Fitted Learning: Models with Awareness of their Limits

Authors: Navid Kardan, Kenneth O. Stanley

Abstract: Though deep learning has pushed the boundaries of classification forward, in recent years hints of the limits of standard classification have begun to emerge. Problems such as fooling, adding new classes over time, and the need to retrain learning models only for small changes to the original problem all point to a potential shortcoming in the classic classification regime, where a comprehensive a… ▽ More Though deep learning has pushed the boundaries of classification forward, in recent years hints of the limits of standard classification have begun to emerge. Problems such as fooling, adding new classes over time, and the need to retrain learning models only for small changes to the original problem all point to a potential shortcoming in the classic classification regime, where a comprehensive a priori knowledge of the possible classes or concepts is critical. Without such knowledge, classifiers misjudge the limits of their knowledge and overgeneralization therefore becomes a serious obstacle to consistent performance. In response to these challenges, this paper extends the classic regime by reframing classification instead with the assumption that concepts present in the training set are only a sample of the hypothetical final set of concepts. To bring learning models into this new paradigm, a novel elaboration of standard architectures called the competitive overcomplete output layer (COOL) neural network is introduced. Experiments demonstrate the effectiveness of COOL by applying it to fooling, separable concept learning, one-class neural networks, and standard classification benchmarks. The results suggest that, unlike conventional classifiers, the amount of generalization in COOL networks can be tuned to match the problem. △ Less

Submitted 9 July, 2018; v1 submitted 7 September, 2016; originally announced September 2016.

arXiv:1407.3000 [pdf, ps, other]

A Proposed Infrastructure for Adding Online Interaction to Any Evolutionary Domain

Authors: Paul Szerlip, Kenneth O. Stanley

Abstract: To address the difficulty of creating online collaborative evolutionary systems, this paper presents a new prototype library called Worldwide Infrastructure for Neuroevolution (WIN) and its accompanying site WIN Online (http://winark.org/). The WIN library is a collection of software packages built on top of Node.js that reduce the complexity of creating fully persistent, online, and interactive (… ▽ More To address the difficulty of creating online collaborative evolutionary systems, this paper presents a new prototype library called Worldwide Infrastructure for Neuroevolution (WIN) and its accompanying site WIN Online (http://winark.org/). The WIN library is a collection of software packages built on top of Node.js that reduce the complexity of creating fully persistent, online, and interactive (or automated) evolutionary platforms around any domain. WIN Online is the public interface for WIN, providing an online collection of domains built with the WIN library that lets novice and expert users browse and meaningfully contribute to ongoing experiments. The long term goal of WIN is to make it trivial to connect any platform to the world, providing both a stream of online users, and archives of data and discoveries for later extension by humans or computers. △ Less

Submitted 10 July, 2014; originally announced July 2014.

Comments: Presented at WebAL-1: Workshop on Artificial Life and the Web 2014 (arXiv:1406.2507)

Report number: WebAL1/2014/05

arXiv:1406.1833 [pdf, other]

Unsupervised Feature Learning through Divergent Discriminative Feature Accumulation

Authors: Paul A. Szerlip, Gregory Morse, Justin K. Pugh, Kenneth O. Stanley

Abstract: Unlike unsupervised approaches such as autoencoders that learn to reconstruct their inputs, this paper introduces an alternative approach to unsupervised feature learning called divergent discriminative feature accumulation (DDFA) that instead continually accumulates features that make novel discriminations among the training set. Thus DDFA features are inherently discriminative from the start eve… ▽ More Unlike unsupervised approaches such as autoencoders that learn to reconstruct their inputs, this paper introduces an alternative approach to unsupervised feature learning called divergent discriminative feature accumulation (DDFA) that instead continually accumulates features that make novel discriminations among the training set. Thus DDFA features are inherently discriminative from the start even though they are trained without knowledge of the ultimate classification problem. Interestingly, DDFA also continues to add new features indefinitely (so it does not depend on a hidden layer size), is not based on minimizing error, and is inherently divergent instead of convergent, thereby providing a unique direction of research for unsupervised feature learning. In this paper the quality of its learned features is demonstrated on the MNIST dataset, where its performance confirms that indeed DDFA is a viable technique for learning useful features. △ Less

Submitted 9 June, 2014; v1 submitted 6 June, 2014; originally announced June 2014.

Comments: Corrected citation formatting

arXiv:1302.1143 [pdf, other]

doi 10.1371/journal.pone.0062186

Evolvability Is Inevitable: Increasing Evolvability Without the Pressure to Adapt

Authors: Joel Lehman, Kenneth O. Stanley

Abstract: Why evolvability appears to have increased over evolutionary time is an important unresolved biological question. Unlike most candidate explanations, this paper proposes that increasing evolvability can result without any pressure to adapt. The insight is that if evolvability is heritable, then an unbiased drifting process across genotypes can still create a distribution of phenotypes biased towar… ▽ More Why evolvability appears to have increased over evolutionary time is an important unresolved biological question. Unlike most candidate explanations, this paper proposes that increasing evolvability can result without any pressure to adapt. The insight is that if evolvability is heritable, then an unbiased drifting process across genotypes can still create a distribution of phenotypes biased towards evolvability, because evolvable organisms diffuse more quickly through the space of possible phenotypes. Furthermore, because phenotypic divergence often correlates with founding niches, niche founders may on average be more evolvable, which through population growth provides a genotypic bias towards evolvability. Interestingly, the combination of these two mechanisms can lead to increasing evolvability without any pressure to out-compete other organisms, as demonstrated through experiments with a series of simulated models. Thus rather than from pressure to adapt, evolvability may inevitably result from any drift through genotypic space combined with evolution's passive tendency to accumulate niches. △ Less

Submitted 5 February, 2013; originally announced February 2013.

arXiv:1207.6682 [pdf, other]

Exploring Promising Step** Stones by Combining Novelty Search with Interactive Evolution

Authors: Brian G. Woolley, Kenneth O. Stanley

Abstract: The field of evolutionary computation is inspired by the achievements of natural evolution, in which there is no final objective. Yet the pursuit of objectives is ubiquitous in simulated evolution. A significant problem is that objective approaches assume that intermediate step** stones will increasingly resemble the final objective when in fact they often do not. The consequence is that while s… ▽ More The field of evolutionary computation is inspired by the achievements of natural evolution, in which there is no final objective. Yet the pursuit of objectives is ubiquitous in simulated evolution. A significant problem is that objective approaches assume that intermediate step** stones will increasingly resemble the final objective when in fact they often do not. The consequence is that while solutions may exist, searching for such objectives may not discover them. This paper highlights the importance of leveraging human insight during search as an alternative to articulating explicit objectives. In particular, a new approach called novelty-assisted interactive evolutionary computation (NA-IEC) combines human intuition with novelty search for the first time to facilitate the serendipitous discovery of agent behaviors. In this approach, the human user directs evolution by selecting what is interesting from the on-screen population of behaviors. However, unlike in typical IEC, the user can now request that the next generation be filled with novel descendants. The experimental results demonstrate that combining human insight with novelty search finds solutions significantly faster and at lower genomic complexities than fully-automated processes, including pure novelty search, suggesting an important role for human users in the search for solutions. △ Less

Submitted 27 July, 2012; originally announced July 2012.

Comments: 15 pages, 7 figures

ACM Class: I.2.6

arXiv:1107.0037 [pdf, ps]

doi 10.1613/jair.1338

Competitive Coevolution through Evolutionary Complexification

Authors: R. Miikkulainen, K. O. Stanley

Abstract: Two major goals in machine learning are the discovery and improvement of solutions to complex problems. In this paper, we argue that complexification, i.e. the incremental elaboration of solutions through adding new structure, achieves both these goals. We demonstrate the power of complexification through the NeuroEvolution of Augmenting Topologies (NEAT) method, which evolves increasingly compl… ▽ More Two major goals in machine learning are the discovery and improvement of solutions to complex problems. In this paper, we argue that complexification, i.e. the incremental elaboration of solutions through adding new structure, achieves both these goals. We demonstrate the power of complexification through the NeuroEvolution of Augmenting Topologies (NEAT) method, which evolves increasingly complex neural network architectures. NEAT is applied to an open-ended coevolutionary robot duel domain where robot controllers compete head to head. Because the robot duel domain supports a wide range of strategies, and because coevolution benefits from an escalating arms race, it serves as a suitable testbed for studying complexification. When compared to the evolution of networks with fixed structure, complexifying evolution discovers significantly more sophisticated strategies. The results suggest that in order to discover and improve complex solutions, evolution, and search in general, should be allowed to complexify as well as optimize. △ Less

Submitted 30 June, 2011; originally announced July 2011.

Journal ref: Journal Of Artificial Intelligence Research, Volume 21, pages 63-100, 2004

Showing 1–31 of 31 results for author: Stanley, K O