Search | arXiv e-print repository

"Give Me an Example Like This": Episodic Active Reinforcement Learning from Demonstrations

Authors: Muhan Hou, Koen Hindriks, A. E. Eiben, Kim Baraka

Abstract: Reinforcement Learning (RL) has achieved great success in sequential decision-making problems, but often at the cost of a large number of agent-environment interactions. To improve sample efficiency, methods like Reinforcement Learning from Expert Demonstrations (RLED) introduce external expert demonstrations to facilitate agent exploration during the learning process. In practice, these demonstrations, which are often collected from human users, are costly and hence often constrained to a limited amount. How to select the best set of human demonstrations that is most beneficial for learning therefore becomes a major concern. This paper presents EARLY (Episodic Active Learning from demonstration querY), an algorithm that enables a learning agent to generate optimized queries of expert demonstrations in a trajectory-based feature space. Based on a trajectory-level estimate of uncertainty in the agent's current policy, EARLY determines the optimized timing and content for feature-based queries. By querying episodic demonstrations as opposed to isolated state-action pairs, EARLY improves the human teaching experience and achieves better learning performance. We validate the effectiveness of our method in three simulated navigation tasks of increasing difficulty. The results show that our method is able to achieve expert-level performance for all three tasks with convergence over 30% faster than other baseline methods when demonstrations are generated by simulated oracle policies. The results of a follow-up pilot user study (N=18) further validate that our method can still maintain a significantly better convergence in the case of human expert demonstrators while achieving a better user experience in perceived task load and consuming significantly less human time.

Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

arXiv:2403.19545

Lamarckian Inheritance Improves Robot Evolution in Dynamic Environments

Authors: Jie Luo, Karine Miras, Carlo Longhi, Oliver Weissl, Agoston E. Eiben

Abstract: This study explores the integration of a Lamarckian system into evolutionary robotics (ER), comparing it with the traditional Darwinian model across various environments. By adopting Lamarckian principles, where robots inherit learned traits, alongside Darwinian learning without inheritance, we investigate adaptation in dynamic settings. Our research, conducted in six distinct environmental setups, demonstrates that Lamarckian systems outperform Darwinian ones in adaptability and efficiency, particularly in challenging conditions. Our analysis highlights the critical role of the interplay between controller & morphological evolution and environment adaptation, with parent-offspring similarities and newborn & survivors before and after learning providing insights into the effectiveness of trait inheritance. Our findings suggest Lamarckian principles could significantly advance autonomous system design, highlighting the potential for more adaptable and robust robotic solutions in complex, real-world applications. These theoretical insights were validated using real physical robots, bridging the gap between simulation and practical application.

Submitted 28 March, 2024; originally announced March 2024.

Comments: Nature. arXiv admin note: substantial text overlap with arXiv:2309.13099; text overlap with arXiv:2303.12594, arXiv:2309.14387

arXiv:2402.04763

Emergence of specialized Collective Behaviors in Evolving Heterogeneous Swarms

Authors: Fuda van Diggelen, Matteo De Carlo, Nicolas Cambier, Eliseo Ferrante, A. E. Eiben

Abstract: Natural groups of animals, such as swarms of social insects, exhibit astonishing degrees of task specialization, useful to address complex tasks and to survive. This is supported by phenotypic plasticity: individuals sharing the same genotype that is expressed differently for different classes of individuals, each specializing in one task. In this work, we evolve a swarm of simulated robots with phenotypic plasticity to study the emergence of specialized collective behavior during an emergent perception task. Phenotypic plasticity is realized in the form of heterogeneity of behavior by dividing the genotype into two components, with one different neural network controller associated to each component. The whole genotype, expressing the behavior of the whole group through the two components, is subject to evolution with a single fitness function. We analyse the obtained behaviors and use the insights provided by these results to design an online regulatory mechanism. Our experiments show three main findings: 1) The sub-groups evolve distinct emergent behaviors. 2) The effectiveness of the whole swarm depends on the interaction between the two sub-groups, leading to a more robust performance than with singular sub-group behavior. 3) The online regulatory mechanism enhances overall performance and scalability.

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2309.13908

A comparison of controller architectures and learning mechanisms for arbitrary robot morphologies

Authors: Jie Luo, Jakub Tomczak, Karine Miras, Agoston E. Eiben

Abstract: The main question this paper addresses is: What combination of a robot controller and a learning method should be used, if the morphology of the learning robot is not known in advance? Our interest is rooted in the context of morphologically evolving modular robots, but the question is also relevant in general, for system designers interested in widely applicable solutions. We perform an experimental comparison of three controller-and-learner combinations: one approach where controllers are based on modelling animal locomotion (Central Pattern Generators, CPG) and the learner is an evolutionary algorithm, a completely different method using Reinforcement Learning (RL) with a neural network controller architecture, and a combination 'in-between' where controllers are neural networks and the learner is an evolutionary algorithm. We apply these three combinations to a test suite of modular robots and compare their efficacy, efficiency, and robustness. Surprisingly, the usual CPG-based and RL-based options are outperformed by the in-between combination that is more robust and efficient than the other two setups.

Submitted 25 September, 2023; originally announced September 2023.

arXiv:2309.13099

Lamarck's Revenge: Inheritance of Learned Traits Can Make Robot Evolution Better

Authors: Jie Luo, Karine Miras, Jakub Tomczak, Agoston E. Eiben

Abstract: Evolutionary robot systems offer two principal advantages: an advanced way of developing robots through evolutionary optimization and a special research platform to conduct what-if experiments regarding questions about evolution. Our study sits at the intersection of these. We investigate the question "What if the 18th-century biologist Lamarck was not completely wrong and individual traits learned during a lifetime could be passed on to offspring through inheritance?" We research this issue through simulations with an evolutionary robot framework where morphologies (bodies) and controllers (brains) of robots are evolvable and robots also can improve their controllers through learning during their lifetime. Within this framework, we compare a Lamarckian system, where learned bits of the brain are inheritable, with a Darwinian system, where they are not. Analyzing simulations based on these systems, we obtain new insights about Lamarckian evolution dynamics and the interaction between evolution and learning. Specifically, we show that Lamarckism amplifies the emergence of 'morphological intelligence', the ability of a given robot body to acquire a good brain by learning, and identify the source of this success: 'newborn' robots have a higher fitness because their inherited brains match their bodies better than those in a Darwinian system.

Submitted 22 September, 2023; originally announced September 2023.

Comments: preprint-nature scientific report. arXiv admin note: text overlap with arXiv:2303.12594

arXiv:2303.12594

A Comparative Study of Brain Reproduction Methods for Morphologically Evolving Robots

Authors: Jie Luo, Carlo Longhi, Agoston E. Eiben

Abstract: In the most extensive robot evolution systems, both the bodies and the brains of the robots undergo evolution and the brains of 'infant' robots are also optimized by a learning process immediately after 'birth'. This paper is concerned with the brain evolution mechanism in such a system. In particular, we compare four options obtained by combining asexual or sexual brain reproduction with Darwinian or Lamarckian evolution mechanisms. We conduct experiments in simulation with a system of evolvable modular robots on two different tasks. The results show that sexual reproduction of the robots' brains is preferable in the Darwinian framework, but the effect is the opposite in the Lamarckian system (both using the same infant learning method). Our experiments suggest that the overall best option is asexual reproduction combined with the Lamarckian framework, as it obtains better robots in terms of fitness than the other three. Considering the evolved morphologies, the different brain reproduction methods do not lead to differences. This result indicates that the morphology of the robot is mainly determined by the task and the environment, not by the brain reproduction methods.

Submitted 30 May, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

Comments: 8 pages, ALife

arXiv:2203.11585

Environment induced emergence of collective behaviour in evolving swarms with limited sensing

Authors: Fuda van Diggelen, Jie Luo, Tugay Alperen Karagüzel, Nicolas Cambier, Eliseo Ferrante, A. E. Eiben

Abstract: Designing controllers for robot swarms is challenging, because human developers have typically no good understanding of the link between the details of a controller that governs individual robots and the swarm behavior that is an indirect result of the interactions between swarm members and the environment. In this paper we investigate whether an evolutionary approach can mitigate this problem. We consider a very challenging task where robots with limited sensing and communication abilities must follow the gradient of an environmental feature and use Differential Evolution to evolve a neural network controller for simulated robots. We conduct a systematic study to measure the flexibility and scalability of the method by varying the size of the arena and number of robots in the swarm. The experiments confirm the feasibility of our approach: the evolved robot controllers induced swarm behavior that solved the task. We found that solutions evolved under the harshest conditions (where the environmental clues were the weakest) were the most flexible and that there is a sweet spot regarding the swarm size. Furthermore, we observed collective motion of the swarm, showcasing truly emergent behavior that was not represented in- and selected for during evolution.

Submitted 11 April, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

Comments: (1) Three authors contributed equally to this research

arXiv:2203.03967

Comparing lifetime learning methods for morphologically evolving robots

Authors: Fuda van Diggelen, Eliseo Ferrante, A. E. Eiben

Abstract: Evolving morphologies and controllers of robots simultaneously leads to a problem: Even if the parents have well-matching bodies and brains, the stochastic recombination can break this match and cause a body-brain mismatch in their offspring. We argue that this can be mitigated by having newborn robots perform a learning process that optimizes their inherited brain quickly after birth. We compare three different algorithms for doing this. To this end, we consider three algorithmic properties: efficiency, efficacy, and the sensitivity to differences in the morphologies of the robots that run the learning process.

Submitted 8 March, 2022; originally announced March 2022.

Comments: Associated code: https://github.com/fudavd/revolve/tree/learning

MSC Class: 68T40 (primary); 68W50 68T05 (Secondary)

arXiv:2111.09851

The Effects of Learning in Morphologically Evolving Robot Systems

Authors: Jie Luo, Aart Stuurman, Jakub M. Tomczak, Jacintha Ellers, Agoston E. Eiben

Abstract: Simultaneously evolving morphologies (bodies) and controllers (brains) of robots can cause a mismatch between the inherited body and brain in the offspring. To mitigate this problem, the addition of an infant learning period by the so-called Triangle of Life framework has been proposed relatively long ago. However, an empirical assessment is still lacking to date. In this paper we investigate the effects of such a learning mechanism from different perspectives. Using extensive simulations we show that learning can greatly increase task performance and reduce the number of generations required to reach a certain fitness level compared to the purely evolutionary approach. Furthermore, although learning only directly affects the controllers, we demonstrate that the evolved morphologies will also be different. This provides a quantitative demonstration that changes in the brain can induce changes in the body. Finally, we examine the concept of morphological intelligence quantified by the ability of a given body to learn. We observe that the learning delta, the performance difference between the inherited and the learned brain, is growing throughout the evolutionary process. This shows that evolution is producing robots with an increasing plasticity, that is, consecutive generations are becoming better and better learners which in turn makes them better and better at the given task. All in all, our results demonstrate that the Triangle of Life is not only a concept of theoretical interest, but a system architecture with practical benefits.

Submitted 18 November, 2021; originally announced November 2021.

Comments: Frontiers in Robotics and AI. arXiv admin note: text overlap with arXiv:2107.08249

arXiv:2110.11187

Heritability in Morphological Robot Evolution

Authors: Matteo De Carlo, Eliseo Ferrante, Daan Zeeuwe, Jacintha Ellers, Gerben Meynen, A. E. Eiben

Abstract: In the field of evolutionary robotics, choosing the correct encoding is very complicated, especially when robots evolve both behaviours and morphologies at the same time. With the objective of improving our understanding of the mapping process from encodings to functional robots, we introduce the biological notion of heritability, which captures the amount of phenotypic variation caused by genotypic variation. In our analysis we measure the heritability on the first generation of robots evolved from two different encodings, a direct encoding and an indirect encoding. In addition we investigate the interplay between heritability and phenotypic diversity through the course of an entire evolutionary process. In particular, we investigate how direct and indirect genotypes can exhibit preferences for exploration or exploitation throughout the course of evolution. We observe how an exploration or exploitation tradeoff can be more easily understood by examining patterns in heritability and phenotypic diversity. In conclusion, we show how heritability can be a useful tool to better understand the relationship between genotypes and phenotypes, especially helpful when designing more complicated systems where complex individuals and environments can adapt and influence each other.

Submitted 21 October, 2021; originally announced October 2021.

arXiv:2107.08249

Gait-learning with morphologically evolving robots generated by L-system

Authors: Jie Luo, Daan Zeeuwe, Agoston E. Eiben

Abstract: When controllers (brains) and morphologies (bodies) of robots simultaneously evolve, this can lead to a problem, namely the brain & body mismatch problem. In this research, we propose a solution of lifetime learning. We set up a system where modular robots can create offspring that inherit the bodies of parents by recombination and mutation. With regards to the brains of the offspring, we use two methods to create them. The first one entails solely evolution which means the brain of a robot child is inherited from its parents. The second approach is evolution plus learning which means the brain of a child is inherited as well, but additionally is developed by a learning algorithm - RevDEknn. We compare these two methods by running experiments in a simulator called Revolve and use efficiency, efficacy, and the morphology intelligence of the robots for the comparison. The experiments show that the evolution plus learning method does not only lead to a higher fitness level, but also to more morphologically evolving robots. This constitutes a quantitative demonstration that changes in the brain can induce changes in the body, leading to the concept of morphological intelligence, which is quantified by the learning delta, meaning the ability of a morphology to facilitate learning.

Submitted 7 October, 2021; v1 submitted 17 July, 2021; originally announced July 2021.

Comments: 9 pages, 11 figures, IEEE SSCI conference

arXiv:2107.05249

Impact of Energy Efficiency on the Morphology and Behaviour of Evolved Robots

Authors: Margarita Rebolledo, Daan Zeeuwe, Thomas Bartz-Beielstein, A. E. Eiben

Abstract: Most evolutionary robotics studies focus on evolving some targeted behavior without taking the energy usage into account. This limits the practical value of such systems because energy efficiency is an important property for real-world autonomous robots. In this paper, we mitigate this problem by extending our simulator with a battery model and taking energy consumption into account during fitness evaluations. Using this system we investigate how energy awareness affects the evolution of robots. Since our system is to evolve morphologies as well as controllers, the main research question is twofold: (i) what is the impact on the morphologies of the evolved robots, and (ii) what is the impact on the behavior of the evolved robots if energy consumption is included in the fitness evaluation? The results show that including the energy consumption in the fitness in a multi-objective fashion (by NSGA-II) reduces the average size of robot bodies while at the same time reducing their speed. However, robots generated without size reduction can achieve speeds comparable to robots from the baseline set.

Submitted 12 July, 2021; originally announced July 2021.

ACM Class: I.2.9

arXiv:2105.07960

doi 10.1145/3449726.3463171

Behavior-based Neuroevolutionary Training in Reinforcement Learning

Authors: Jörg Stork, Martin Zaefferer, Nils Eisler, Patrick Tichelmann, Thomas Bartz-Beielstein, A. E. Eiben

Abstract: In addition to their undisputed success in solving classical optimization problems, neuroevolutionary and population-based algorithms have become an alternative to standard reinforcement learning methods. However, evolutionary methods often lack the sample efficiency of standard value-based methods that leverage gathered state and value experience. If reinforcement learning for real-world problems with significant resource cost is considered, sample efficiency is essential. The enhancement of evolutionary algorithms with experience exploiting methods is thus desired and promises valuable insights. This work presents a hybrid algorithm that combines topology-changing neuroevolutionary optimization with value-based reinforcement learning. We illustrate how the behavior of policies can be used to create distance and loss functions, which benefit from stored experiences and calculated state values. They allow us to model behavior and perform a directed search in the behavior space by gradient-free evolutionary algorithms and surrogate-based optimization. For this purpose, we consolidate different methods to generate and optimize agent policies, creating a diverse population. We exemplify the performance of our algorithm on standard benchmarks and a purpose-built real-world problem. Our results indicate that combining methods can enhance the sample efficiency and learning speed for evolutionary approaches.

Submitted 17 May, 2021; originally announced May 2021.

arXiv:2104.05610

A coevolutionary approach to deep multi-agent reinforcement learning

Authors: Daan Klijn, A. E. Eiben

Abstract: Traditionally, Deep Artificial Neural Networks (DNNs) are trained through gradient descent. Recent research shows that Deep Neuroevolution (DNE) is also capable of evolving multi-million-parameter DNNs, which proved to be particularly useful in the field of Reinforcement Learning (RL). This is mainly due to its excellent scalability and simplicity compared to the traditional MDP-based RL methods. So far, DNE has only been applied to complex single-agent problems. As evolutionary methods are a natural choice for multi-agent problems, the question arises whether DNE can also be applied in a complex multi-agent setting. In this paper, we describe and validate a new approach based on Coevolution. To validate our approach, we benchmark two Deep Coevolutionary Algorithms on a range of multi-agent Atari games and compare our results against the results of Ape-X DQN. Our results show that these Deep Coevolutionary algorithms (1) can be successfully trained to play various games, (2) outperform Ape-X DQN in some of them, and therefore (3) show that Coevolution can be a viable approach to solving complex multi-agent decision-making problems.

Submitted 13 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

arXiv:2104.04269

Morpho-evolution with learning using a controller archive as an inheritance mechanism

Authors: Léni K. Le Goff, Edgar Buchanan, Emma Hart, Agoston E. Eiben, Wei Li, Matteo De Carlo, Alan F. Winfield, Matthew F. Hale, Robert Woolley, Mike Angus, Jon Timmis, Andy M. Tyrrell

Abstract: The joint optimisation of body-plan and control via evolutionary processes can be challenging in rich morphological spaces in which offspring can have body-plans that are very different from either of their parents. This causes a potential mismatch between the structure of an inherited controller and the new body. To address this, we propose a framework that combines an evolutionary algorithm to generate body-plans and a learning algorithm to optimise the parameters of a neural controller. The topology of this controller is created once the body-plan of each offspring is generated. The key novelty of the approach is to add an external archive for storing learned controllers that map to explicit 'types' of robots (where this is defined with respect to the features of the body-plan). By learning from a controller with an appropriate structure inherited from the archive, rather than from a randomly initialised one, we show that both the speed and magnitude of learning increase over time when compared to an approach that starts from scratch, using two tasks and three environments. The framework also provides new insights into the complex interactions between evolution and learning.

Submitted 23 September, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

Comments: 15 pages including 2 pages of supplementary materials, 16 figures, 1 table. Currently under review for the special issue of IEEE TCDS on Towards autonomous evolution, (re)production and learning in robotic eco-systems. https://www.york.ac.uk/robot-lab/are/ieee_special_issue_2020/

arXiv:2103.15908

pH-RL: A personalization architecture to bring reinforcement learning to health practice

Authors: Ali el Hassouni, Mark Hoogendoorn, Marketa Ciharova, Annet Kleiboer, Khadicha Amarti, Vesa Muhonen, Heleen Riper, A. E. Eiben

Abstract: While reinforcement learning (RL) has proven to be the approach of choice for tackling many complex problems, it remains challenging to develop and deploy RL agents in real-life scenarios successfully. This paper presents pH-RL (personalization in e-Health with RL), a general RL architecture for personalization to bring RL to health practice. pH-RL allows for various levels of personalization in health applications and allows for online and batch learning. Furthermore, we provide a general-purpose implementation framework that can be integrated with various healthcare applications. We describe a step-by-step guideline for the successful deployment of RL policies in a mobile application. We implemented our open-source RL architecture and integrated it with the MoodBuster mobile application for mental health to provide messages to increase daily adherence to the online therapeutic modules. We then performed a comprehensive study with human participants over a sustained period. Our experimental results show that the developed policies learn to select appropriate actions consistently using only a few days' worth of data. Furthermore, we empirically demonstrate the stability of the learned policies during the study.

Submitted 30 March, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

arXiv:2012.06474 [pdf, other]

Generating Human-Like Movement: A Comparison Between Two Approaches Based on Environmental Features

Authors: A. Zonta, S. K. Smit, A. E. Eiben

Abstract: Modelling realistic human behaviours in simulation is an ongoing challenge that resides between several fields like social sciences, philosophy, and artificial intelligence. Human movement is a special type of behaviour driven by intent (e.g. to get groceries) and the surrounding environment (e.g. curiosity to see new interesting places). Services available online and offline do not normally consider the environment when planning a path, which is decisive especially on a leisure trip. Two novel algorithms have been presented to generate human-like trajectories based on environmental features. The Attraction-Based A* algorithm includes information from the environmental features in its computation, while the Feature-Based A* algorithm also injects information from the real trajectories. The human-likeness aspect has been tested by a human expert judging the final generated trajectories as realistic. This paper presents a comparison between the two approaches in some key metrics like efficiency, efficacy, and hyper-parameter sensitivity. We show how, despite generating trajectories that are closer to the real one according to our predefined metrics, the Feature-Based A* algorithm falls short in time efficiency compared to the Attraction-Based A* algorithm, hindering the usability of the model in the real world.

Submitted 11 December, 2020; originally announced December 2020.

Comments: 31 pages, 16 figures, submitted to Expert Systems with Applications

arXiv:2010.09531 [pdf, other]

Learning Locomotion Skills in Evolvable Robots

Authors: Gongjin Lan, Maarten van Hooft, Matteo De Carlo, Jakub M. Tomczak, A. E. Eiben

Abstract: The challenge of robotic reproduction -- making of new robots by recombining two existing ones -- has been recently cracked and physically evolving robot systems have come within reach. Here we address the next big hurdle: producing an adequate brain for a newborn robot. In particular, we address the task of targeted locomotion which is arguably a fundamental skill in any practical implementation. We introduce a controller architecture and a generic learning method to allow a modular robot with an arbitrary shape to learn to walk towards a target and follow this target if it moves. Our approach is validated on three robots, a spider, a gecko, and their offspring, in three real-world scenarios.

Submitted 19 October, 2020; originally announced October 2020.

Comments: 12 pages

arXiv:2010.06456 [pdf, other]

Population-based Optimization for Kinetic Parameter Identification in Glycolytic Pathway in Saccharomyces cerevisiae

Authors: Ewelina Weglarz-Tomczak, Jakub M. Tomczak, Agoston E. Eiben, Stanley Brul

Abstract: Models in systems biology are mathematical descriptions of biological processes that are used to answer questions and gain a better understanding of biological phenomena. Dynamic models represent the network through rates of the production and consumption for the individual species. The ordinary differential equations that describe rates of the reactions in the model include a set of parameters. The parameters are important quantities to understand and analyze biological systems. Moreover, perturbations of the kinetic parameters are correlated with upregulation of the system by cell-intrinsic and cell-extrinsic factors, including mutations and environmental changes. Here, we aim at using well-established models of biological pathways to identify parameter values and point out their potential perturbation/deviation. We present our population-based optimization framework that is able to identify kinetic parameters in the dynamic model based on only input and output data (i.e., timecourses of selected metabolites). Our approach can deal with the identification of the non-measurable parameters as well as with discovering deviation of the parameters. We present our proposed optimization framework on the example of the well-studied glycolytic pathway in Saccharomyces cerevisiae.

Submitted 19 September, 2020; originally announced October 2020.

Comments: Code at https://github.com/jmtomczak/popi

arXiv:2007.04725 [pdf, other]

EVO-RL: Evolutionary-Driven Reinforcement Learning

Authors: Ahmed Hallawa, Thorsten Born, Anke Schmeink, Guido Dartmann, Arne Peine, Lukas Martin, Giovanni Iacca, A. E. Eiben, Gerd Ascheid

Abstract: In this work, we propose a novel approach for reinforcement learning driven by evolutionary computation. Our algorithm, dubbed as Evolutionary-Driven Reinforcement Learning (evo-RL), embeds the reinforcement learning algorithm in an evolutionary cycle, where we distinctly differentiate between purely evolvable (instinctive) behaviour versus purely learnable behaviour. Furthermore, we propose that this distinction is decided by the evolutionary process, thus allowing evo-RL to be adaptive to different environments. In addition, evo-RL facilitates learning on environments with rewardless states, which makes it more suited for real-world problems with incomplete information. To show that evo-RL leads to state-of-the-art performance, we present the performance of different state-of-the-art reinforcement learning algorithms when operating within evo-RL and compare it with the case when these same algorithms are executed independently. Results show that reinforcement learning algorithms embedded within our evo-RL approach significantly outperform the stand-alone versions of the same RL algorithms on OpenAI Gym control problems with rewardless states constrained by the same computational budget.

Submitted 10 July, 2020; v1 submitted 9 July, 2020; originally announced July 2020.

Comments: 9 pages, 7 figures

arXiv:2006.00025 [pdf, other]

Environmental regulation using Plasticoding for the evolution of robots

Authors: Karine Miras, Eliseo Ferrante, A. E. Eiben

Abstract: Evolutionary robot systems are usually affected by the properties of the environment indirectly through selection. In this paper, we present and investigate a system where the environment also has a direct effect: through regulation. We propose a novel robot encoding method where a genotype encodes multiple possible phenotypes, and the incarnation of a robot depends on the environmental conditions taking place at a given moment of its life. This means that the morphology, controller, and behavior of a robot can change according to the environment. Importantly, this process of development can happen at any moment of a robot's lifetime, according to its experienced environmental stimuli. We provide an empirical proof-of-concept, and the analysis of the experimental results shows that Plasticoding improves adaptation (task performance) while leading to different evolved morphologies, controllers, and behaviour.

Submitted 29 May, 2020; originally announced June 2020.

Comments: This paper was submitted to the Frontiers in Robotics and AI journal on the 22/02/2020, and is still under review

arXiv:2005.04166 [pdf, other]

Time Efficiency in Optimization with a Bayesian-Evolutionary Algorithm

Authors: Gongjin Lan, Jakub M. Tomczak, Diederik M. Roijers, A. E. Eiben

Abstract: Not all generate-and-test search algorithms are created equal. Bayesian Optimization (BO) invests a lot of computation time to generate the candidate solution that best balances the predicted value and the uncertainty given all previous data, taking increasingly more time as the number of evaluations performed grows. Evolutionary Algorithms (EA) on the other hand rely on search heuristics that typically do not depend on all previous data and can be done in constant time. Both the BO and EA community typically assess their performance as a function of the number of evaluations. However, this is unfair once we start to compare the efficiency of these classes of algorithms, as the overhead times to generate candidate solutions are significantly different. We suggest measuring the efficiency of generate-and-test search algorithms as the expected gain in the objective value per unit of computation time spent. We observe that the preference of an algorithm to be used can change after a number of function evaluations. We therefore propose a new algorithm, a combination of Bayesian optimization and an Evolutionary Algorithm, BEA for short, that starts with BO, then transfers knowledge to an EA, and subsequently runs the EA. We compare the BEA with BO and the EA. The results show that BEA outperforms both BO and the EA in terms of time efficiency, and ultimately leads to better performance on well-known benchmark objective functions with many local optima. Moreover, we test the three algorithms on nine test cases of robot learning problems and here again we find that BEA outperforms the other algorithms.

Submitted 4 May, 2020; originally announced May 2020.

Comments: 13 pages, 10 Figures

arXiv:2002.02869 [pdf, other]

Differential Evolution with Reversible Linear Transformations

Authors: Jakub M. Tomczak, Ewelina Weglarz-Tomczak, Agoston E. Eiben

Abstract: Differential evolution (DE) is a well-known type of evolutionary algorithm (EA). Similarly to other EA variants, it can suffer from small populations and lose diversity too quickly. This paper presents a new approach to mitigate this issue: We propose to generate new candidate solutions by utilizing reversible linear transformation applied to a triplet of solutions from the population. In other words, the population is enlarged by using newly generated individuals without evaluating their fitness. We assess our methods on three problems: (i) benchmark function optimization, (ii) discovering parameter values of the gene repressilator system, (iii) learning neural networks. The empirical results indicate that the proposed approach outperforms vanilla DE and a version of DE with applying differential mutation three times on all testbeds.

Submitted 7 February, 2020; originally announced February 2020.

Comments: Code: https://github.com/jmtomczak
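
For orientation, the "vanilla DE" baseline named in the abstract above builds each new candidate from a triplet of population members. The following is a minimal sketch of the classic DE/rand/1 differential mutation only; it is not the paper's reversible linear transformation, and the function name, the `scale` default, and the list-of-lists population layout are illustrative assumptions.

```python
import random


def differential_mutation(population, i, scale=0.5, rng=random):
    """Classic DE/rand/1 mutation: x_a + scale * (x_b - x_c).

    `population` is a list of equal-length lists of floats; `i` is the
    index of the target individual, which is excluded from the triplet.
    Requires at least four individuals. Names and defaults are assumed.
    """
    candidates = [j for j in range(len(population)) if j != i]
    a, b, c = rng.sample(candidates, 3)  # three distinct donors
    xa, xb, xc = population[a], population[b], population[c]
    return [xa[d] + scale * (xb[d] - xc[d]) for d in range(len(xa))]
```

When every individual is identical, the difference term vanishes and the mutant equals the parents, which illustrates the diversity loss the abstract refers to.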

arXiv:2001.07804 [pdf]

Learning Directed Locomotion in Modular Robots with Evolvable Morphologies

Authors: Gongjin Lan, Matteo De Carlo, Fuda van Diggelen, Jakub M. Tomczak, Diederik M. Roijers, A. E. Eiben

Abstract: We generalize the well-studied problem of gait learning in modular robots in two dimensions. Firstly, we address locomotion in a given target direction that goes beyond learning a typical undirected gait. Secondly, rather than studying one fixed robot morphology we consider a test suite of different modular robots. This study is based on our interest in evolutionary robot systems where both morphologies and controllers evolve. In such a system, newborn robots have to learn to control their own body that is a random combination of the bodies of the parents. We apply and compare two learning algorithms, Bayesian optimization and HyperNEAT. The results of the experiments in simulation show that both methods successfully learn good controllers, but Bayesian optimization is more effective and efficient. We validate the best learned controllers by constructing three robots from the test suite in the real world and observe their fitness and actual trajectories. The obtained results indicate a reality gap that depends on the controllers and the shape of the robots, but overall the trajectories are adequate and follow the target directions successfully.

Submitted 21 January, 2020; originally announced January 2020.

Comments: 30 pages, 14 figures

arXiv:1912.10445 [pdf, other]

EvoMan: Game-playing Competition

Authors: Fabricio Olivetti de Franca, Denis Fantinato, Karine Miras, A. E. Eiben, Patricia A. Vargas

Abstract: This paper describes a competition proposal for evolving Intelligent Agents for the game-playing framework called EvoMan. The framework is based on the boss fights of the game called Mega Man II developed by Capcom. For this particular competition, the main goal is to beat all of the eight bosses using a generalist strategy. In other words, the competitors should train the agent to beat a set of the bosses and then the agent will be evaluated by its performance against all eight bosses. At the end of this paper, the competitors are provided with baseline results so that they can have an intuition on how good their results are.

Submitted 4 January, 2020; v1 submitted 22 December, 2019; originally announced December 2019.

arXiv:1907.09300 [pdf, other]

doi 10.1145/3321707.3321829

Surrogate Models for Enhancing the Efficiency of Neuroevolution in Reinforcement Learning

Authors: Jörg Stork, Martin Zaefferer, Thomas Bartz-Beielstein, A. E. Eiben

Abstract: In recent years, reinforcement learning has received a lot of attention. One method to solve reinforcement learning tasks is Neuroevolution, where neural networks are optimized by evolutionary algorithms. A disadvantage of Neuroevolution is that it can require numerous function evaluations, while not fully utilizing the available information from each fitness evaluation. This is especially problematic when fitness evaluations become expensive. To reduce the cost of fitness evaluations, surrogate models can be employed to partially replace the fitness function. The difficulty of surrogate modeling for Neuroevolution is the complex search space and how to compare different networks. To that end, recent studies showed that a kernel-based approach, particularly with phenotypic distance measures, works well. These kernels compare different networks via their behavior (phenotype) rather than their topology or encoding (genotype). In this work, we discuss the use of surrogate model-based Neuroevolution (SMB-NE) using a phenotypic distance for reinforcement learning. In detail, we investigate a) the potential of SMB-NE with respect to evaluation efficiency and b) how to select adequate input sets for the phenotypic distance measure in a reinforcement learning problem. The results indicate that we are able to considerably increase the evaluation efficiency using dynamic input sets.

Submitted 22 July, 2019; originally announced July 2019.

Comments: This is the authors version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Genetic and Evolutionary Computation Conference (GECCO 2019)

Journal ref: 2019, Genetic and Evolutionary Computation Conference (GECCO 2019), Prague, Czech Republic. ACM, New York, NY, USA

arXiv:1901.05704 [pdf, other]

doi 10.1038/s42256-018-0009-9

Evolving embodied intelligence from materials to machines

Authors: David Howard, Agoston E. Eiben, Danielle Frances Kennedy, Jean-Baptiste Mouret, Philip Valencia, Dave Winkler

Abstract: Natural lifeforms specialise to their environmental niches across many levels; from low-level features such as DNA and proteins, through to higher-level artefacts including eyes, limbs, and overarching body plans. We propose Multi-Level Evolution (MLE), a bottom-up automatic process that designs robots across multiple levels and niches them to tasks and environmental conditions. MLE concurrently explores constituent molecular and material 'building blocks', as well as their possible assemblies into specialised morphological and sensorimotor configurations. MLE provides a route to fully harness a recent explosion in available candidate materials and ongoing advances in rapid manufacturing processes. We outline a feasible MLE architecture that realises this vision, highlight the main roadblocks and how they may be overcome, and show robotic applications to which MLE is particularly suited. By forming a research agenda to stimulate discussion between researchers in related fields, we hope to inspire the pursuit of multi-level robotic design all the way from material to machine.

Submitted 17 January, 2019; originally announced January 2019.

Journal ref: Nature Machine Intelligence. Vol. 1, Number 1, pages 12--19. 2019

arXiv:1808.08818 [pdf, other]

doi 10.1007/s11047-020-09820-4

A new Taxonomy of Continuous Global Optimization Algorithms

Authors: Jörg Stork, A. E. Eiben, Thomas Bartz-Beielstein

Abstract: Surrogate-based optimization, nature-inspired metaheuristics, and hybrid combinations have become state of the art in algorithm design for solving real-world optimization problems. Still, it is difficult for practitioners to get an overview that explains their advantages in comparison to a large number of available methods in the scope of optimization. Available taxonomies lack the embedding of current approaches in the larger context of this broad field. This article presents a taxonomy of the field, which explores and matches algorithm strategies by extracting similarities and differences in their search strategies. A particular focus lies on algorithms using surrogates, nature-inspired designs, and those created by design optimization. The extracted features of components or operators allow us to create a set of classification indicators to distinguish between a small number of classes. The features allow a deeper understanding of components of the search strategies and further indicate the close connections between the different algorithm designs. We present intuitive analogies to explain the basic principles of the search algorithms, particularly useful for novices in this research field. Furthermore, this taxonomy allows recommendations for the applicability of the corresponding algorithms.

Submitted 6 May, 2020; v1 submitted 27 August, 2018; originally announced August 2018.

Comments: 35 pages total, 28 written pages, 4 figures, 2019 Reworked Version

Journal ref: Natural Computing, 2020, 1-24

arXiv:1804.03592 [pdf, other]

A clustering-based reinforcement learning approach for tailored personalization of e-Health interventions

Authors: Ali el Hassouni, Mark Hoogendoorn, Martijn van Otterlo, A. E. Eiben, Vesa Muhonen, Eduardo Barbaro

Abstract: Personalization is very powerful in improving the effectiveness of health interventions. Reinforcement learning (RL) algorithms are suitable for learning these tailored interventions from sequential data collected about individuals. However, learning can be very fragile. The time to learn intervention policies is limited as disengagement from the user can occur quickly. Also, in e-Health intervention timing can be crucial before the optimal window passes. We present an approach that learns tailored personalization policies for groups of users by combining RL and clustering. The benefits are two-fold: speeding up the learning to prevent disengagement while maintaining a high level of personalization. Our clustering approach utilizes dynamic time warping to compare user trajectories consisting of states and rewards. We apply online and batch RL to learn policies over clusters of individuals and introduce our self-developed and publicly available simulator for e-Health interventions to evaluate our approach. We compare our methods with an e-Health intervention benchmark. We demonstrate that batch learning outperforms online learning for our setting. Furthermore, our proposed clustering approach for RL finds near-optimal clusterings which lead to significantly better policies in terms of cumulative reward compared to learning a policy per individual or learning one non-personalized policy across all individuals. Our findings also indicate that the learned policies accurately learn to send interventions at the right moments and that the users work out more and at the right times of the day.

Submitted 21 May, 2020; v1 submitted 10 April, 2018; originally announced April 2018.
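
The clustering step described in the abstract above compares user trajectories with dynamic time warping (DTW). As a rough illustration of that distance, here is a minimal sketch of textbook DTW between two sequences. It assumes scalar-valued sequences and an absolute-difference local cost, whereas the paper's trajectories consist of states and rewards; the function name is hypothetical and this is not the authors' implementation.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning the first i
    items of `a` with the first j items of `b`. Scalar items and the
    absolute-difference local cost are simplifying assumptions.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]
```

Because the alignment may stretch one sequence against the other, two trajectories with the same shape but different pacing can have distance zero, which is why DTW is a natural fit for comparing users who progress at different speeds.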

arXiv:1704.00702 [pdf, other]

doi 10.1007/978-3-319-55453-2_10

Multi-rendezvous Spacecraft Trajectory Optimization with Beam P-ACO

Authors: Luís F. Simões, Dario Izzo, Evert Haasdijk, A. E. Eiben

Abstract: The design of spacecraft trajectories for missions visiting multiple celestial bodies is here framed as a multi-objective bilevel optimization problem. A comparative study is performed to assess the performance of different Beam Search algorithms at tackling the combinatorial problem of finding the ideal sequence of bodies. Special focus is placed on the development of a new hybridization between Beam Search and the Population-based Ant Colony Optimization algorithm. An experimental evaluation shows all algorithms achieving exceptional performance on a hard benchmark problem. It is found that a properly tuned deterministic Beam Search always outperforms the remaining variants. Beam P-ACO, however, demonstrates lower parameter sensitivity, while offering superior worst-case performance. Being an anytime algorithm, it is then found to be the preferable choice for certain practical applications.

Submitted 3 April, 2017; originally announced April 2017.

Comments: Code available at https://github.com/lfsimoes/beam_paco__gtoc5

ACM Class: I.2.8

Journal ref: EvoCOP 2017, LNCS 10197, pp. 141-156, 2017

arXiv:1106.0190 [pdf, other]

Evolution of Things

Authors: A. E. Eiben, N. Ferreira, M. Schut, S. Kernbach

Abstract: Evolution is one of the major omnipresent powers in the universe that has been studied for about two centuries. Recent scientific and technical developments make it possible to make the transition from passively understanding to actively mastering evolution. As of today, the only area where human experimenters can design and manipulate evolutionary processes in full is that of Evolutionary Computing, where evolutionary processes are carried out in a digital space, inside computers, in simulation. We argue that in the near future it will be possible to move evolutionary computing outside such imaginary spaces and make it physically embodied. In other words, we envision the "Evolution of Things", rather than just the evolution of code, leading to a new field of Embodied Artificial Evolution (EAE). The main objective of the present paper is to offer an umbrella term and vision in order to aid the development of this high potential research area. To this end, we introduce the notion of EAE, discuss a few examples and applications, and elaborate on the expected benefits as well as the grand challenges this developing field will have to address.

Submitted 1 June, 2011; originally announced June 2011.

Comments: Paper 5 for the First International Workshop of Distributed Evolutionary computation in Informal Environments

arXiv:cs/0511004 [pdf, ps, other]

Evolutionary Computing

Authors: Aguston E. Eiben, Marc Schoenauer

Abstract: Evolutionary computing (EC) is an exciting development in Computer Science. It amounts to building, applying and studying algorithms based on the Darwinian principles of natural selection. In this paper we briefly introduce the main concepts behind evolutionary computing. We present the main components of all evolutionary algorithms (EAs), sketch the differences between different types of EAs and survey application areas ranging from optimization, modeling and simulation to entertainment.

Submitted 1 November, 2005; originally announced November 2005.

Showing 1–32 of 32 results for author: Eiben, A E