-
Discovering and Exploiting Sparse Rewards in a Learned Behavior Space
Authors:
Giuseppe Paolo,
Miranda Coninx,
Alban Laflaquière,
Stephane Doncieux
Abstract:
Learning optimal policies in sparse rewards settings is difficult as the learning agent has little to no feedback on the quality of its actions. In these situations, a good strategy is to focus on exploration, hopefully leading to the discovery of a reward signal to improve on. A learning algorithm capable of dealing with this kind of settings has to be able to (1) explore possible agent behaviors and (2) exploit any possible discovered reward. Efficient exploration algorithms have been proposed that require to define a behavior space, that associates to an agent its resulting behavior in a space that is known to be worth exploring. The need to define this space is a limitation of these algorithms. In this work, we introduce STAX, an algorithm designed to learn a behavior space on-the-fly and to explore it while efficiently optimizing any reward discovered. It does so by separating the exploration and learning of the behavior space from the exploitation of the reward through an alternating two-steps process. In the first step, STAX builds a repertoire of diverse policies while learning a low-dimensional representation of the high-dimensional observations generated during the policies evaluation. In the exploitation step, emitters are used to optimize the performance of the discovered rewarding solutions. Experiments conducted on three different sparse reward environments show that STAX performs comparably to existing baselines while requiring much less prior information about the task as it autonomously builds the behavior space.
Submitted 26 September, 2023; v1 submitted 2 November, 2021;
originally announced November 2021.
-
Sparse Reward Exploration via Novelty Search and Emitters
Authors:
Giuseppe Paolo,
Alexandre Coninx,
Stephane Doncieux,
Alban Laflaquière
Abstract:
Reward-based optimization algorithms require both exploration, to find rewards, and exploitation, to maximize performance. The need for efficient exploration is even more significant in sparse reward settings, in which performance feedback is given sparingly, thus rendering it unsuitable for guiding the search process. In this work, we introduce the SparsE Reward Exploration via Novelty and Emitters (SERENE) algorithm, capable of efficiently exploring a search space, as well as optimizing rewards found in potentially disparate areas. Contrary to existing emitters-based approaches, SERENE separates the search space exploration and reward exploitation into two alternating processes. The first process performs exploration through Novelty Search, a divergent search algorithm. The second one exploits discovered reward areas through emitters, i.e. local instances of population-based optimization algorithms. A meta-scheduler allocates a global computational budget by alternating between the two processes, ensuring the discovery and efficient exploitation of disjoint reward areas. SERENE returns both a collection of diverse solutions covering the search space and a collection of high-performing solutions for each distinct reward area. We evaluate SERENE on various sparse reward environments and show it compares favorably to existing baselines.
Submitted 16 April, 2021; v1 submitted 5 February, 2021;
originally announced February 2021.
-
Emergence of Spatial Coordinates via Exploration
Authors:
Alban Laflaquière
Abstract:
Spatial knowledge is a fundamental building block for the development of advanced perceptive and cognitive abilities. Traditionally, in robotics, the Euclidean (x,y,z) coordinate system and the agent's forward model are defined a priori. We show that a naive agent can autonomously build an internal coordinate system, with the same dimension and metric regularity as the external space, simply by learning to predict the outcome of sensorimotor transitions in a self-supervised way.
Submitted 29 October, 2020;
originally announced October 2020.
-
Novelty Search makes Evolvability Inevitable
Authors:
Stephane Doncieux,
Giuseppe Paolo,
Alban Laflaquière,
Alexandre Coninx
Abstract:
Evolvability is an important feature that impacts the ability of evolutionary processes to find interesting novel solutions and to deal with changing conditions of the problem to solve. The estimation of evolvability is not straightforward and is generally too expensive to be directly used as selective pressure in the evolutionary process. Indirectly promoting evolvability as a side effect of other easier and faster to compute selection pressures would thus be advantageous. In an unbounded behavior space, it has already been shown that evolvable individuals naturally appear and tend to be selected as they are more likely to invade empty behavior niches. Evolvability is thus a natural byproduct of the search in this context. However, practical agents and environments often impose limits on the reachable behavior space. How do these boundaries impact evolvability? In this context, can evolvability still be promoted without explicitly rewarding it? We show that Novelty Search implicitly creates a pressure for high evolvability even in bounded behavior spaces, and explore the reasons for such a behavior. More precisely we show that, throughout the search, the dynamic evaluation of novelty rewards individuals which are very mobile in the behavior space, which in turn promotes evolvability.
Submitted 13 May, 2020;
originally announced May 2020.
-
Unsupervised Learning and Exploration of Reachable Outcome Space
Authors:
Giuseppe Paolo,
Alban Laflaquière,
Alexandre Coninx,
Stephane Doncieux
Abstract:
Performing Reinforcement Learning in sparse rewards settings, with very little prior knowledge, is a challenging problem since there is no signal to properly guide the learning process. In such situations, a good search strategy is fundamental. At the same time, not having to adapt the algorithm to every single problem is very desirable. Here we introduce TAXONS, a Task Agnostic eXploration of Outcome spaces through Novelty and Surprise algorithm. Based on a population-based divergent-search approach, it learns a set of diverse policies directly from high-dimensional observations, without any task-specific information. TAXONS builds a repertoire of policies while training an autoencoder on the high-dimensional observation of the final state of the system to build a low-dimensional outcome space. The learned outcome space, combined with the reconstruction error, is used to drive the search for new policies. Results show that TAXONS can find a diverse set of controllers, covering a good part of the ground-truth outcome space, while having no information about such space.
Submitted 4 May, 2020; v1 submitted 12 September, 2019;
originally announced September 2019.
-
Unsupervised Emergence of Egocentric Spatial Structure from Sensorimotor Prediction
Authors:
Alban Laflaquière,
Michael Garcia Ortiz
Abstract:
Despite its omnipresence in robotics application, the nature of spatial knowledge and the mechanisms that underlie its emergence in autonomous agents are still poorly understood. Recent theoretical works suggest that the Euclidean structure of space induces invariants in an agent's raw sensorimotor experience. We hypothesize that capturing these invariants is beneficial for sensorimotor prediction and that, under certain exploratory conditions, a motor representation capturing the structure of the external space should emerge as a byproduct of learning to predict future sensory experiences. We propose a simple sensorimotor predictive scheme, apply it to different agents and types of exploration, and evaluate the pertinence of these hypotheses. We show that a naive agent can capture the topology and metric regularity of its sensor's position in an egocentric spatial frame without any a priori knowledge, nor extraneous supervision.
Submitted 17 September, 2019; v1 submitted 4 June, 2019;
originally announced June 2019.
-
Self-supervised Body Image Acquisition Using a Deep Neural Network for Sensorimotor Prediction
Authors:
Alban Laflaquière,
Verena V. Hafner
Abstract:
This work investigates how a naive agent can acquire its own body image in a self-supervised way, based on the predictability of its sensorimotor experience. Our working hypothesis is that, due to its temporal stability, an agent's body produces more consistent sensory experiences than the environment, which exhibits a greater variability. Given its motor experience, an agent can thus reliably predict what appearance its body should have. This intrinsic predictability can be used to automatically isolate the body image from the rest of the environment. We propose a two-branches deconvolutional neural network to predict the visual sensory state associated with an input motor state, as well as the prediction error associated with this input. We train the network on a dataset of first-person images collected with a simulated Pepper robot, and show how the network outputs can be used to automatically isolate its visible arm from the rest of the environment. Finally, the quality of the body image produced by the network is evaluated.
Submitted 3 June, 2019;
originally announced June 2019.
-
Identification of Invariant Sensorimotor Structures as a Prerequisite for the Discovery of Objects
Authors:
Nicolas Le Hir,
Olivier Sigaud,
Alban Laflaquière
Abstract:
Perceiving the surrounding environment in terms of objects is useful for any general purpose intelligent agent. In this paper, we investigate a fundamental mechanism making object perception possible, namely the identification of spatio-temporally invariant structures in the sensorimotor experience of an agent. We take inspiration from the Sensorimotor Contingencies Theory to define a computational model of this mechanism through a sensorimotor, unsupervised and predictive approach. Our model is based on processing the unsupervised interaction of an artificial agent with its environment. We show how spatio-temporally invariant structures in the environment induce regularities in the sensorimotor experience of an agent, and how this agent, while building a predictive model of its sensorimotor experience, can capture them as densely connected subgraphs in a graph of sensory states connected by motor commands. Our approach is focused on elementary mechanisms, and is illustrated with a set of simple experiments in which an agent interacts with an environment. We show how the agent can build an internal model of moving but spatio-temporally invariant structures by performing a Spectral Clustering of the graph modeling its overall sensorimotor experiences. We systematically examine properties of the model, shedding light more globally on the specificities of the paradigm with respect to methods based on the supervised processing of collections of static images.
Submitted 11 October, 2018;
originally announced October 2018.
-
Learning agent's spatial configuration from sensorimotor invariants
Authors:
Alban Laflaquière,
J. Kevin O'Regan,
Sylvain Argentieri,
Bruno Gas,
Alexander V. Terekhov
Abstract:
The design of robotic systems is largely dictated by our purely human intuition about how we perceive the world. This intuition has been proven incorrect with regard to a number of critical issues, such as visual change blindness. In order to develop truly autonomous robots, we must step away from this intuition and let robotic agents develop their own way of perceiving. The robot should start from scratch and gradually develop perceptual notions, under no prior assumptions, exclusively by looking into its sensorimotor experience and identifying repetitive patterns and invariants. One of the most fundamental perceptual notions, space, cannot be an exception to this requirement. In this paper we look into the prerequisites for the emergence of simplified spatial notions on the basis of a robot's sensorimotor flow. We show that the notion of space as environment-independent cannot be deduced solely from exteroceptive information, which is highly variable and is mainly determined by the contents of the environment. The environment-independent definition of space can be approached by looking into the functions that link the motor commands to changes in exteroceptive inputs. In a sufficiently rich environment, the kernels of these functions correspond uniquely to the spatial configuration of the agent's exteroceptors. We simulate a redundant robotic arm with a retina installed at its end-point and show how this agent can learn the configuration space of its retina. The resulting manifold has the topology of the Cartesian product of a plane and a circle, and corresponds to the planar position and orientation of the retina.
Submitted 3 October, 2018;
originally announced October 2018.
-
Grounding the Experience of a Visual Field through Sensorimotor Contingencies
Authors:
Alban Laflaquière
Abstract:
Artificial perception is traditionally handled by hand-designing task specific algorithms. However, a truly autonomous robot should develop perceptive abilities on its own, by interacting with its environment, and adapting to new situations. The sensorimotor contingencies theory proposes to ground the development of those perceptive abilities in the way the agent can actively transform its sensory inputs. We propose a sensorimotor approach, inspired by this theory, in which the agent explores the world and discovers its properties by capturing the sensorimotor regularities they induce. This work presents an application of this approach to the discovery of a so-called visual field as the set of regularities that a visual sensor imposes on a naive agent's experience. A formalism is proposed to describe how those regularities can be captured in a sensorimotor predictive model. Finally, the approach is evaluated on a simulated system coarsely inspired from the human retina.
Submitted 3 October, 2018;
originally announced October 2018.
-
Grounding Perception: A Developmental Approach to Sensorimotor Contingencies
Authors:
Alban Laflaquière,
Nikolas Hemion,
Michaël Garcia Ortiz,
Jean-Christophe Baillie
Abstract:
Sensorimotor contingency theory offers a promising account of the nature of perception, a topic rarely addressed in the robotics community. We propose a developmental framework to address the problem of the autonomous acquisition of sensorimotor contingencies by a naive robot. While exploring the world, the robot internally encodes contingencies as predictive models that capture the structure they imply in its sensorimotor experience. Three preliminary applications are presented to illustrate our approach to the acquisition of perceptive abilities: discovering the environment, discovering objects, and discovering a visual field.
Submitted 3 October, 2018;
originally announced October 2018.
-
A Non-linear Approach to Space Dimension Perception by a Naive Agent
Authors:
Alban Laflaquière,
Sylvain Argentieri,
Olivia Breysse,
Stéphane Genet,
Bruno Gas
Abstract:
Developmental Robotics offers a new approach to numerous AI features that are often taken as granted. Traditionally, perception is supposed to be an inherent capacity of the agent. Moreover, it largely relies on models built by the system's designer. A new approach is to consider perception as an experimentally acquired ability that is learned exclusively through the analysis of the agent's sensorimotor flow. Previous works, based on H.Poincaré's intuitions and the sensorimotor contingencies theory, allow a simulated agent to extract the dimension of geometrical space in which it is immersed without any a priori knowledge. Those results are limited to infinitesimal movement's amplitude of the system. In this paper, a non-linear dimension estimation method is proposed to push back this limitation.
Submitted 3 October, 2018;
originally announced October 2018.
-
Learning an internal representation of the end-effector configuration space
Authors:
Alban Laflaquière,
Alexander V. Terekhov,
Bruno Gas,
J. Kevin O'Regan
Abstract:
Current machine learning techniques proposed to automatically discover a robot kinematics usually rely on a priori information about the robot's structure, sensors properties or end-effector position. This paper proposes a method to estimate a certain aspect of the forward kinematics model with no such information. An internal representation of the end-effector configuration is generated from unstructured proprioceptive and exteroceptive data flow under very limited assumptions. A mapping from the proprioceptive space to this representational space can then be used to control the robot.
Submitted 3 October, 2018;
originally announced October 2018.
-
Unsupervised Emergence of Spatial Structure from Sensorimotor Prediction
Authors:
Alban Laflaquière,
Michael Garcia Ortiz
Abstract:
Despite its omnipresence in robotics application, the nature of spatial knowledge and the mechanisms that underlie its emergence in autonomous agents are still poorly understood. Recent theoretical work suggests that the concept of space can be grounded by capturing invariants induced by the structure of space in an agent's raw sensorimotor experience. Moreover, it is hypothesized that capturing these invariants is beneficial for a naive agent trying to predict its sensorimotor experience. Under certain exploratory conditions, spatial representations should thus emerge as a byproduct of learning to predict. We propose a simple sensorimotor predictive scheme, apply it to different agents and types of exploration, and evaluate the pertinence of this hypothesis. We show that a naive agent can capture the topology and metric regularity of its spatial configuration without any a priori knowledge, nor extraneous supervision.
Submitted 27 November, 2018; v1 submitted 2 October, 2018;
originally announced October 2018.
-
Discovering space - Grounding spatial topology and metric regularity in a naive agent's sensorimotor experience
Authors:
Alban Laflaquière,
J. Kevin O'Regan,
Bruno Gas,
Alexander Terekhov
Abstract:
In line with the sensorimotor contingency theory, we investigate the problem of the perception of space from a fundamental sensorimotor perspective. Despite its pervasive nature in our perception of the world, the origin of the concept of space remains largely mysterious. For example in the context of artificial perception, this issue is usually circumvented by having engineers pre-define the spatial structure of the problem the agent has to face. We here show that the structure of space can be autonomously discovered by a naive agent in the form of sensorimotor regularities, that correspond to so called compensable sensory experiences: these are experiences that can be generated either by the agent or its environment. By detecting such compensable experiences the agent can infer the topological and metric structure of the external space in which its body is moving. We propose a theoretical description of the nature of these regularities and illustrate the approach on a simulated robotic arm equipped with an eye-like sensor, and which interacts with an object. Finally we show how these regularities can be used to build an internal representation of the sensor's external spatial configuration.
Submitted 3 October, 2018; v1 submitted 7 June, 2018;
originally announced June 2018.
-
Learning Representations of Spatial Displacement through Sensorimotor Prediction
Authors:
Michael Garcia Ortiz,
Alban Laflaquière
Abstract:
Robots act in their environment through sequences of continuous motor commands. Because of the dimensionality of the motor space, as well as the infinite possible combinations of successive motor commands, agents need compact representations that capture the structure of the resulting displacements. In the case of an autonomous agent with no a priori knowledge about its sensorimotor apparatus, this compression has to be learned. We propose to use Recurrent Neural Networks to encode motor sequences into a compact representation, which is used to predict the consequence of motor sequences in term of sensory changes. We show that sensory prediction can successfully guide the compression of motor sequences into representations that are organized topologically in term of spatial displacement.
Submitted 16 May, 2018;
originally announced May 2018.
-
A Sensorimotor Perspective on Grounding the Semantic of Simple Visual Features
Authors:
Alban Laflaquière
Abstract:
In Machine Learning and Robotics, the semantic content of visual features is usually provided to the system by a human who interprets its content. On the contrary, strictly unsupervised approaches have difficulties relating the statistics of sensory inputs to their semantic content without also relying on prior knowledge introduced in the system. We proposed in this paper to tackle this problem from a sensorimotor perspective. In line with the Sensorimotor Contingencies Theory, we make the fundamental assumption that the semantic content of sensory inputs at least partially stems from the way an agent can actively transform it. We illustrate our approach by formalizing how simple visual features can induce invariants in a naive agent's sensorimotor experience, and evaluate it on a simple simulated visual system. Without any a priori knowledge about the way its sensorimotor information is encoded, we show how an agent can characterize the uniformity and edge-ness of the visual features it interacts with.
Submitted 11 May, 2018;
originally announced May 2018.
-
Grounding object perception in a naive agent's sensorimotor experience
Authors:
Alban Laflaquière,
Nikolas Hemion
Abstract:
Artificial object perception usually relies on a priori defined models and feature extraction algorithms. We study how the concept of object can be grounded in the sensorimotor experience of a naive agent. Without any knowledge about itself or the world it is immersed in, the agent explores its sensorimotor space and identifies objects as consistent networks of sensorimotor transitions, independent from their context. A fundamental drive for prediction is assumed to explain the emergence of such networks from a developmental standpoint. An algorithm is proposed and tested to illustrate the approach.
Submitted 26 September, 2016;
originally announced September 2016.
-
Autonomous Grounding of Visual Field Experience through Sensorimotor Prediction
Authors:
Alban Laflaquière
Abstract:
In a developmental framework, autonomous robots need to explore the world and learn how to interact with it. Without an a priori model of the system, this opens the challenging problem of having robots master their interface with the world: how to perceive their environment using their sensors, and how to act in it using their motors. The sensorimotor approach of perception claims that a naive agent can learn to master this interface by capturing regularities in the way its actions transform its sensory inputs. In this paper, we apply such an approach to the discovery and mastery of the visual field associated with a visual sensor. A computational model is formalized and applied to a simulated system to illustrate the approach.
Submitted 3 August, 2016;
originally announced August 2016.