Search | arXiv e-print repository

Task-conditioned adaptation of visual features in multi-task policy learning

Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

Abstract: Successfully addressing a wide variety of tasks is a core ability of autonomous agents, requiring flexibly adapting the underlying decision-making strategies and, as we argue in this work, also adapting the perception modules. An analogical argument would be the human visual system, which uses top-down signals to focus attention determined by the current task. Similarly, we adapt pre-trained large… ▽ More Successfully addressing a wide variety of tasks is a core ability of autonomous agents, requiring flexibly adapting the underlying decision-making strategies and, as we argue in this work, also adapting the perception modules. An analogical argument would be the human visual system, which uses top-down signals to focus attention determined by the current task. Similarly, we adapt pre-trained large vision models conditioned on specific downstream tasks in the context of multi-task policy learning. We introduce task-conditioned adapters that do not require finetuning any pre-trained weights, combined with a single policy trained with behavior cloning and capable of addressing multiple tasks. We condition the visual adapters on task embeddings, which can be selected at inference if the task is known, or alternatively inferred from a set of example demonstrations. To this end, we propose a new optimization-based estimator. We evaluate the method on a wide variety of tasks from the CortexBench benchmark and show that, compared to existing work, it can be addressed with a single policy. In particular, we demonstrate that adapting visual features is a key design choice and that the method generalizes to unseen tasks given a few demonstrations. △ Less

Submitted 6 May, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

arXiv:2304.11241 [pdf, other]

AutoNeRF: Training Implicit Scene Representations with Autonomous Agents

Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Dhruv Batra, Christian Wolf, Devendra Singh Chaplot

Abstract: Implicit representations such as Neural Radiance Fields (NeRF) have been shown to be very effective at novel view synthesis. However, these models typically require manual and careful human data collection for training. In this paper, we present AutoNeRF, a method to collect data required to train NeRFs using autonomous embodied agents. Our method allows an agent to explore an unseen environment e… ▽ More Implicit representations such as Neural Radiance Fields (NeRF) have been shown to be very effective at novel view synthesis. However, these models typically require manual and careful human data collection for training. In this paper, we present AutoNeRF, a method to collect data required to train NeRFs using autonomous embodied agents. Our method allows an agent to explore an unseen environment efficiently and use the experience to build an implicit map representation autonomously. We compare the impact of different exploration strategies including handcrafted frontier-based exploration, end-to-end and modular approaches composed of trained high-level planners and classical low-level path followers. We train these models with different reward functions tailored to this problem and evaluate the quality of the learned representations on four different downstream tasks: classical viewpoint rendering, map reconstruction, planning, and pose refinement. Empirical results show that NeRFs can be trained on actively collected data using just a single episode of experience in an unseen environment, and can be used for several downstream robotic tasks, and that modular trained exploration models outperform other classical and end-to-end baselines. Finally, we show that AutoNeRF can reconstruct large-scale scenes, and is thus a useful tool to perform scene-specific adaptation as the produced 3D environment models can be loaded into a simulator to fine-tune a policy of interest. △ Less

Submitted 22 December, 2023; v1 submitted 21 April, 2023; originally announced April 2023.

arXiv:2210.05129 [pdf, other]

Multi-Object Navigation with dynamically learned neural implicit representations

Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

Abstract: Understanding and map** a new environment are core abilities of any autonomously navigating agent. While classical robotics usually estimates maps in a stand-alone manner with SLAM variants, which maintain a topological or metric representation, end-to-end learning of navigation keeps some form of memory in a neural network. Networks are typically imbued with inductive biases, which can range fr… ▽ More Understanding and map** a new environment are core abilities of any autonomously navigating agent. While classical robotics usually estimates maps in a stand-alone manner with SLAM variants, which maintain a topological or metric representation, end-to-end learning of navigation keeps some form of memory in a neural network. Networks are typically imbued with inductive biases, which can range from vectorial representations to birds-eye metric tensors or topological structures. In this work, we propose to structure neural networks with two neural implicit representations, which are learned dynamically during each episode and map the content of the scene: (i) the Semantic Finder predicts the position of a previously seen queried object; (ii) the Occupancy and Exploration Implicit Representation encapsulates information about explored area and obstacles, and is queried with a novel global read mechanism which directly maps from function space to a usable embedding space. Both representations are leveraged by an agent trained with Reinforcement Learning (RL) and learned online during each episode. We evaluate the agent on Multi-Object Navigation and show the high impact of using neural implicit representations as a memory source. △ Less

Submitted 27 September, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

arXiv:2112.11731 [pdf, other]

Graph augmented Deep Reinforcement Learning in the GameRLand3D environment

Authors: Edward Beeching, Maxim Peter, Philippe Marcotte, Jilles Debangoye, Olivier Simonin, Joshua Romoff, Christian Wolf

Abstract: We address planning and navigation in challenging 3D video games featuring maps with disconnected regions reachable by agents using special actions. In this setting, classical symbolic planners are not applicable or difficult to adapt. We introduce a hybrid technique combining a low level policy trained with reinforcement learning and a graph based high level classical planner. In addition to prov… ▽ More We address planning and navigation in challenging 3D video games featuring maps with disconnected regions reachable by agents using special actions. In this setting, classical symbolic planners are not applicable or difficult to adapt. We introduce a hybrid technique combining a low level policy trained with reinforcement learning and a graph based high level classical planner. In addition to providing human-interpretable paths, the approach improves the generalization performance of an end-to-end approach in unseen maps, where it achieves a 20% absolute increase in success rate over a recurrent end-to-end agent on a point to point navigation task in yet unseen large-scale maps of size 1km x 1km. In an in-depth experimental study, we quantify the limitations of end-to-end Deep RL approaches in vast environments and we also introduce "GameRLand3D", a new benchmark and soon to be released environment can generate complex procedural 3D maps for navigation tasks. △ Less

Submitted 22 December, 2021; originally announced December 2021.

arXiv:2112.03636 [pdf, other]

Godot Reinforcement Learning Agents

Authors: Edward Beeching, Jilles Debangoye, Olivier Simonin, Christian Wolf

Abstract: We present Godot Reinforcement Learning (RL) Agents, an open-source interface for develo** environments and agents in the Godot Game Engine. The Godot RL Agents interface allows the design, creation and learning of agent behaviors in challenging 2D and 3D environments with various on-policy and off-policy Deep RL algorithms. We provide a standard Gym interface, with wrappers for learning in the… ▽ More We present Godot Reinforcement Learning (RL) Agents, an open-source interface for develo** environments and agents in the Godot Game Engine. The Godot RL Agents interface allows the design, creation and learning of agent behaviors in challenging 2D and 3D environments with various on-policy and off-policy Deep RL algorithms. We provide a standard Gym interface, with wrappers for learning in the Ray RLlib and Stable Baselines RL frameworks. This allows users access to over 20 state of the art on-policy, off-policy and multi-agent RL algorithms. The framework is a versatile tool that allows researchers and game designers the ability to create environments with discrete, continuous and mixed action spaces. The interface is relatively performant, with 12k interactions per second on a high end laptop computer, when parallized on 4 CPU cores. An overview video is available here: https://youtu.be/g1MlZSFqIj4 △ Less

Submitted 7 December, 2021; originally announced December 2021.

arXiv:2107.06011 [pdf, other]

Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation

Authors: Pierre Marza, Laetitia Matignon, Olivier Simonin, Christian Wolf

Abstract: In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals. This ability can be associated with spatial reasoning, where an agent is able to perceive spatial relationships and regularities, and discover object characteristics. Recent work introduces learnable pol… ▽ More In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals. This ability can be associated with spatial reasoning, where an agent is able to perceive spatial relationships and regularities, and discover object characteristics. Recent work introduces learnable policies parametrized by deep neural networks and trained with Reinforcement Learning (RL). In classical RL setups, the capacity to map and reason spatially is learned end-to-end, from reward alone. In this setting, we introduce supplementary supervision in the form of auxiliary tasks designed to favor the emergence of spatial perception capabilities in agents trained for a goal-reaching downstream objective. We show that learning to estimate metrics quantifying the spatial relationships between an agent at a given location and a goal to reach has a high positive impact in Multi-Object Navigation settings. Our method significantly improves the performance of different baseline agents, that either build an explicit or implicit representation of the environment, even matching the performance of incomparable oracle agents taking ground-truth maps as input. A learning-based agent from the literature trained with the proposed auxiliary losses was the winning entry to the Multi-Object Navigation Challenge, part of the CVPR 2021 Embodied AI Workshop. △ Less

Submitted 25 April, 2023; v1 submitted 13 July, 2021; originally announced July 2021.

arXiv:2007.05270 [pdf, other]

Learning to plan with uncertain topological maps

Authors: Edward Beeching, Jilles Dibangoye, Olivier Simonin, Christian Wolf

Abstract: We train an agent to navigate in 3D environments using a hierarchical strategy including a high-level graph based planner and a local policy. Our main contribution is a data driven learning based approach for planning under uncertainty in topological maps, requiring an estimate of shortest paths in valued graphs with a probabilistic structure. Whereas classical symbolic algorithms achieve optimal… ▽ More We train an agent to navigate in 3D environments using a hierarchical strategy including a high-level graph based planner and a local policy. Our main contribution is a data driven learning based approach for planning under uncertainty in topological maps, requiring an estimate of shortest paths in valued graphs with a probabilistic structure. Whereas classical symbolic algorithms achieve optimal results on noise-less topologies, or optimal results in a probabilistic sense on graphs with probabilistic structure, we aim to show that machine learning can overcome missing information in the graph by taking into account rich high-dimensional node features, for instance visual information available at each location of the map. Compared to purely learned neural white box algorithms, we structure our neural model with an inductive bias for dynamic programming based shortest path algorithms, and we show that a particular parameterization of our neural model corresponds to the Bellman-Ford algorithm. By performing an empirical analysis of our method in simulated photo-realistic 3D environments, we demonstrate that the inclusion of visual features in the learned neural planner outperforms classical symbolic solutions for graph based planning. △ Less

Submitted 10 July, 2020; originally announced July 2020.

Comments: ECCV 2020

arXiv:2002.02286 [pdf, other]

EgoMap: Projective map** and structured egocentric memory for Deep RL

Authors: Edward Beeching, Christian Wolf, Jilles Dibangoye, Olivier Simonin

Abstract: Tasks involving localization, memorization and planning in partially observable 3D environments are an ongoing challenge in Deep Reinforcement Learning. We present EgoMap, a spatially structured neural memory architecture. EgoMap augments a deep reinforcement learning agent's performance in 3D environments on challenging tasks with multi-step objectives. The EgoMap architecture incorporates severa… ▽ More Tasks involving localization, memorization and planning in partially observable 3D environments are an ongoing challenge in Deep Reinforcement Learning. We present EgoMap, a spatially structured neural memory architecture. EgoMap augments a deep reinforcement learning agent's performance in 3D environments on challenging tasks with multi-step objectives. The EgoMap architecture incorporates several inductive biases including a differentiable inverse projection of CNN feature vectors onto a top-down spatially structured map. The map is updated with ego-motion measurements through a differentiable affine transform. We show this architecture outperforms both standard recurrent agents and state of the art agents with structured memory. We demonstrate that incorporating these inductive biases into an agent's architecture allows for stable training with reward alone, circumventing the expense of acquiring and labelling expert trajectories. A detailed ablation study demonstrates the impact of key aspects of the architecture and through extensive qualitative analysis, we show how the agent exploits its structured internal memory to achieve higher performance. △ Less

Submitted 7 February, 2020; v1 submitted 24 January, 2020; originally announced February 2020.

arXiv:1909.10809 [pdf, other]

Towards S-NAMO: Socially-aware Navigation Among Movable Obstacles

Authors: Benoit Renault, Jacques Saraydaryan, Olivier Simonin

Abstract: In this paper, we present an in-depth analysis of Navigation Among Movable Obstacles (NAMO) literature, notably highlighting that social acceptability remains an unadressed problem in this robotics navigation domain. The objectives of a Socially-Aware NAMO are defined and a first set of algorithmic propositions is built upon existing work. We developed a simulator allowing to test our propositions… ▽ More In this paper, we present an in-depth analysis of Navigation Among Movable Obstacles (NAMO) literature, notably highlighting that social acceptability remains an unadressed problem in this robotics navigation domain. The objectives of a Socially-Aware NAMO are defined and a first set of algorithmic propositions is built upon existing work. We developed a simulator allowing to test our propositions of social movability evaluation for obstacle selection, and social placement of objects with a semantic map layer. Preliminary pushing tests are done with a Pepper robot, the standard platform for the Robocup@home Social Standard Platform League, in the context of our participation (LyonTech Team). △ Less

Submitted 24 September, 2019; originally announced September 2019.

Journal ref: Robocup Symposium 2019, Jul 2019, Sydney, Australia

arXiv:1904.01806 [pdf, other]

Deep Reinforcement Learning on a Budget: 3D Control and Reasoning Without a Supercomputer

Authors: Edward Beeching, Christian Wolf, Jilles Dibangoye, Olivier Simonin

Abstract: An important goal of research in Deep Reinforcement Learning in mobile robotics is to train agents capable of solving complex tasks, which require a high level of scene understanding and reasoning from an egocentric perspective. When trained from simulations, optimal environments should satisfy a currently unobtainable combination of high-fidelity photographic observations, massive amounts of diff… ▽ More An important goal of research in Deep Reinforcement Learning in mobile robotics is to train agents capable of solving complex tasks, which require a high level of scene understanding and reasoning from an egocentric perspective. When trained from simulations, optimal environments should satisfy a currently unobtainable combination of high-fidelity photographic observations, massive amounts of different environment configurations and fast simulation speeds. In this paper we argue that research on training agents capable of complex reasoning can be simplified by decoupling from the requirement of high fidelity photographic observations. We present a suite of tasks requiring complex reasoning and exploration in continuous, partially observable 3D environments. The objective is to provide challenging scenarios and a robust baseline agent architecture that can be trained on mid-range consumer hardware in under 24h. Our scenarios combine two key advantages: (i) they are based on a simple but highly efficient 3D environment (ViZDoom) which allows high speed simulation (12000fps); (ii) the scenarios provide the user with a range of difficulty settings, in order to identify the limitations of current state of the art algorithms and network architectures. We aim to increase accessibility to the field of Deep-RL by providing baselines for challenging scenarios where new ideas can be iterated on quickly. We argue that the community should be able to address challenging problems in reasoning of mobile agents without the need for a large compute infrastructure. △ Less

Submitted 3 April, 2019; originally announced April 2019.

arXiv:1901.10272 [pdf, other]

Multi-UAV Visual Coverage of Partially Known 3D Surfaces: Voronoi-based Initialization to Improve Local Optimizers

Authors: Alessandro Renzaglia, Jilles Dibangoye, Vincent Le Doze, Olivier Simonin

Abstract: In this paper we study the problem of steering a team of Unmanned Aerial Vehicles (UAVs) toward a static configuration which maximizes the visibility of a 3D environment. The UAVs are assumed to be equipped with visual sensors constrained by a maximum sensing range and the prior knowledge on the environment is considered to be very sparse. To solve this problem on-line, derivative-free measurement… ▽ More In this paper we study the problem of steering a team of Unmanned Aerial Vehicles (UAVs) toward a static configuration which maximizes the visibility of a 3D environment. The UAVs are assumed to be equipped with visual sensors constrained by a maximum sensing range and the prior knowledge on the environment is considered to be very sparse. To solve this problem on-line, derivative-free measurement-based optimization algorithms can be adopted, even though they are strongly limited by local optimality. To overcome this limitation, we propose to exploit the partial initial knowledge on the environment to find suitable initial configurations from which the agents start the local optimization. In particular, a constrained centroidal Voronoi tessellation on a coarse approximation of the surface to cover is proposed. The behavior of the agent is so based on a two-step optimization approach, where a stochastic optimization algorithm based on the on-line acquired information follows the geometrical-based initialization. The algorithm performance is evaluated in simulation and in particular the improvement on the solution brought by the Voronoi tessellation with respect to different initializations is analyzed. △ Less

Submitted 30 January, 2019; v1 submitted 29 January, 2019; originally announced January 2019.

Showing 1–11 of 11 results for author: Simonin, O