-
Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning
Authors:
Dhruva Tirumala,
Markus Wulfmeier,
Ben Moran,
Sandy Huang,
Jan Humplik,
Guy Lever,
Tuomas Haarnoja,
Leonard Hasenclever,
Arunkumar Byravan,
Nathan Batchelor,
Neil Sreendra,
Kushal Patel,
Marlon Gwira,
Francesco Nori,
Martin Riedmiller,
Nicolas Heess
Abstract:
We apply multi-agent deep reinforcement learning (RL) to train end-to-end robot soccer policies with fully onboard computation and sensing via egocentric RGB vision. This setting reflects many challenges of real-world robotics, including active perception, agile full-body control, and long-horizon planning in a dynamic, partially-observable, multi-agent domain. We rely on large-scale, simulation-based data generation to obtain complex behaviors from egocentric vision which can be successfully transferred to physical robots using low-cost sensors. To achieve adequate visual realism, our simulation combines rigid-body physics with learned, realistic rendering via multiple Neural Radiance Fields (NeRFs). We combine teacher-based multi-agent RL and cross-experiment data reuse to enable the discovery of sophisticated soccer strategies. We analyze active-perception behaviors including object tracking and ball seeking that emerge when simply optimizing perception-agnostic soccer play. The agents display equivalent levels of performance and agility as policies with access to privileged, ground-truth state. To our knowledge, this paper constitutes a first demonstration of end-to-end training for multi-agent robot soccer, mapping raw pixel observations to joint-level actions, that can be deployed in the real world. Videos of the game-play and analyses can be seen on our website https://sites.google.com/view/vision-soccer .
Submitted 3 May, 2024;
originally announced May 2024.
-
RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation
Authors:
Konstantinos Bousmalis,
Giulia Vezzani,
Dushyant Rao,
Coline Devin,
Alex X. Lee,
Maria Bauza,
Todor Davchev,
Yuxiang Zhou,
Agrim Gupta,
Akhil Raju,
Antoine Laurens,
Claudio Fantacci,
Valentin Dalibard,
Martina Zambelli,
Murilo Martins,
Rugile Pevceviciute,
Michiel Blokzijl,
Misha Denil,
Nathan Batchelor,
Thomas Lampe,
Emilio Parisotto,
Konrad Żołna,
Scott Reed,
Sergio Gómez Colmenarejo,
Jon Scholz
, et al. (14 additional authors not shown)
Abstract:
The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and language, we propose a multi-embodiment, multi-task generalist agent for robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned decision transformer capable of consuming action-labelled visual experience. This data spans a large repertoire of motor control skills from simulated and real robotic arms with varying sets of observations and actions. With RoboCat, we demonstrate the ability to generalise to new tasks and robots, both zero-shot as well as through adaptation using only 100-1000 examples for the target task. We also show how a trained model itself can be used to generate data for subsequent training iterations, thus providing a basic building block for an autonomous improvement loop. We investigate the agent's capabilities, with large-scale evaluations both in simulation and on three different real robot embodiments. We find that as we grow and diversify its training data, RoboCat not only shows signs of cross-task transfer, but also becomes more efficient at adapting to new tasks.
Submitted 22 December, 2023; v1 submitted 20 June, 2023;
originally announced June 2023.
-
Barkour: Benchmarking Animal-level Agility with Quadruped Robots
Authors:
Ken Caluwaerts,
Atil Iscen,
J. Chase Kew,
Wenhao Yu,
Tingnan Zhang,
Daniel Freeman,
Kuang-Huei Lee,
Lisa Lee,
Stefano Saliceti,
Vincent Zhuang,
Nathan Batchelor,
Steven Bohez,
Federico Casarini,
Jose Enrique Chen,
Omar Cortes,
Erwin Coumans,
Adil Dostmohamed,
Gabriel Dulac-Arnold,
Alejandro Escontrela,
Erik Frey,
Roland Hafner,
Deepali Jain,
Bauyrjan Jyenis,
Yuheng Kuang,
Edward Lee
, et al. (19 additional authors not shown)
Abstract:
Animals have evolved various agile locomotion strategies, such as sprinting, leaping, and jumping. There is a growing interest in developing legged robots that move like their biological counterparts and show various agile skills to navigate complex environments quickly. Despite the interest, the field lacks systematic benchmarks to measure the performance of control policies and hardware in agility. We introduce the Barkour benchmark, an obstacle course to quantify agility for legged robots. Inspired by dog agility competitions, it consists of diverse obstacles and a time based scoring mechanism. This encourages researchers to develop controllers that not only move fast, but do so in a controllable and versatile way. To set strong baselines, we present two methods for tackling the benchmark. In the first approach, we train specialist locomotion skills using on-policy reinforcement learning methods and combine them with a high-level navigation controller. In the second approach, we distill the specialist skills into a Transformer-based generalist locomotion policy, named Locomotion-Transformer, that can handle various terrains and adjust the robot's gait based on the perceived environment and robot states. Using a custom-built quadruped robot, we demonstrate that our method can complete the course at half the speed of a dog. We hope that our work represents a step towards creating controllers that enable robots to reach animal-level agility.
Submitted 23 May, 2023;
originally announced May 2023.
-
Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
Authors:
Tuomas Haarnoja,
Ben Moran,
Guy Lever,
Sandy H. Huang,
Dhruva Tirumala,
Jan Humplik,
Markus Wulfmeier,
Saran Tunyasuvunakool,
Noah Y. Siegel,
Roland Hafner,
Michael Bloesch,
Kristian Hartikainen,
Arunkumar Byravan,
Leonard Hasenclever,
Yuval Tassa,
Fereshteh Sadeghi,
Nathan Batchelor,
Federico Casarini,
Stefano Saliceti,
Charles Game,
Neil Sreendra,
Kushal Patel,
Marlon Gwira,
Andrea Huber,
Nicole Hurley
, et al. (3 additional authors not shown)
Abstract:
We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments. We used Deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one (1v1) soccer game. The resulting agent exhibits robust and dynamic movement skills such as rapid fall recovery, walking, turning, kicking and more; and it transitions between them in a smooth, stable, and efficient manner. The agent's locomotion and tactical behavior adapts to specific game contexts in a way that would be impractical to manually design. The agent also developed a basic strategic understanding of the game, and learned, for instance, to anticipate ball movements and to block opponent shots. Our agent was trained in simulation and transferred to real robots zero-shot. We found that a combination of sufficiently high-frequency control, targeted dynamics randomization, and perturbations during training in simulation enabled good-quality transfer. Although the robots are inherently fragile, basic regularization of the behavior during training led the robots to learn safe and effective movements while still performing in a dynamic and agile way -- well beyond what is intuitively expected from the robot. Indeed, in experiments, they walked 181% faster, turned 302% faster, took 63% less time to get up, and kicked a ball 34% faster than a scripted baseline, while efficiently combining the skills to achieve the longer term objectives.
Submitted 11 April, 2024; v1 submitted 26 April, 2023;
originally announced April 2023.
-
Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors
Authors:
Steven Bohez,
Saran Tunyasuvunakool,
Philemon Brakel,
Fereshteh Sadeghi,
Leonard Hasenclever,
Yuval Tassa,
Emilio Parisotto,
Jan Humplik,
Tuomas Haarnoja,
Roland Hafner,
Markus Wulfmeier,
Michael Neunert,
Ben Moran,
Noah Siegel,
Andrea Huber,
Francesco Romano,
Nathan Batchelor,
Federico Casarini,
Josh Merel,
Raia Hadsell,
Nicolas Heess
Abstract:
We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots. Our approach builds upon previous work on imitating human or dog Motion Capture (MoCap) data to learn a movement skill module. Once learned, this skill module can be reused for complex downstream tasks. Importantly, due to the prior imposed by the MoCap data, our approach does not require extensive reward engineering to produce sensible and natural looking behavior at the time of reuse. This makes it easy to create well-regularized, task-oriented controllers that are suitable for deployment on real robots. We demonstrate how our skill module can be used for imitation, and train controllable walking and ball dribbling policies for both the ANYmal quadruped and OP3 humanoid. These policies are then deployed on hardware via zero-shot simulation-to-reality transfer. Accompanying videos are available at https://bit.ly/robot-npmp.
Submitted 31 March, 2022;
originally announced March 2022.