Search | arXiv e-print repository

Barkour: Benchmarking Animal-level Agility with Quadruped Robots

Authors: Ken Caluwaerts, Atil Iscen, J. Chase Kew, Wenhao Yu, Tingnan Zhang, Daniel Freeman, Kuang-Huei Lee, Lisa Lee, Stefano Saliceti, Vincent Zhuang, Nathan Batchelor, Steven Bohez, Federico Casarini, Jose Enrique Chen, Omar Cortes, Erwin Coumans, Adil Dostmohamed, Gabriel Dulac-Arnold, Alejandro Escontrela, Erik Frey, Roland Hafner, Deepali Jain, Bauyrjan Jyenis, Yuheng Kuang, Edward Lee , et al. (19 additional authors not shown)

Abstract: Animals have evolved various agile locomotion strategies, such as sprinting, lea**, and jum**. There is a growing interest in develo** legged robots that move like their biological counterparts and show various agile skills to navigate complex environments quickly. Despite the interest, the field lacks systematic benchmarks to measure the performance of control policies and hardware in agili… ▽ More Animals have evolved various agile locomotion strategies, such as sprinting, lea**, and jum**. There is a growing interest in develo** legged robots that move like their biological counterparts and show various agile skills to navigate complex environments quickly. Despite the interest, the field lacks systematic benchmarks to measure the performance of control policies and hardware in agility. We introduce the Barkour benchmark, an obstacle course to quantify agility for legged robots. Inspired by dog agility competitions, it consists of diverse obstacles and a time based scoring mechanism. This encourages researchers to develop controllers that not only move fast, but do so in a controllable and versatile way. To set strong baselines, we present two methods for tackling the benchmark. In the first approach, we train specialist locomotion skills using on-policy reinforcement learning methods and combine them with a high-level navigation controller. In the second approach, we distill the specialist skills into a Transformer-based generalist locomotion policy, named Locomotion-Transformer, that can handle various terrains and adjust the robot's gait based on the perceived environment and robot states. Using a custom-built quadruped robot, we demonstrate that our method can complete the course at half the speed of a dog. We hope that our work represents a step towards creating controllers that enable robots to reach animal-level agility. △ Less

Submitted 23 May, 2023; originally announced May 2023.

Comments: 17 pages, 19 figures

arXiv:2304.09834 [pdf, other]

Learning and Adapting Agile Locomotion Skills by Transferring Experience

Authors: Laura Smith, J. Chase Kew, Tianyu Li, Linda Luu, Xue Bin Peng, Sehoon Ha, Jie Tan, Sergey Levine

Abstract: Legged robots have enormous potential in their range of capabilities, from navigating unstructured terrains to high-speed running. However, designing robust controllers for highly agile dynamic motions remains a substantial challenge for roboticists. Reinforcement learning (RL) offers a promising data-driven approach for automatically training such controllers. However, exploration in these high-d… ▽ More Legged robots have enormous potential in their range of capabilities, from navigating unstructured terrains to high-speed running. However, designing robust controllers for highly agile dynamic motions remains a substantial challenge for roboticists. Reinforcement learning (RL) offers a promising data-driven approach for automatically training such controllers. However, exploration in these high-dimensional, underactuated systems remains a significant hurdle for enabling legged robots to learn performant, naturalistic, and versatile agility skills. We propose a framework for training complex robotic skills by transferring experience from existing controllers to jumpstart learning new tasks. To leverage controllers we can acquire in practice, we design this framework to be flexible in terms of their source -- that is, the controllers may have been optimized for a different objective under different dynamics, or may require different knowledge of the surroundings -- and thus may be highly suboptimal for the target task. We show that our method enables learning complex agile jum** behaviors, navigating to goal locations while walking on hind legs, and adapting to new environments. We also demonstrate that the agile behaviors learned in this way are graceful and safe enough to deploy in the real world. △ Less

Submitted 19 April, 2023; originally announced April 2023.

Comments: Project website: https://sites.google.com/berkeley.edu/twirl

arXiv:2110.05457 [pdf, other]

Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

Authors: Laura Smith, J. Chase Kew, Xue Bin Peng, Sehoon Ha, Jie Tan, Sergey Levine

Abstract: Legged robots are physically capable of traversing a wide range of challenging environments, but designing controllers that are sufficiently robust to handle this diversity has been a long-standing challenge in robotics. Reinforcement learning presents an appealing approach for automating the controller design process and has been able to produce remarkably robust controllers when trained in a sui… ▽ More Legged robots are physically capable of traversing a wide range of challenging environments, but designing controllers that are sufficiently robust to handle this diversity has been a long-standing challenge in robotics. Reinforcement learning presents an appealing approach for automating the controller design process and has been able to produce remarkably robust controllers when trained in a suitable range of environments. However, it is difficult to predict all likely conditions the robot will encounter during deployment and enumerate them at training-time. What if instead of training controllers that are robust enough to handle any eventuality, we enable the robot to continually learn in any setting it finds itself in? This kind of real-world reinforcement learning poses a number of challenges, including efficiency, safety, and autonomy. To address these challenges, we propose a practical robot reinforcement learning system for fine-tuning locomotion policies in the real world. We demonstrate that a modest amount of real-world training can substantially improve performance during deployment, and this enables a real A1 quadrupedal robot to autonomously fine-tune multiple locomotion skills in a range of environments, including an outdoor lawn and a variety of indoor terrains. △ Less

Submitted 11 October, 2021; originally announced October 2021.

Comments: Project website: https://sites.google.com/berkeley.edu/fine-tuning-locomotion

arXiv:2003.06906 [pdf, other]

Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous

Authors: Rose E. Wang, J. Chase Kew, Dennis Lee, Tsang-Wei Edward Lee, Tingnan Zhang, Brian Ichter, Jie Tan, Aleksandra Faust

Abstract: Collaboration requires agents to align their goals on the fly. Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans. We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous. Starting with pretrained, single-agent point to p… ▽ More Collaboration requires agents to align their goals on the fly. Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans. We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous. Starting with pretrained, single-agent point to point navigation policies and using noisy, high-dimensional sensor inputs like lidar, we first learn via self-supervision motion predictions of all agents on the team. Next, HPP uses the prediction models to propose and evaluate navigation subgoals for completing the rendezvous task without explicit communication among agents. We evaluate HPP in a suite of unseen environments, with increasing complexity and numbers of obstacles. We show that HPP outperforms alternative reinforcement learning, path planning, and heuristic-based baselines on challenging, unseen environments. Experiments in the real world demonstrate successful transfer of the prediction models from sim to real world without any additional fine-tuning. Altogether, HPP removes the need for a centralized operator in multiagent systems by combining model-based RL and inference methods, enabling agents to dynamically align plans. △ Less

Submitted 9 November, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

Comments: CoRL 2020. The video is available at: https://youtu.be/-ydXHUtPzWE

arXiv:1910.05917 [pdf, other]

Neural Collision Clearance Estimator for Batched Motion Planning

Authors: J. Chase Kew, Brian Ichter, Maryam Bandari, Tsang-Wei Edward Lee, Aleksandra Faust

Abstract: We present a neural network collision checking heuristic, ClearanceNet, and a planning algorithm, CN-RRT. ClearanceNet learns to predict separation distance (minimum distance between robot and workspace) with respect to a workspace. CN-RRT then efficiently computes a motion plan by leveraging three key features of ClearanceNet. First, CN-RRT explores the space by expanding multiple nodes at the sa… ▽ More We present a neural network collision checking heuristic, ClearanceNet, and a planning algorithm, CN-RRT. ClearanceNet learns to predict separation distance (minimum distance between robot and workspace) with respect to a workspace. CN-RRT then efficiently computes a motion plan by leveraging three key features of ClearanceNet. First, CN-RRT explores the space by expanding multiple nodes at the same time, processing batches of thousands of collision checks. Second, CN-RRT adaptively relaxes its clearance requirements for more difficult problems. Third, to repair errors, CN-RRT shifts its nodes in the direction of ClearanceNet's gradient and repairs any residual errors with a traditional RRT, thus maintaining theoretical probabilistic completeness guarantees. In configuration spaces with up to 30 degrees of freedom, ClearanceNet achieves 845x speedup over traditional collision detection methods, while CN-RRT accelerates motion planning by up to 42% over a baseline and finds paths up to 36% more efficient. Experiments on an 11 degree of freedom robot in a cluttered environment confirm the method's feasibility on real robots. △ Less

Submitted 14 July, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

arXiv:1902.09458 [pdf, other]

Long-Range Indoor Navigation with PRM-RL

Authors: Anthony Francis, Aleksandra Faust, Hao-Tien Lewis Chiang, Jasmine Hsu, J. Chase Kew, Marek Fiser, Tsang-Wei Edward Lee

Abstract: Long-range indoor navigation requires guiding robots with noisy sensors and controls through cluttered environments along paths that span a variety of buildings. We achieve this with PRM-RL, a hierarchical robot navigation method in which reinforcement learning agents that map noisy sensors to robot controls learn to solve short-range obstacle avoidance tasks, and then sampling-based planners map… ▽ More Long-range indoor navigation requires guiding robots with noisy sensors and controls through cluttered environments along paths that span a variety of buildings. We achieve this with PRM-RL, a hierarchical robot navigation method in which reinforcement learning agents that map noisy sensors to robot controls learn to solve short-range obstacle avoidance tasks, and then sampling-based planners map where these agents can reliably navigate in simulation; these roadmaps and agents are then deployed on robots, guiding them along the shortest path where the agents are likely to succeed. Here we use Probabilistic Roadmaps (PRMs) as the sampling-based planner, and AutoRL as the reinforcement learning method in the indoor navigation context. We evaluate the method in simulation for kinematic differential drive and kinodynamic car-like robots in several environments, and on differential-drive robots at three physical sites. Our results show PRM-RL with AutoRL is more successful than several baselines, is robust to noise, and can guide robots over hundreds of meters in the face of noise and obstacles in both simulation and on robots, including over 5.8 kilometers of physical robot navigation. Video: https://youtu.be/xN-OWX5gKvQ △ Less

Submitted 22 February, 2020; v1 submitted 25 February, 2019; originally announced February 2019.

Comments: Accepted to T-RO

arXiv:1805.06150 [pdf, other]

FollowNet: Robot Navigation by Following Natural Language Directions with Deep Reinforcement Learning

Authors: Pararth Shah, Marek Fiser, Aleksandra Faust, J. Chase Kew, Dilek Hakkani-Tur

Abstract: Understanding and following directions provided by humans can enable robots to navigate effectively in unknown situations. We present FollowNet, an end-to-end differentiable neural architecture for learning multi-modal navigation policies. FollowNet maps natural language instructions as well as visual and depth inputs to locomotion primitives. FollowNet processes instructions using an attention me… ▽ More Understanding and following directions provided by humans can enable robots to navigate effectively in unknown situations. We present FollowNet, an end-to-end differentiable neural architecture for learning multi-modal navigation policies. FollowNet maps natural language instructions as well as visual and depth inputs to locomotion primitives. FollowNet processes instructions using an attention mechanism conditioned on its visual and depth input to focus on the relevant parts of the command while performing the navigation task. Deep reinforcement learning (RL) a sparse reward learns simultaneously the state representation, the attention function, and control policies. We evaluate our agent on a dataset of complex natural language directions that guide the agent through a rich and realistic dataset of simulated homes. We show that the FollowNet agent learns to execute previously unseen instructions described with a similar vocabulary, and successfully navigates along paths not encountered during training. The agent shows 30% improvement over a baseline model without the attention mechanism, with 52% success rate at novel instructions. △ Less

Submitted 16 May, 2018; originally announced May 2018.

Comments: 7 pages, 8 figures

Journal ref: Third Workshop in Machine Learning in the Planning and Control of Robot Motion at ICRA, 2018

Showing 1–7 of 7 results for author: Kew, J C