-
DextrAH-G: Pixels-to-Action Dexterous Arm-Hand Grasping with Geometric Fabrics
Authors:
Tyler Ga Wei Lum,
Martin Matak,
Viktor Makoviychuk,
Ankur Handa,
Arthur Allshire,
Tucker Hermans,
Nathan D. Ratliff,
Karl Van Wyk
Abstract:
A pivotal challenge in robotics is achieving fast, safe, and robust dexterous grasping across a diverse range of objects, an important goal within industrial applications. However, existing methods often have very limited speed, dexterity, and generality, along with limited or no hardware safety guarantees. In this work, we introduce DextrAH-G, a depth-based dexterous grasping policy trained entirely in simulation that combines reinforcement learning, geometric fabrics, and teacher-student distillation. We address key challenges in joint arm-hand policy learning, such as high-dimensional observation and action spaces, the sim2real gap, collision avoidance, and hardware constraints. DextrAH-G enables a 23 motor arm-hand robot to safely and continuously grasp and transport a large variety of objects at high speed using multi-modal inputs including depth images, allowing generalization across object geometry. Videos at https://sites.google.com/view/dextrah-g.
Submitted 3 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
Geometric Fabrics: a Safe Guiding Medium for Policy Learning
Authors:
Karl Van Wyk,
Ankur Handa,
Viktor Makoviychuk,
Yijie Guo,
Arthur Allshire,
Nathan D. Ratliff
Abstract:
Robotics policies are always subjected to complex, second order dynamics that entangle their actions with resulting states. In reinforcement learning (RL) contexts, policies have the burden of deciphering these complicated interactions over massive amounts of experience and complex reward functions to learn how to accomplish tasks. Moreover, policies typically issue actions directly to controllers like Operational Space Control (OSC) or joint PD control, which induces straightline motion towards these action targets in task or joint space. However, straightline motion in these spaces for the most part does not capture the rich, nonlinear behavior our robots need to exhibit, shifting the burden of discovering these behaviors more completely to the agent. Unlike these simpler controllers, geometric fabrics capture a much richer and desirable set of behaviors via artificial, second order dynamics grounded in nonlinear geometry. These artificial dynamics shift the uncontrolled dynamics of a robot via an appropriate control law to form behavioral dynamics. Behavioral dynamics unlock a new action space and safe, guiding behavior over which RL policies are trained. Behavioral dynamics enable bang-bang-like RL policy actions that are still safe for real robots, simplify reward engineering, and help sequence real-world, high-performance policies. We describe the framework more generally and create a specific instantiation for the problem of dexterous, in-hand reorientation of a cube by a highly actuated robot hand.
Submitted 3 May, 2024;
originally announced May 2024.
-
Symmetry Considerations for Learning Task Symmetric Robot Policies
Authors:
Mayank Mittal,
Nikita Rudin,
Victor Klemm,
Arthur Allshire,
Marco Hutter
Abstract:
Symmetry is a fundamental aspect of many real-world robotic tasks. However, current deep reinforcement learning (DRL) approaches can seldom harness and exploit symmetry effectively. Often, the learned behaviors fail to achieve the desired transformation invariances and suffer from motion artifacts. For instance, a quadruped may exhibit different gaits when commanded to move forward or backward, even though it is symmetrical about its torso. This issue becomes further pronounced in high-dimensional or complex environments, where DRL methods are prone to local optima and fail to explore regions of the state space equally. Past methods on encouraging symmetry for robotic tasks have studied this topic mainly in a single-task setting, where symmetry usually refers to symmetry in the motion, such as the gait patterns. In this paper, we revisit this topic for goal-conditioned tasks in robotics, where symmetry lies mainly in task execution and not necessarily in the learned motions themselves. In particular, we investigate two approaches to incorporate symmetry invariance into DRL -- data augmentation and mirror loss function. We provide a theoretical foundation for using augmented samples in an on-policy setting. Based on this, we show that the corresponding approach achieves faster convergence and improves the learned behaviors in various challenging robotic tasks, from climbing boxes with a quadruped to dexterous manipulation.
Submitted 7 March, 2024;
originally announced March 2024.
-
Real Robot Challenge 2022: Learning Dexterous Manipulation from Offline Data in the Real World
Authors:
Nico Gürtler,
Felix Widmaier,
Cansu Sancaktar,
Sebastian Blaes,
Pavel Kolev,
Stefan Bauer,
Manuel Wüthrich,
Markus Wulfmeier,
Martin Riedmiller,
Arthur Allshire,
Qiang Wang,
Robert McCarthy,
Hangyeol Kim,
Jongchan Baek,
Wookyong Kwon,
Shanliang Qian,
Yasunori Toshimitsu,
Mike Yan Michelis,
Amirhossein Kazemipour,
Arman Raayatsanati,
Hehui Zheng,
Barnabas Gavin Cangan,
Bernhard Schölkopf,
Georg Martius
Abstract:
Experimentation on real robots is demanding in terms of time and costs. For this reason, a large part of the reinforcement learning (RL) community uses simulators to develop and benchmark algorithms. However, insights gained in simulation do not necessarily translate to real robots, in particular for tasks involving complex interactions with the environment. The Real Robot Challenge 2022 therefore served as a bridge between the RL and robotics communities by allowing participants to experiment remotely with a real robot - as easily as in simulation.
In the last years, offline reinforcement learning has matured into a promising paradigm for learning from pre-collected datasets, alleviating the reliance on expensive online interactions. We therefore asked the participants to learn two dexterous manipulation tasks involving pushing, grasping, and in-hand orientation from provided real-robot datasets. An extensive software documentation and an initial stage based on a simulation of the real set-up made the competition particularly accessible. By giving each team plenty of access budget to evaluate their offline-learned policies on a cluster of seven identical real TriFinger platforms, we organized an exciting competition for machine learners and roboticists alike.
In this work we state the rules of the competition, present the methods used by the winning teams and compare their results with a benchmark of state-of-the-art offline RL algorithms on the challenge datasets.
Submitted 24 November, 2023; v1 submitted 15 August, 2023;
originally announced August 2023.
-
DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with Population Based Training
Authors:
Aleksei Petrenko,
Arthur Allshire,
Gavriel State,
Ankur Handa,
Viktor Makoviychuk
Abstract:
In this work, we propose algorithms and methods that enable learning dexterous object manipulation using simulated one- or two-armed robots equipped with multi-fingered hand end-effectors. Using a parallel GPU-accelerated physics simulator (Isaac Gym), we implement challenging tasks for these robots, including regrasping, grasp-and-throw, and object reorientation. To solve these problems we introduce a decentralized Population-Based Training (PBT) algorithm that allows us to massively amplify the exploration capabilities of deep reinforcement learning. We find that this method significantly outperforms regular end-to-end learning and is able to discover robust control policies in challenging tasks. Video demonstrations of learned behaviors and the code can be found at https://sites.google.com/view/dexpbt
Submitted 20 May, 2023;
originally announced May 2023.
-
DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality
Authors:
Ankur Handa,
Arthur Allshire,
Viktor Makoviychuk,
Aleksei Petrenko,
Ritvik Singh,
Jingzhou Liu,
Denys Makoviichuk,
Karl Van Wyk,
Alexander Zhurkevich,
Balakumar Sundaralingam,
Yashraj Narang,
Jean-Francois Lafleche,
Dieter Fox,
Gavriel State
Abstract:
Recent work has demonstrated the ability of deep reinforcement learning (RL) algorithms to learn complex robotic behaviours in simulation, including in the domain of multi-fingered manipulation. However, such models can be challenging to transfer to the real world due to the gap between simulation and reality. In this paper, we present our techniques to train a) a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand and b) a robust pose estimator suitable for providing reliable real-time information on the state of the object being manipulated. Our policies are trained to adapt to a wide range of conditions in simulation. Consequently, our vision-based policies significantly outperform the best vision policies in the literature on the same reorientation task and are competitive with policies that are given privileged state information via motion capture systems. Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups, and in our case, with the Allegro Hand and Isaac Gym GPU-based simulation. Furthermore, it opens up possibilities for researchers to achieve such results with commonly-available, affordable robot hands and cameras. Videos of the resulting policy and supplementary information, including experiments and demos, can be found at https://dextreme.org/
Submitted 2 January, 2024; v1 submitted 24 October, 2022;
originally announced October 2022.
-
Real Robot Challenge: A Robotics Competition in the Cloud
Authors:
Stefan Bauer,
Felix Widmaier,
Manuel Wüthrich,
Annika Buchholz,
Sebastian Stark,
Anirudh Goyal,
Thomas Steinbrenner,
Joel Akpo,
Shruti Joshi,
Vincent Berenz,
Vaibhav Agrawal,
Niklas Funk,
Julen Urain De Jesus,
Jan Peters,
Joe Watson,
Claire Chen,
Krishnan Srinivasan,
Junwu Zhang,
Jeffrey Zhang,
Matthew R. Walter,
Rishabh Madan,
Charles Schaff,
Takahiro Maeda,
Takuma Yoneda,
Denis Yarats
, et al. (17 additional authors not shown)
Abstract:
Dexterous manipulation remains an open problem in robotics. To coordinate efforts of the research community towards tackling this problem, we propose a shared benchmark. We designed and built robotic platforms that are hosted at MPI for Intelligent Systems and can be accessed remotely. Each platform consists of three robotic fingers that are capable of dexterous object manipulation. Users are able to control the platforms remotely by submitting code that is executed automatically, akin to a computational cluster. Using this setup, i) we host robotics competitions, where teams from anywhere in the world access our platforms to tackle challenging tasks ii) we publish the datasets collected during these competitions (consisting of hundreds of robot hours), and iii) we give researchers access to these platforms for their own projects.
Submitted 10 June, 2022; v1 submitted 22 September, 2021;
originally announced September 2021.
-
Isaac Gym: High Performance GPU-Based Physics Simulation For Robot Learning
Authors:
Viktor Makoviychuk,
Lukasz Wawrzyniak,
Yunrong Guo,
Michelle Lu,
Kier Storey,
Miles Macklin,
David Hoeller,
Nikita Rudin,
Arthur Allshire,
Ankur Handa,
Gavriel State
Abstract:
Isaac Gym offers a high performance learning platform to train policies for a wide variety of robotics tasks directly on GPU. Both physics simulation and the neural network policy training reside on GPU and communicate by directly passing data from physics buffers to PyTorch tensors without ever going through any CPU bottlenecks. This leads to blazing fast training times for complex robotics tasks on a single GPU with 2-3 orders of magnitude improvements compared to conventional RL training that uses a CPU based simulator and GPU for neural networks. We host the results and videos at https://sites.google.com/view/isaacgym-nvidia and Isaac Gym can be downloaded at https://developer.nvidia.com/isaac-gym.
Submitted 25 August, 2021; v1 submitted 23 August, 2021;
originally announced August 2021.
-
Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World TriFinger
Authors:
Arthur Allshire,
Mayank Mittal,
Varun Lodaya,
Viktor Makoviychuk,
Denys Makoviichuk,
Felix Widmaier,
Manuel Wüthrich,
Stefan Bauer,
Ankur Handa,
Animesh Garg
Abstract:
We present a system for learning a challenging dexterous manipulation task involving moving a cube to an arbitrary 6-DoF pose with only 3-fingers trained with NVIDIA's IsaacGym simulator. We show empirical benefits, both in simulation and sim-to-real transfer, of using keypoints as opposed to position+quaternion representations for the object pose in 6-DoF for policy observations and in reward calculation to train a model-free reinforcement learning agent. By utilizing domain randomization strategies along with the keypoint representation of the pose of the manipulated object, we achieve a high success rate of 83% on a remote TriFinger system maintained by the organizers of the Real Robot Challenge. With the aim of assisting further research in learning in-hand manipulation, we make the codebase of our system, along with trained checkpoints that come with billions of steps of experience available, at https://s2r2-ig.github.io
Submitted 20 October, 2022; v1 submitted 22 August, 2021;
originally announced August 2021.
-
LASER: Learning a Latent Action Space for Efficient Reinforcement Learning
Authors:
Arthur Allshire,
Roberto Martín-Martín,
Charles Lin,
Shawn Manuel,
Silvio Savarese,
Animesh Garg
Abstract:
The process of learning a manipulation task depends strongly on the action space used for exploration: posed in the incorrect action space, solving a task with reinforcement learning can be drastically inefficient. Additionally, similar tasks or instances of the same task family impose latent manifold constraints on the most effective action space: the task family can be best solved with actions in a manifold of the entire action space of the robot. Combining these insights we present LASER, a method to learn latent action spaces for efficient reinforcement learning. LASER factorizes the learning problem into two sub-problems, namely action space learning and policy learning in the new action space. It leverages data from similar manipulation task instances, either from an offline expert or online during policy learning, and learns from these trajectories a mapping from the original to a latent action space. LASER is trained as a variational encoder-decoder model to map raw actions into a disentangled latent action space while maintaining action reconstruction and latent space dynamic consistency. We evaluate LASER on two contact-rich robotic tasks in simulation, and analyze the benefit of policy learning in the generated latent action space. We show improved sample efficiency compared to the original action space from better alignment of the action space to the task space, as we observe with visualizations of the learned action space manifold. Additional details: https://www.pair.toronto.edu/laser
Submitted 30 March, 2021; v1 submitted 29 March, 2021;
originally announced March 2021.