Showing 1–36 of 36 results for author: Peng, X B

  1. arXiv:2406.06615

    cs.CL cs.AI cs.LG cs.RO

    Language Guided Skill Discovery

    Authors: Seungeun Rho, Laura Smith, Tianyu Li, Sergey Levine, Xue Bin Peng, Sehoon Ha

    Abstract: Skill discovery methods enable agents to learn diverse emergent behaviors without explicit rewards. To make learned skills useful for unknown downstream tasks, obtaining a semantically diverse repertoire of skills is essential. While some approaches introduce a discriminator to distinguish skills and others aim to increase state coverage, no existing work directly addresses the "semantic diversity…

    Submitted 7 June, 2024; originally announced June 2024.

  2. arXiv:2405.11126

    cs.CV cs.GR cs.LG

    Flexible Motion In-betweening with Diffusion Models

    Authors: Setareh Cohan, Guy Tevet, Daniele Reda, Xue Bin Peng, Michiel van de Panne

    Abstract: Motion in-betweening, a fundamental task in character animation, consists of generating motion sequences that plausibly interpolate user-provided keyframe constraints. It has long been recognized as a labor-intensive and challenging process. We investigate the potential of diffusion models in generating diverse human motions guided by keyframes. Unlike previous inbetweening methods, we propose a s…

    Submitted 23 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: SIGGRAPH 2024. For project page and code, see https://setarehc.github.io/CondMDI/

  3. arXiv:2404.19264

    cs.RO

    DiffuseLoco: Real-Time Legged Locomotion Control with Diffusion from Offline Datasets

    Authors: Xiaoyu Huang, Yufeng Chi, Ruofeng Wang, Zhongyu Li, Xue Bin Peng, Sophia Shao, Borivoje Nikolic, Koushil Sreenath

    Abstract: This work introduces DiffuseLoco, a framework for training multi-skill diffusion-based policies for dynamic legged locomotion from offline datasets, enabling real-time control of diverse skills on robots in the real world. Offline learning at scale has led to breakthroughs in computer vision, natural language processing, and robotic manipulation domains. However, scaling up learning for legged rob…

    Submitted 30 April, 2024; originally announced April 2024.

  4. arXiv:2404.10685

    cs.CV cs.GR

    Generating Human Interaction Motions in Scenes with Text Control

    Authors: Hongwei Yi, Justus Thies, Michael J. Black, Xue Bin Peng, Davis Rempe

    Abstract: We present TeSMo, a method for text-controlled scene-aware motion generation based on denoising diffusion models. Previous text-to-motion methods focus on characters in isolation without considering scenes due to the limited availability of datasets that include motion, text descriptions, and interactive scenes. Our approach begins with pre-training a scene-agnostic text-to-motion diffusion model,…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Project Page: https://research.nvidia.com/labs/toronto-ai/tesmo/

  5. arXiv:2401.16889

    cs.RO cs.AI eess.SY

    Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control

    Authors: Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

    Abstract: This paper presents a comprehensive study on using deep reinforcement learning (RL) to create dynamic locomotion controllers for bipedal robots. Going beyond focusing on a single locomotion skill, we develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing. Our RL-based controller incorporates a n…

    Submitted 30 January, 2024; originally announced January 2024.

  6. arXiv:2401.08559

    cs.CV cs.GR cs.LG

    Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation

    Authors: Mathis Petrovich, Or Litany, Umar Iqbal, Michael J. Black, Gül Varol, Xue Bin Peng, Davis Rempe

    Abstract: Recent advances in generative modeling have led to promising progress on synthesizing 3D human motion from text, with methods that can generate character animations from short prompts and specified durations. However, using a single text prompt as input lacks the fine-grained control needed by animators, such as composing multiple actions and defining precise durations for parts of the motion. To…

    Submitted 24 May, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: CVPR 2024, HuMoGen Workshop

  7. arXiv:2312.04535

    cs.LG cs.RO

    Trajeglish: Traffic Modeling as Next-Token Prediction

    Authors: Jonah Philion, Xue Bin Peng, Sanja Fidler

    Abstract: A longstanding challenge for self-driving development is simulating dynamic driving scenarios seeded from recorded driving logs. In pursuit of this functionality, we apply tools from discrete sequence modeling to model how vehicles, pedestrians and cyclists interact in driving scenarios. Using a simple data-driven tokenization scheme, we discretize trajectories to centimeter-level resolution using…

    Submitted 14 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: ICLR 2024

  8. arXiv:2305.14343

    cs.LG cs.AI cs.CV

    Video Prediction Models as Rewards for Reinforcement Learning

    Authors: Alejandro Escontrela, Ademi Adeniji, Wilson Yan, Ajay Jain, Xue Bin Peng, Ken Goldberg, Youngwoon Lee, Danijar Hafner, Pieter Abbeel

    Abstract: Specifying reward signals that allow agents to learn complex behaviors is a long-standing challenge in reinforcement learning. A promising approach is to extract preferences for behaviors from unlabeled videos, which are widely available on the internet. We present Video Prediction Rewards (VIPER), an algorithm that leverages pretrained video prediction models as action-free reward signals for rei…

    Submitted 30 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 22 pages, 18 figures, 4 tables. under review

  9. arXiv:2305.02195

    cs.CV cs.AI cs.RO

    CALM: Conditional Adversarial Latent Models for Directable Virtual Characters

    Authors: Chen Tessler, Yoni Kasten, Yunrong Guo, Shie Mannor, Gal Chechik, Xue Bin Peng

    Abstract: In this work, we present Conditional Adversarial Latent Models (CALM), an approach for generating diverse and directable behaviors for user-controlled interactive virtual characters. Using imitation learning, CALM learns a representation of movement that captures the complexity and diversity of human motion, and enables direct control over character movements. The approach jointly learns a control…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: Accepted to SIGGRAPH 2023

  10. arXiv:2304.09834

    cs.RO cs.AI

    Learning and Adapting Agile Locomotion Skills by Transferring Experience

    Authors: Laura Smith, J. Chase Kew, Tianyu Li, Linda Luu, Xue Bin Peng, Sehoon Ha, Jie Tan, Sergey Levine

    Abstract: Legged robots have enormous potential in their range of capabilities, from navigating unstructured terrains to high-speed running. However, designing robust controllers for highly agile dynamic motions remains a substantial challenge for roboticists. Reinforcement learning (RL) offers a promising data-driven approach for automatically training such controllers. However, exploration in these high-d…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: Project website: https://sites.google.com/berkeley.edu/twirl

  11. arXiv:2304.04150

    cs.RO cs.AI

    RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning

    Authors: Kevin Zakka, Philipp Wu, Laura Smith, Nimrod Gileadi, Taylor Howell, Xue Bin Peng, Sumeet Singh, Yuval Tassa, Pete Florence, Andy Zeng, Pieter Abbeel

    Abstract: Replicating human-like dexterity in robot hands represents one of the largest open problems in robotics. Reinforcement learning is a promising approach that has achieved impressive progress in the last few years; however, the class of problems it has typically addressed corresponds to a rather narrow definition of dexterity as compared to human capabilities. To address this gap, we investigate pia…

    Submitted 3 December, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

    Comments: Accepted to the Conference on Robot Learning (CORL) 2023

  12. arXiv:2304.01893

    cs.CV cs.GR cs.LG

    Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion

    Authors: Davis Rempe, Zhengyi Luo, Xue Bin Peng, Ye Yuan, Kris Kitani, Karsten Kreis, Sanja Fidler, Or Litany

    Abstract: We introduce a method for generating realistic pedestrian trajectories and full-body animations that can be controlled to meet user-defined goals. We draw on recent advances in guided diffusion modeling to achieve test-time controllability of trajectories, which is normally only associated with rule-based systems. Our guided diffusion model allows users to constrain trajectories through target way…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: Conference on Computer Vision and Pattern Recognition (CVPR) 2023

  13. arXiv:2302.09450

    cs.RO cs.AI eess.SY

    Robust and Versatile Bipedal Jumping Control through Reinforcement Learning

    Authors: Zhongyu Li, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

    Abstract: This work aims to push the limits of agility for bipedal robots by enabling a torque-controlled bipedal robot to perform robust and versatile dynamic jumps in the real world. We present a reinforcement learning framework for training a robot to accomplish a large variety of jumping tasks, such as jumping to different locations and directions. To improve performance on these challenging tasks, we d…

    Submitted 31 May, 2023; v1 submitted 18 February, 2023; originally announced February 2023.

    Comments: Accepted in Robotics: Science and Systems 2023 (RSS 2023). The accompanying video is at https://youtu.be/aAPSZ2QFB-E

  14. arXiv:2302.00883

    cs.GR cs.AI cs.LG

    Synthesizing Physical Character-Scene Interactions

    Authors: Mohamed Hassan, Yunrong Guo, Tingwu Wang, Michael Black, Sanja Fidler, Xue Bin Peng

    Abstract: Movement is how people interact with and affect their environment. For realistic character animation, it is necessary to synthesize such interactions between virtual characters and their surroundings. Despite recent progress in character animation using machine learning, most systems focus on controlling an agent's movements in fairly simple and homogeneous environments, with limited interactions…

    Submitted 2 February, 2023; originally announced February 2023.

  15. arXiv:2301.13868

    cs.LG cs.AI cs.CL cs.GR

    PADL: Language-Directed Physics-Based Character Control

    Authors: Jordan Juravsky, Yunrong Guo, Sanja Fidler, Xue Bin Peng

    Abstract: Developing systems that can synthesize natural and life-like motions for simulated characters has long been a focus for computer animation. But in order for these systems to be useful for downstream applications, they need not only produce high-quality motions, but must also provide an accessible and versatile interface through which users can direct a character's behaviors. Natural language provi…

    Submitted 31 January, 2023; originally announced January 2023.

  16. arXiv:2210.04435

    cs.RO cs.AI eess.SY

    Creating a Dynamic Quadrupedal Robotic Goalkeeper with Reinforcement Learning

    Authors: Xiaoyu Huang, Zhongyu Li, Yanzhen Xiang, Yiming Ni, Yufeng Chi, Yunhao Li, Lizhi Yang, Xue Bin Peng, Koushil Sreenath

    Abstract: We present a reinforcement learning (RL) framework that enables quadrupedal robots to perform soccer goalkeeping tasks in the real world. Soccer goalkeeping using quadrupeds is a challenging problem, that combines highly dynamic locomotion with precise and fast non-prehensile object (ball) manipulation. The robot needs to react to and intercept a potentially flying ball using dynamic locomotion ma…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: First two authors contributed equally. Accompanying video is at https://youtu.be/iX6OgG67-ZQ

  17. arXiv:2209.05309

    cs.RO cs.LG

    GenLoco: Generalized Locomotion Controllers for Quadrupedal Robots

    Authors: Gilbert Feng, Hongbo Zhang, Zhongyu Li, Xue Bin Peng, Bhuvan Basireddy, Linzhu Yue, Zhitao Song, Lizhi Yang, Yunhui Liu, Koushil Sreenath, Sergey Levine

    Abstract: Recent years have seen a surge in commercially-available and affordable quadrupedal robots, with many of these platforms being actively used in research and industry. As the availability of legged robots grows, so does the need for controllers that enable these robots to perform useful skills. However, most learning-based frameworks for controller development focus on training robot-specific contr…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: First two authors contributed equally

  18. arXiv:2208.01160

    cs.RO cs.AI eess.SY

    Hierarchical Reinforcement Learning for Precise Soccer Shooting Skills using a Quadrupedal Robot

    Authors: Yandong Ji, Zhongyu Li, Yinan Sun, Xue Bin Peng, Sergey Levine, Glen Berseth, Koushil Sreenath

    Abstract: We address the problem of enabling quadrupedal robots to perform precise shooting skills in the real world using reinforcement learning. Developing algorithms to enable a legged robot to shoot a soccer ball to a given target is a challenging problem that combines robot motion control and planning into one task. To solve this problem, we need to consider the dynamics limitation and motion stability…

    Submitted 1 August, 2022; originally announced August 2022.

    Comments: Accepted to 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2022)

  19. arXiv:2205.01906

    cs.GR cs.AI cs.LG

    ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters

    Authors: Xue Bin Peng, Yunrong Guo, Lina Halper, Sergey Levine, Sanja Fidler

    Abstract: The incredible feats of athleticism demonstrated by humans are made possible in part by a vast repertoire of general-purpose motor skills, acquired through years of practice and experience. These skills not only enable humans to perform complex tasks, but also provide powerful priors for guiding their behaviors when learning new tasks. This is in stark contrast to what is common practice in physic…

    Submitted 5 May, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

  20. arXiv:2203.15103

    cs.AI cs.RO

    Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions

    Authors: Alejandro Escontrela, Xue Bin Peng, Wenhao Yu, Tingnan Zhang, Atil Iscen, Ken Goldberg, Pieter Abbeel

    Abstract: Training a high-dimensional simulated agent with an under-specified reward function often leads the agent to learn physically infeasible strategies that are ineffective when deployed in the real world. To mitigate these unnatural behaviors, reinforcement learning practitioners often utilize complex reward functions that encourage physically plausible behaviors. However, a tedious labor-intensive t…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 8 pages, 6 figures, 3 tables

  21. arXiv:2202.00161

    cs.LG cs.AI

    CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery

    Authors: Michael Laskin, Hao Liu, Xue Bin Peng, Denis Yarats, Aravind Rajeswaran, Pieter Abbeel

    Abstract: We introduce Contrastive Intrinsic Control (CIC), an algorithm for unsupervised skill discovery that maximizes the mutual information between state-transitions and latent skill vectors. CIC utilizes contrastive learning between state-transitions and skills to learn behavior embeddings and maximizes the entropy of these embeddings as an intrinsic reward to encourage behavioral diversity. We evaluat…

    Submitted 29 March, 2022; v1 submitted 31 January, 2022; originally announced February 2022.

    Comments: Project website: https://sites.google.com/view/cicrl/

  22. arXiv:2110.05457

    cs.RO

    Legged Robots that Keep on Learning: Fine-Tuning Locomotion Policies in the Real World

    Authors: Laura Smith, J. Chase Kew, Xue Bin Peng, Sehoon Ha, Jie Tan, Sergey Levine

    Abstract: Legged robots are physically capable of traversing a wide range of challenging environments, but designing controllers that are sufficiently robust to handle this diversity has been a long-standing challenge in robotics. Reinforcement learning presents an appealing approach for automating the controller design process and has been able to produce remarkably robust controllers when trained in a sui…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: Project website: https://sites.google.com/berkeley.edu/fine-tuning-locomotion

  23. AMP: Adversarial Motion Priors for Stylized Physics-Based Character Control

    Authors: Xue Bin Peng, Ze Ma, Pieter Abbeel, Sergey Levine, Angjoo Kanazawa

    Abstract: Synthesizing graceful and life-like behaviors for physically simulated characters has been a fundamental challenge in computer animation. Data-driven methods that leverage motion tracking are a prominent class of techniques for producing high fidelity motions for a wide range of behaviors. However, the effectiveness of these tracking-based methods often hinges on carefully designed objective funct…

    Submitted 12 May, 2022; v1 submitted 5 April, 2021; originally announced April 2021.

  24. arXiv:2103.14295

    cs.RO cs.AI cs.LG eess.SY

    Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots

    Authors: Zhongyu Li, Xuxin Cheng, Xue Bin Peng, Pieter Abbeel, Sergey Levine, Glen Berseth, Koushil Sreenath

    Abstract: Developing robust walking controllers for bipedal robots is a challenging endeavor. Traditional model-based locomotion controllers require simplifying assumptions and careful modelling; any small errors can result in unstable control. To address these challenges for bipedal locomotion, we present a model-free reinforcement learning framework for training robust locomotion policies in simulation, w…

    Submitted 26 March, 2021; originally announced March 2021.

    Comments: To appear on 2021 International Conference on Robotics and Automation (ICRA 2021)

  25. arXiv:2008.06043

    cs.LG cs.AI stat.ML

    Offline Meta-Reinforcement Learning with Advantage Weighting

    Authors: Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

    Abstract: This paper introduces the offline meta-reinforcement learning (offline meta-RL) problem setting and proposes an algorithm that performs well in this setting. Offline meta-RL is analogous to the widely successful supervised learning strategy of pre-training a model on a large batch of fixed, pre-collected data (possibly from various tasks) and fine-tuning the model to a new task with relatively lit…

    Submitted 21 July, 2021; v1 submitted 13 August, 2020; originally announced August 2020.

    Comments: ICML 2021; for code & project info, see http://sites.google.com/view/macaw-metarl

  26. arXiv:2004.00784

    cs.RO cs.LG

    Learning Agile Robotic Locomotion Skills by Imitating Animals

    Authors: Xue Bin Peng, Erwin Coumans, Tingnan Zhang, Tsang-Wei Lee, Jie Tan, Sergey Levine

    Abstract: Reproducing the diverse and agile locomotion skills of animals has been a longstanding challenge in robotics. While manually-designed controllers have been able to emulate many complex behaviors, building such controllers involves a time-consuming and difficult development process, often requiring substantial expertise of the nuances of each skill. Reinforcement learning provides an appealing alte…

    Submitted 20 July, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

  27. arXiv:1912.13465

    cs.LG stat.ML

    Reward-Conditioned Policies

    Authors: Aviral Kumar, Xue Bin Peng, Sergey Levine

    Abstract: Reinforcement learning offers the promise of automating the acquisition of complex behavioral skills. However, compared to commonly used and well-understood supervised learning methods, reinforcement learning algorithms can be brittle, difficult to use and tune, and sensitive to seemingly innocuous implementation decisions. In contrast, imitation learning utilizes standard and well-understood supe…

    Submitted 31 December, 2019; originally announced December 2019.

  28. arXiv:1910.00177

    cs.LG stat.ML

    Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning

    Authors: Xue Bin Peng, Aviral Kumar, Grace Zhang, Sergey Levine

    Abstract: In this paper, we aim to develop a simple and scalable reinforcement learning algorithm that uses standard supervised learning methods as subroutines. Our goal is an algorithm that utilizes only simple and convergent maximum likelihood loss functions, while also being able to leverage off-policy data. Our proposed approach, which we refer to as advantage-weighted regression (AWR), consists of two…

    Submitted 7 October, 2019; v1 submitted 30 September, 2019; originally announced October 2019.

  29. arXiv:1906.10667

    cs.LG cs.AI stat.ML

    Reinforcement Learning with Competitive Ensembles of Information-Constrained Primitives

    Authors: Anirudh Goyal, Shagun Sodhani, Jonathan Binas, Xue Bin Peng, Sergey Levine, Yoshua Bengio

    Abstract: Reinforcement learning agents that operate in diverse and complex environments can benefit from the structured decomposition of their behavior. Often, this is addressed in the context of hierarchical reinforcement learning, where the aim is to decompose a policy into lower-level primitives or options, and a higher-level meta-policy that triggers the appropriate behaviors for a given situation. How…

    Submitted 25 June, 2019; originally announced June 2019.

    Comments: Preprint, Under Review

  30. arXiv:1905.09808

    cs.LG stat.ML

    MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies

    Authors: Xue Bin Peng, Michael Chang, Grace Zhang, Pieter Abbeel, Sergey Levine

    Abstract: Humans are able to perform a myriad of sophisticated tasks by drawing upon skills acquired through prior experience. For autonomous agents to have this capability, they must be able to extract reusable skills from past experience that can be recombined in new ways for subsequent tasks. Furthermore, when controlling complex high-dimensional morphologies, such as humanoid bodies, tasks often require…

    Submitted 23 May, 2019; originally announced May 2019.

  31. arXiv:1810.03599

    cs.GR cs.CV cs.LG

    SFV: Reinforcement Learning of Physical Skills from Videos

    Authors: Xue Bin Peng, Angjoo Kanazawa, Jitendra Malik, Pieter Abbeel, Sergey Levine

    Abstract: Data-driven character animation based on motion capture can produce highly naturalistic behaviors and, when combined with physics simulation, can provide for natural procedural responses to physical perturbations, environmental changes, and morphological discrepancies. Motion capture remains the most popular source of motion data, but collecting mocap data typically requires heavily instrumented e…

    Submitted 15 October, 2018; v1 submitted 8 October, 2018; originally announced October 2018.

  32. arXiv:1810.00821

    cs.LG stat.ML

    Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow

    Authors: Xue Bin Peng, Angjoo Kanazawa, Sam Toyer, Pieter Abbeel, Sergey Levine

    Abstract: Adversarial learning methods have been proposed for a wide range of applications, but the training of adversarial models can be notoriously unstable. Effectively balancing the performance of the generator and discriminator is critical, since a discriminator that achieves very high accuracy will produce relatively uninformative gradients. In this work, we propose a simple and general technique to c…

    Submitted 24 August, 2020; v1 submitted 1 October, 2018; originally announced October 2018.

  33. arXiv:1804.06424

    cs.AI cs.RO

    Terrain RL Simulator

    Authors: Glen Berseth, Xue Bin Peng, Michiel van de Panne

    Abstract: We provide $89$ challenging simulation environments that range in difficulty. The difficulty of solving a task is linked not only to the number of dimensions in the action space but also to the size and shape of the distribution of configurations the agent experiences. Therefore, we are releasing a number of simulation environments that include randomly generated terrain. The library also provides…

    Submitted 17 April, 2018; originally announced April 2018.

    Comments: 10 pages

  34. arXiv:1804.02717

    cs.GR cs.AI cs.LG

    DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills

    Authors: Xue Bin Peng, Pieter Abbeel, Sergey Levine, Michiel van de Panne

    Abstract: A longstanding goal in character animation is to combine data-driven specification of behavior with a system that can execute a similar behavior in a physical simulation, thus enabling realistic responses to perturbations and environmental variation. We show that well-known reinforcement learning (RL) methods can be adapted to learn robust control policies capable of imitating a broad range of exa…

    Submitted 26 July, 2018; v1 submitted 8 April, 2018; originally announced April 2018.

  35. Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

    Authors: Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel

    Abstract: Simulations are attractive environments for training agents as they provide an abundant source of data and alleviate certain safety concerns during the training process. But the behaviours developed by agents in simulation are often specific to the characteristics of the simulator. Due to modeling error, strategies that are successful in simulation may not transfer to their real world counterparts…

    Submitted 2 March, 2018; v1 submitted 17 October, 2017; originally announced October 2017.

  36. arXiv:1611.01055

    cs.LG cs.GR cs.RO

    Learning Locomotion Skills Using DeepRL: Does the Choice of Action Space Matter?

    Authors: Xue Bin Peng, Michiel van de Panne

    Abstract: The use of deep reinforcement learning allows for high-dimensional state descriptors, but little is known about how the choice of action representation impacts the learning difficulty and the resulting performance. We compare the impact of four different action parameterizations (torques, muscle-activations, target joint angles, and target joint-angle velocities) in terms of learning time, policy…

    Submitted 3 November, 2016; originally announced November 2016.