Showing 1–16 of 16 results for author: Haarnoja, T

Searching in archive cs.
  1. arXiv:2405.02425  [pdf, other]

    cs.RO cs.AI

    Learning Robot Soccer from Egocentric Vision with Deep Reinforcement Learning

    Authors: Dhruva Tirumala, Markus Wulfmeier, Ben Moran, Sandy Huang, Jan Humplik, Guy Lever, Tuomas Haarnoja, Leonard Hasenclever, Arunkumar Byravan, Nathan Batchelor, Neil Sreendra, Kushal Patel, Marlon Gwira, Francesco Nori, Martin Riedmiller, Nicolas Heess

    Abstract: We apply multi-agent deep reinforcement learning (RL) to train end-to-end robot soccer policies with fully onboard computation and sensing via egocentric RGB vision. This setting reflects many challenges of real-world robotics, including active perception, agile full-body control, and long-horizon planning in a dynamic, partially-observable, multi-agent domain. We rely on large-scale, simulation-b…

    Submitted 3 May, 2024; originally announced May 2024.

  2. arXiv:2311.15951  [pdf, other]

    cs.LG cs.AI cs.RO

    Replay across Experiments: A Natural Extension of Off-Policy RL

    Authors: Dhruva Tirumala, Thomas Lampe, Jose Enrique Chen, Tuomas Haarnoja, Sandy Huang, Guy Lever, Ben Moran, Tim Hertweck, Leonard Hasenclever, Martin Riedmiller, Nicolas Heess, Markus Wulfmeier

    Abstract: Replaying data is a principal mechanism underlying the stability and data efficiency of off-policy reinforcement learning (RL). We present an effective yet simple framework to extend the use of replays across multiple experiments, minimally adapting the RL workflow for sizeable improvements in controller performance and research iteration times. At its core, Replay Across Experiments (RaE) involve…

    Submitted 28 November, 2023; v1 submitted 27 November, 2023; originally announced November 2023.

  3. Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning

    Authors: Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Jan Humplik, Markus Wulfmeier, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, et al. (3 additional authors not shown)

    Abstract: We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments. We used Deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one (1v1) soccer game. The resulting agent exhibits robust…

    Submitted 11 April, 2024; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: Project website: https://sites.google.com/view/op3-soccer

  4. arXiv:2211.13743  [pdf, other]

    cs.LG cs.AI cs.RO

    SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration

    Authors: Giulia Vezzani, Dhruva Tirumala, Markus Wulfmeier, Dushyant Rao, Abbas Abdolmaleki, Ben Moran, Tuomas Haarnoja, Jan Humplik, Roland Hafner, Michael Neunert, Claudio Fantacci, Tim Hertweck, Thomas Lampe, Fereshteh Sadeghi, Nicolas Heess, Martin Riedmiller

    Abstract: The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents. Skill reuse is one of the most common approaches, but current methods have considerable limitations. For example, fine-tuning an existing policy frequently fails, as the policy can degrade rapidly early in training. In a similar vein, distillation of expert be…

    Submitted 11 January, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

  5. arXiv:2210.04932  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    NeRF2Real: Sim2real Transfer of Vision-guided Bipedal Motion Skills using Neural Radiance Fields

    Authors: Arunkumar Byravan, Jan Humplik, Leonard Hasenclever, Arthur Brussee, Francesco Nori, Tuomas Haarnoja, Ben Moran, Steven Bohez, Fereshteh Sadeghi, Bojan Vujatovic, Nicolas Heess

    Abstract: We present a system for applying sim2real approaches to "in the wild" scenes with realistic visuals, and to policies which rely on active perception using RGB cameras. Given a short video of a static scene collected using a generic phone, we learn the scene's contact geometry and a function for novel view synthesis using a Neural Radiance Field (NeRF). We augment the NeRF rendering of the static s…

    Submitted 10 October, 2022; originally announced October 2022.

  6. arXiv:2204.05893  [pdf, other]

    cs.RO cs.AI cs.LG

    Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data

    Authors: Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja, Nicolas Heess

    Abstract: Robots will experience non-stationary environment dynamics throughout their lifetime: the robot dynamics can change due to wear and tear, or its surroundings may change over time. Eventually, the robots should perform well in all of the environment variations it has encountered. At the same time, it should still be able to learn fast in a new environment. We identify two challenges in Reinforcemen…

    Submitted 18 August, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: Published at 1st Conference on Lifelong Learning Agents, 2022

  7. arXiv:2203.17138  [pdf, other]

    cs.RO cs.AI cs.LG

    Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors

    Authors: Steven Bohez, Saran Tunyasuvunakool, Philemon Brakel, Fereshteh Sadeghi, Leonard Hasenclever, Yuval Tassa, Emilio Parisotto, Jan Humplik, Tuomas Haarnoja, Roland Hafner, Markus Wulfmeier, Michael Neunert, Ben Moran, Noah Siegel, Andrea Huber, Francesco Romano, Nathan Batchelor, Federico Casarini, Josh Merel, Raia Hadsell, Nicolas Heess

    Abstract: We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots. Our approach builds upon previous work on imitating human or dog Motion Capture (MoCap) data to learn a movement skill module. Once learned, this skill module can be reused for complex downstream tasks. Importantly, due to the prior imposed by the MoCap data, our appro…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: 30 pages, 9 figures, 8 tables, 14 videos at https://bit.ly/robot-npmp , submitted to Science Robotics

  8. arXiv:2105.12196  [pdf, other]

    cs.AI cs.MA cs.NE cs.RO

    From Motor Control to Team Play in Simulated Humanoid Football

    Authors: Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

    Abstract: Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals defined on much longer timescales, and in terms of relations that extend far beyond the body itself, ultimately involving coordination with other agents…

    Submitted 25 May, 2021; originally announced May 2021.

  9. arXiv:1907.08225  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO stat.ML

    Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery

    Authors: Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine

    Abstract: Reinforcement learning requires manual specification of a reward function to learn a task. While in principle this reward function only needs to specify the task goal, in practice reinforcement learning can be very time-consuming or even infeasible unless the reward function is shaped so as to provide a smooth gradient towards a successful outcome. This shaping is difficult to specify by hand, par…

    Submitted 14 February, 2020; v1 submitted 18 July, 2019; originally announced July 2019.

    Comments: 11+6 pages, 6+2 figures, last two authors (Tuomas Haarnoja, Sergey Levine) advised equally

  10. arXiv:1812.11103  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Learning to Walk via Deep Reinforcement Learning

    Authors: Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker, Sergey Levine

    Abstract: Deep reinforcement learning (deep RL) holds the promise of automating the acquisition of complex controllers that can map sensory inputs directly to low-level actions. In the domain of robotic locomotion, deep RL could enable learning locomotion skills with minimal engineering and without an explicit model of the robot dynamics. Unfortunately, applying deep RL to real-world robotic tasks is except…

    Submitted 19 June, 2019; v1 submitted 26 December, 2018; originally announced December 2018.

    Comments: RSS 2019, https://sites.google.com/view/minitaur-locomotion/

  11. arXiv:1812.05905  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Soft Actor-Critic Algorithms and Applications

    Authors: Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

    Abstract: Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample complexity and brittleness to hyperparameters. Both of these challenges limit the applicability of such methods to real-world domains. In this paper, we describe S…

    Submitted 29 January, 2019; v1 submitted 12 December, 2018; originally announced December 2018.

    Comments: arXiv admin note: substantial text overlap with arXiv:1801.01290

  12. arXiv:1804.02808  [pdf, other]

    cs.LG cs.AI stat.ML

    Latent Space Policies for Hierarchical Reinforcement Learning

    Authors: Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine

    Abstract: We address the problem of learning hierarchical deep neural network policies for reinforcement learning. In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning…

    Submitted 3 September, 2018; v1 submitted 9 April, 2018; originally announced April 2018.

    Comments: ICML 2018; Videos: https://sites.google.com/view/latent-space-deep-rl Code: https://github.com/haarnoja/sac

  13. arXiv:1803.06773  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Composable Deep Reinforcement Learning for Robotic Manipulation

    Authors: Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine

    Abstract: Model-free deep reinforcement learning has been shown to exhibit good performance in domains ranging from video games to simulated robotic manipulation and locomotion. However, model-free methods are known to perform poorly when the interaction time with the environment is limited, as is the case for most real-world robotic tasks. In this paper, we study how maximum entropy policies trained using…

    Submitted 18 March, 2018; originally announced March 2018.

    Comments: Videos: https://sites.google.com/view/composing-real-world-policies/

  14. arXiv:1801.01290  [pdf, other]

    cs.LG cs.AI stat.ML

    Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

    Authors: Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine

    Abstract: Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and brittle convergence properties, which necessitate meticulous hyperparameter tuning. Both of these challenges severely limit the applicability of such methods to c…

    Submitted 8 August, 2018; v1 submitted 4 January, 2018; originally announced January 2018.

    Comments: ICML 2018 Videos: sites.google.com/view/soft-actor-critic Code: github.com/haarnoja/sac

  15. arXiv:1702.08165  [pdf, other]

    cs.LG cs.AI

    Reinforcement Learning with Deep Energy-Based Policies

    Authors: Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine

    Abstract: We propose a method for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before. We apply our method to learning maximum entropy policies, resulting into a new algorithm, called soft Q-learning, that expresses the optimal policy via a Boltzmann distribution. We use the recently proposed amortized Stein variational gradient…

    Submitted 21 July, 2017; v1 submitted 27 February, 2017; originally announced February 2017.

  16. arXiv:1605.07148  [pdf, other]

    cs.LG cs.AI

    Backprop KF: Learning Discriminative Deterministic State Estimators

    Authors: Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel

    Abstract: Generative state estimators based on probabilistic filters and smoothers are one of the most popular classes of state estimators for robots and autonomous vehicles. However, generative models have limited capacity to handle rich sensory observations, such as camera images, since they must model the entire distribution over sensor readings. Discriminative models do not suffer from this limitation,…

    Submitted 30 September, 2017; v1 submitted 23 May, 2016; originally announced May 2016.

    Comments: NIPS 2016