Showing 1–14 of 14 results for author: Lutter, M

  1. arXiv:2303.03955

    cs.LG

    Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning

    Authors: Daniel Palenicek, Michael Lutter, Joao Carvalho, Jan Peters

    Abstract: Model-based reinforcement learning is one approach to increase sample efficiency. However, the accuracy of the dynamics model and the resulting compounding error over modelled trajectories are commonly regarded as key limitations. A natural question to ask is: How much more sample efficiency can be gained by improving the learned dynamics models? Our paper empirically answers this question for the…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: Published as a conference paper at ICLR 2023

  2. arXiv:2203.14660

    cs.LG

    Revisiting Model-based Value Expansion

    Authors: Daniel Palenicek, Michael Lutter, Jan Peters

    Abstract: Model-based value expansion methods promise to improve the quality of value function targets and, thereby, the effectiveness of value function learning. However, to date, these methods are being outperformed by Dyna-style algorithms with conceptually simpler 1-step value function targets. This shows that in practice, the theoretical justification of value expansion does not seem to hold. We provid…

    Submitted 28 March, 2022; originally announced March 2022.

  3. arXiv:2110.12422

    cs.RO

    A Differentiable Newton-Euler Algorithm for Real-World Robotics

    Authors: Michael Lutter, Johannes Silberbauer, Joe Watson, Jan Peters

    Abstract: Obtaining dynamics models is essential for robotics to achieve accurate model-based controllers and simulators for planning. The dynamics models are typically obtained using model specification of the manufacturer or simple numerical methods such as linear regression. However, this approach does not guarantee physically plausible parameters and can only be applied to kinematic chains consisting of…

    Submitted 24 October, 2021; originally announced October 2021.

    Comments: arXiv admin note: text overlap with arXiv:2011.01734

  4. arXiv:2110.01954

    cs.RO cs.LG

    Continuous-Time Fitted Value Iteration for Robust Policies

    Authors: Michael Lutter, Boris Belousov, Shie Mannor, Dieter Fox, Animesh Garg, Jan Peters

    Abstract: Solving the Hamilton-Jacobi-Bellman equation is important in many domains including control, robotics and economics. Especially for continuous control, solving this differential equation and its extension, the Hamilton-Jacobi-Isaacs equation, is important as it yields the optimal policy that achieves the maximum reward on a given task. In the case of the Hamilton-Jacobi-Isaacs equation, which includ…

    Submitted 5 October, 2021; originally announced October 2021.

    Comments: arXiv admin note: text overlap with arXiv:2105.12189

  5. arXiv:2110.01894

    cs.LG cs.RO

    Combining Physics and Deep Learning to learn Continuous-Time Dynamics Models

    Authors: Michael Lutter, Jan Peters

    Abstract: Deep learning has been widely used within learning algorithms for robotics. One disadvantage of deep networks is that these networks are black-box representations. Therefore, the learned approximations ignore the existing knowledge of physics or robotics. Especially for learning dynamics models, these black-box models are not desirable as the underlying principles are well understood and the stand…

    Submitted 16 March, 2023; v1 submitted 5 October, 2021; originally announced October 2021.

  6. arXiv:2109.14311

    cs.LG cs.RO

    Learning Dynamics Models for Model Predictive Agents

    Authors: Michael Lutter, Leonard Hasenclever, Arunkumar Byravan, Gabriel Dulac-Arnold, Piotr Trochim, Nicolas Heess, Josh Merel, Yuval Tassa

    Abstract: Model-Based Reinforcement Learning involves learning a dynamics model from data, and then using this model to optimise behaviour, most often with an online planner. Much of the recent research along these lines presents a particular set of design choices, involving problem definition, model learning and planning. Given the multiple contributions, it is difficult to evaluate the e…

    Submitted 29 September, 2021; originally announced September 2021.

  7. arXiv:2105.12189

    cs.LG cs.RO eess.SY

    Robust Value Iteration for Continuous Control Tasks

    Authors: Michael Lutter, Shie Mannor, Jan Peters, Dieter Fox, Animesh Garg

    Abstract: When transferring a control policy from simulation to a physical system, the policy needs to be robust to variations in the dynamics to perform well. Commonly, the optimal policy overfits to the approximate model and the corresponding state-distribution, often resulting in failure to transfer underlying distributional shifts. In this paper, we present Robust Fitted Value Iteration, which uses dyna…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: Accepted Paper at Robotics: Science and Systems

  8. arXiv:2105.04682

    cs.LG cs.RO eess.SY

    Value Iteration in Continuous Actions, States and Time

    Authors: Michael Lutter, Shie Mannor, Jan Peters, Dieter Fox, Animesh Garg

    Abstract: Classical value iteration approaches are not applicable to environments with continuous states and actions. For such environments, the states and actions are usually discretized, which leads to an exponential increase in computational complexity. In this paper, we propose continuous fitted value iteration (cFVI). This algorithm enables dynamic programming for continuous states and actions with a k…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: Accepted at International Conference on Machine Learning (ICML) 2021

  9. arXiv:2011.01734

    cs.RO cs.LG

    Differentiable Physics Models for Real-world Offline Model-based Reinforcement Learning

    Authors: Michael Lutter, Johannes Silberbauer, Joe Watson, Jan Peters

    Abstract: A limitation of model-based reinforcement learning (MBRL) is the exploitation of errors in the learned models. Black-box models can fit complex dynamics with high fidelity, but their behavior is undefined outside of the data distribution. Physics-based models are better at extrapolating, due to the general validity of their informed structure, but underfit in the real world due to the presence of u…

    Submitted 3 November, 2020; originally announced November 2020.

  10. arXiv:2010.13483

    cs.RO cs.LG stat.ML

    High Acceleration Reinforcement Learning for Real-World Juggling with Binary Rewards

    Authors: Kai Ploeger, Michael Lutter, Jan Peters

    Abstract: Robots that can learn in the physical world will be important to enable robots to escape their stiff and pre-programmed movements. For dynamic high-acceleration tasks, such as juggling, learning in the real-world is particularly challenging as one must push the limits of the robot and its actuation without harming the system, amplifying the necessity of sample efficiency and safety for robot lear…

    Submitted 31 October, 2020; v1 submitted 26 October, 2020; originally announced October 2020.

    Comments: Published at Conference on Robot Learning (CoRL) 2020

  11. arXiv:2010.09802

    cs.RO cs.LG

    A Differentiable Newton Euler Algorithm for Multi-body Model Learning

    Authors: Michael Lutter, Johannes Silberbauer, Joe Watson, Jan Peters

    Abstract: In this work, we examine a spectrum of hybrid models for the domain of multi-body robot dynamics. We motivate a computation graph architecture that embodies the Newton Euler equations, emphasizing the utility of the Lie Algebra form in translating the dynamical geometry into an efficient computational structure for learning. We describe the used virtual parameters that enable unconstrained physical…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: ICML 2020 Workshop on Inductive Biases, Invariances and Generalization in Reinforcement Learning

  12. arXiv:1909.06153

    cs.LG cs.RO stat.ML

    HJB Optimal Feedback Control with Deep Differential Value Functions and Action Constraints

    Authors: Michael Lutter, Boris Belousov, Kim Listmann, Debora Clever, Jan Peters

    Abstract: Learning optimal feedback control laws capable of executing optimal trajectories is essential for many robotic applications. Such policies can be learned using reinforcement learning or planned using optimal control. While reinforcement learning is sample inefficient, optimal control only plans an optimal trajectory from a specific starting configuration. In this paper we propose deep optimal feed…

    Submitted 11 October, 2019; v1 submitted 13 September, 2019; originally announced September 2019.

    Comments: Conference on Robot Learning (CoRL) 2019

  13. arXiv:1907.04490

    cs.LG cs.RO eess.SY stat.ML

    Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning

    Authors: Michael Lutter, Christian Ritter, Jan Peters

    Abstract: Deep learning has achieved astonishing results on many tasks with large amounts of data and generalization within the proximity of training data. For many important real-world applications, these requirements are unfeasible and additional prior knowledge on the task domain is required to overcome the resulting problems. In particular, learning physics models for model-based control requires robust…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: Published at ICLR 2019

  14. arXiv:1907.04489

    cs.RO cs.LG eess.SY

    Deep Lagrangian Networks for end-to-end learning of energy-based control for under-actuated systems

    Authors: Michael Lutter, Kim Listmann, Jan Peters

    Abstract: Applying Deep Learning to control has a lot of potential for enabling the intelligent design of robot control laws. Unfortunately, common deep learning approaches to control, such as deep reinforcement learning, require an unrealistic amount of interaction with the real system, do not yield any performance guarantees, and do not make good use of extensive insights from model-based control. In parti…

    Submitted 3 August, 2019; v1 submitted 9 July, 2019; originally announced July 2019.

    Comments: Published at IROS 2019