Showing 1–22 of 22 results for author: Bloesch, M

  1. arXiv:2402.05546

    cs.LG cs.AI cs.RO

    Offline Actor-Critic Reinforcement Learning Scales to Large Models

    Authors: Jost Tobias Springenberg, Abbas Abdolmaleki, Jingwei Zhang, Oliver Groth, Michael Bloesch, Thomas Lampe, Philemon Brakel, Sarah Bechtle, Steven Kapturowski, Roland Hafner, Nicolas Heess, Martin Riedmiller

    Abstract: We show that offline actor-critic reinforcement learning can scale to large models - such as transformers - and follows similar scaling laws as supervised learning. We find that offline actor-critic algorithms can outperform strong, supervised, behavioral cloning baselines for multi-task training on a large dataset containing both sub-optimal and expert behavior on 132 continuous control tasks. We…

    Submitted 8 February, 2024; originally announced February 2024.

  2. arXiv:2312.11374

    cs.RO

    Mastering Stacking of Diverse Shapes with Large-Scale Iterative Reinforcement Learning on Real Robots

    Authors: Thomas Lampe, Abbas Abdolmaleki, Sarah Bechtle, Sandy H. Huang, Jost Tobias Springenberg, Michael Bloesch, Oliver Groth, Roland Hafner, Tim Hertweck, Michael Neunert, Markus Wulfmeier, Jingwei Zhang, Francesco Nori, Nicolas Heess, Martin Riedmiller

    Abstract: Reinforcement learning solely from an agent's self-generated data is often believed to be infeasible for learning on real robots, due to the amount of data needed. However, if done right, agents learning from real data can be surprisingly efficient through re-using previously collected sub-optimal data. In this paper we demonstrate how the increased understanding of off-policy learning methods and…

    Submitted 18 December, 2023; originally announced December 2023.

  3. Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning

    Authors: Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Jan Humplik, Markus Wulfmeier, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, et al. (3 additional authors not shown)

    Abstract: We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments. We used Deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one (1v1) soccer game. The resulting agent exhibits robust…

    Submitted 11 April, 2024; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: Project website: https://sites.google.com/view/op3-soccer

  4. arXiv:2207.10940

    cs.RO cs.CV

    Dense RGB-D-Inertial SLAM with Map Deformations

    Authors: Tristan Laidlow, Michael Bloesch, Wenbin Li, Stefan Leutenegger

    Abstract: While dense visual SLAM methods are capable of estimating dense reconstructions of the environment, they suffer from a lack of robustness in their tracking step, especially when the optimisation is poorly initialised. Sparse visual SLAM systems have attained high levels of accuracy and robustness through the inclusion of inertial measurements in a tightly-coupled fusion. Inspired by this performan…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: Accepted at IROS 2017; supplementary video available at https://youtu.be/-gUdQ0cxDh0

  5. arXiv:2201.11861

    cs.LG

    The Challenges of Exploration for Offline Reinforcement Learning

    Authors: Nathan Lambert, Markus Wulfmeier, William Whitney, Arunkumar Byravan, Michael Bloesch, Vibhavari Dasagi, Tim Hertweck, Martin Riedmiller

    Abstract: Offline Reinforcement Learning (ORL) enables us to separately study the two interlinked processes of reinforcement learning: collecting informative experience and inferring optimal behaviour. The second step has been widely studied in the offline setting, but just as critical to data-efficient RL is the collection of informative data. The task-agnostic setting for data collection, where the task is…

    Submitted 18 February, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

  6. arXiv:2101.09458

    cs.LG

    Decoupled Exploration and Exploitation Policies for Sample-Efficient Reinforcement Learning

    Authors: William F. Whitney, Michael Bloesch, Jost Tobias Springenberg, Abbas Abdolmaleki, Kyunghyun Cho, Martin Riedmiller

    Abstract: Despite the close connection between exploration and sample efficiency, most state of the art reinforcement learning algorithms include no considerations for exploration beyond maximizing the entropy of the policy. In this work we address this seeming missed opportunity. We observe that the most common formulation of directed exploration in deep RL, known as bonus-based exploration (BBE), suffers…

    Submitted 1 July, 2021; v1 submitted 23 January, 2021; originally announced January 2021.

  7. arXiv:2008.12228

    cs.RO cs.AI cs.LG stat.ML

    Towards General and Autonomous Learning of Core Skills: A Case Study in Locomotion

    Authors: Roland Hafner, Tim Hertweck, Philipp Klöppner, Michael Bloesch, Michael Neunert, Markus Wulfmeier, Saran Tunyasuvunakool, Nicolas Heess, Martin Riedmiller

    Abstract: Modern Reinforcement Learning (RL) algorithms promise to solve difficult motor control problems directly from raw sensory inputs. Their attraction is due in part to the fact that they can represent a general class of methods that allow to learn a solution with a reasonably set reward and minimal prior knowledge, even in situations where it is difficult or expensive for a human expert. For RL to tr…

    Submitted 6 August, 2020; originally announced August 2020.

  8. arXiv:2005.07541

    cs.LG cs.AI cs.RO stat.ML

    Simple Sensor Intentions for Exploration

    Authors: Tim Hertweck, Martin Riedmiller, Michael Bloesch, Jost Tobias Springenberg, Noah Siegel, Markus Wulfmeier, Roland Hafner, Nicolas Heess

    Abstract: Modern reinforcement learning algorithms can learn solutions to increasingly difficult control problems while at the same time reduce the amount of prior knowledge needed for their application. One of the remaining challenges is the definition of reward schemes that appropriately facilitate exploration without biasing the solution in undesirable ways, and that can be implemented on real robotic sy…

    Submitted 15 May, 2020; originally announced May 2020.

  9. arXiv:2002.10342

    cs.CV

    Comparing View-Based and Map-Based Semantic Labelling in Real-Time SLAM

    Authors: Zoe Landgraf, Fabian Falck, Michael Bloesch, Stefan Leutenegger, Andrew Davison

    Abstract: Generally capable Spatial AI systems must build persistent scene representations where geometric models are combined with meaningful semantic labels. The many approaches to labelling scenes can be divided into two clear groups: view-based which estimate labels from the input view-wise data and then incrementally fuse them into the scene model as it is built; and map-based which label the generated…

    Submitted 24 February, 2020; originally announced February 2020.

    Comments: ICRA 2020

  10. arXiv:1903.06482

    cs.CV cs.LG

    SceneCode: Monocular Dense Semantic Reconstruction using Learned Encoded Scene Representations

    Authors: Shuaifeng Zhi, Michael Bloesch, Stefan Leutenegger, Andrew J. Davison

    Abstract: Systems which incrementally create 3D semantic maps from image sequences must store and update representations of both geometry and semantic entities. However, while there has been much work on the correct formulation for geometrical estimation, state-of-the-art systems usually rely on simple semantic representations which store and update independent label estimates for each surface element (dept…

    Submitted 18 March, 2019; v1 submitted 15 March, 2019; originally announced March 2019.

    Comments: To be published in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019)

  11. arXiv:1812.07976

    cs.RO cs.CV

    MID-Fusion: Octree-based Object-Level Multi-Instance Dynamic SLAM

    Authors: Binbin Xu, Wenbin Li, Dimos Tzoumanikas, Michael Bloesch, Andrew Davison, Stefan Leutenegger

    Abstract: We propose a new multi-instance dynamic RGB-D SLAM system using an object-level octree-based volumetric representation. It can provide robust camera tracking in dynamic environments and at the same time, continuously estimate geometric, semantic, and motion properties for arbitrary objects in the scene. For each incoming frame, we perform instance segmentation to detect objects and refine mask bou…

    Submitted 21 March, 2019; v1 submitted 19 December, 2018; originally announced December 2018.

    Comments: Accepted to International Conference on Robotics and Automation (ICRA) 2019. 7 (6 + 1) pages. Please also see video Link: https://youtu.be/gturboNl9gg

  12. arXiv:1811.10681

    cs.CV

    Matching Features without Descriptors: Implicitly Matched Interest Points

    Authors: Titus Cieslewski, Michael Bloesch, Davide Scaramuzza

    Abstract: The extraction and matching of interest points is a prerequisite for many geometric computer vision problems. Traditionally, matching has been achieved by assigning descriptors to interest points and matching points that have similar descriptors. In this paper, we propose a method by which interest points are instead already implicitly matched at detection time. With this, descriptors do not need…

    Submitted 5 August, 2019; v1 submitted 26 November, 2018; originally announced November 2018.

    Comments: 10 pages without references, accepted for publication at the British Machine Vision Conference (BMVC), Cardiff, 2019. v2 contains additional results, and a bug in the evaluation of LF-NET has been fixed

    Journal ref: British Machine Vision Conference (BMVC), Cardiff, 2019

  13. arXiv:1810.03237

    cs.RO cs.AI cs.CV cs.LG

    Task-Embedded Control Networks for Few-Shot Imitation Learning

    Authors: Stephen James, Michael Bloesch, Andrew J. Davison

    Abstract: Much like humans, robots should have the ability to leverage knowledge from previously learned tasks in order to learn new tasks quickly in new and unfamiliar environments. Despite this, most robot learning approaches have focused on learning a single task, from scratch, with a limited notion of generalisation, and no way of leveraging the knowledge to learn other tasks more efficiently. One possi…

    Submitted 7 October, 2018; originally announced October 2018.

    Comments: Published at the Conference on Robot Learning (CoRL) 2018

  14. arXiv:1809.02966

    cs.CV

    LS-Net: Learning to Solve Nonlinear Least Squares for Monocular Stereo

    Authors: Ronald Clark, Michael Bloesch, Jan Czarnowski, Stefan Leutenegger, Andrew J. Davison

    Abstract: Sum-of-squares objective functions are very popular in computer vision algorithms. However, these objective functions are not always easy to optimize. The underlying assumptions made by solvers are often not satisfied and many problems are inherently ill-posed. In this paper, we propose LS-Net, a neural nonlinear least squares optimization algorithm which learns to effectively optimize these cost…

    Submitted 9 September, 2018; originally announced September 2018.

    Comments: ECCV 2018. Video: https://youtu.be/5bZbMm8UqbA

  15. arXiv:1808.08378

    cs.CV

    Fusion++: Volumetric Object-Level SLAM

    Authors: John McCormac, Ronald Clark, Michael Bloesch, Andrew J. Davison, Stefan Leutenegger

    Abstract: We propose an online object-level SLAM system which builds a persistent and accurate 3D graph map of arbitrary reconstructed objects. As an RGB-D camera browses a cluttered indoor scene, Mask-RCNN instance segmentations are used to initialise compact per-object Truncated Signed Distance Function (TSDF) reconstructions with object size-dependent resolutions and a novel 3D foreground mask. Reconstru…

    Submitted 28 August, 2018; v1 submitted 25 August, 2018; originally announced August 2018.

  16. arXiv:1804.00874

    cs.CV cs.LG

    CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM

    Authors: Michael Bloesch, Jan Czarnowski, Ronald Clark, Stefan Leutenegger, Andrew J. Davison

    Abstract: The representation of geometry in real-time 3D perception systems continues to be a critical research issue. Dense maps capture complete surface shape and can be augmented with semantic labels, but their high dimensionality makes them computationally costly to store and process, and unsuitable for rigorous probabilistic inference. Sparse feature-based representations avoid these problems, but capt…

    Submitted 14 April, 2019; v1 submitted 3 April, 2018; originally announced April 2018.

    Comments: Published in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018)

  17. arXiv:1801.07478

    cs.RO

    Why and How to Avoid the Flipped Quaternion Multiplication

    Authors: Hannes Sommer, Igor Gilitschenski, Michael Bloesch, Stephan Weiss, Roland Siegwart, Juan Nieto

    Abstract: Over the last decades quaternions have become a crucial and very successful tool for attitude representation in robotics and aerospace. However, there is a major problem that is continuously causing trouble in practice when it comes to exchanging formulas or implementations: there are two quaternion multiplications in common use, Hamilton's original multiplication and its flipped version, which is…

    Submitted 5 February, 2018; v1 submitted 23 January, 2018; originally announced January 2018.

    Comments: 16 pages, 1 figure, 2 tables (minor improvements and fixes over v1, smaller page margins)

  18. Build Your Own Visual-Inertial Drone: A Cost-Effective and Open-Source Autonomous Drone

    Authors: Inkyu Sa, Mina Kamel, Michael Burri, Michael Bloesch, Raghav Khanna, Marija Popovic, Juan Nieto, Roland Siegwart

    Abstract: This paper describes an approach to building a cost-effective and research grade visual-inertial odometry aided vertical taking-off and landing (VTOL) platform. We utilize an off-the-shelf visual-inertial sensor, an onboard computer, and a quadrotor platform that are factory-calibrated and mass-produced, thereby sharing similar hardware and sensor specifications (e.g., mass, dimensions, intrinsic…

    Submitted 6 September, 2018; v1 submitted 22 August, 2017; originally announced August 2017.

    Comments: 21 pages, 10 figures, accepted to IEEE Robotics & Automation Magazine

    Journal ref: IEEE Robotics & Automation Magazine 2017

  19. arXiv:1607.07405

    cs.CV cs.LG

    gvnn: Neural Network Library for Geometric Computer Vision

    Authors: Ankur Handa, Michael Bloesch, Viorica Patraucean, Simon Stent, John McCormac, Andrew Davison

    Abstract: We introduce gvnn, a neural network library in Torch aimed towards bridging the gap between classic geometric computer vision and deep learning. Inspired by the recent success of Spatial Transformer Networks, we propose several new layers which are often used as parametric transformations on the data in geometric computer vision. These layers can be inserted within a neural network much in the spi…

    Submitted 12 August, 2016; v1 submitted 25 July, 2016; originally announced July 2016.

    Comments: Submitted to ECCV Workshop on Deep Geometry

  20. arXiv:1606.05285

    cs.RO

    A Primer on the Differential Calculus of 3D Orientations

    Authors: Michael Bloesch, Hannes Sommer, Tristan Laidlow, Michael Burri, Gabriel Nuetzi, Péter Fankhauser, Dario Bellicoso, Christian Gehring, Stefan Leutenegger, Marco Hutter, Roland Siegwart

    Abstract: The proper handling of 3D orientations is a central element in many optimization problems in engineering. Unfortunately many researchers and engineers struggle with the formulation of such problems and often fall back to suboptimal solutions. The existence of many different conventions further complicates this issue, especially when interfacing multiple differing implementations. This document dis…

    Submitted 31 October, 2016; v1 submitted 16 June, 2016; originally announced June 2016.

  21. arXiv:1507.02081

    cs.RO

    An Open Source, Fiducial Based, Visual-Inertial Motion Capture System

    Authors: Michael Neunert, Michael Bloesch, Jonas Buchli

    Abstract: Many robotic tasks rely on the accurate localization of moving objects within a given workspace. This information about the objects' poses and velocities are used for control, motion planning, navigation, interaction with the environment or verification. Often motion capture systems are used to obtain such a state estimate. However, these systems are often costly, limited in workspace size and not…

    Submitted 13 June, 2016; v1 submitted 8 July, 2015; originally announced July 2015.

    Comments: To appear in The International Conference on Information Fusion (FUSION) 2016

  22. State Estimation for a Humanoid Robot

    Authors: Nicholas Rotella, Michael Bloesch, Ludovic Righetti, Stefan Schaal

    Abstract: This paper introduces a framework for state estimation on a humanoid robot platform using only common proprioceptive sensors and knowledge of leg kinematics. The presented approach extends that detailed in [1] on a quadruped platform by incorporating the rotational constraints imposed by the humanoid's flat feet. As in previous work, the proposed Extended Kalman Filter (EKF) accommodates contact s…

    Submitted 10 December, 2014; v1 submitted 21 February, 2014; originally announced February 2014.

    Comments: IROS 2014 Submission, IEEE/RSJ International Conference on Intelligent Robots and Systems (2014) 952-958