Search | arXiv e-print repository

DITTO: Demonstration Imitation by Trajectory Transformation

Authors: Nick Heppert, Max Argus, Tim Welschehold, Thomas Brox, Abhinav Valada

Abstract: Teaching robots new skills quickly and conveniently is crucial for the broader adoption of robotic systems. In this work, we address the problem of one-shot imitation from a single human demonstration, given by an RGB-D video recording through a two-stage process. In the first stage which is offline, we extract the trajectory of the demonstration. This entails segmenting manipulated objects and de… ▽ More Teaching robots new skills quickly and conveniently is crucial for the broader adoption of robotic systems. In this work, we address the problem of one-shot imitation from a single human demonstration, given by an RGB-D video recording through a two-stage process. In the first stage which is offline, we extract the trajectory of the demonstration. This entails segmenting manipulated objects and determining their relative motion in relation to secondary objects such as containers. Subsequently, in the live online trajectory generation stage, we first \mbox{re-detect} all objects, then we warp the demonstration trajectory to the current scene, and finally, we trace the trajectory with the robot. To complete these steps, our method makes leverages several ancillary models, including those for segmentation, relative object pose estimation, and grasp prediction. We systematically evaluate different combinations of correspondence and re-detection methods to validate our design decision across a diverse range of tasks. Specifically, we collect demonstrations of ten different tasks including pick-and-place tasks as well as articulated object manipulation. Finally, we perform extensive evaluations on a real robot system to demonstrate the effectiveness and utility of our approach in real-world scenarios. We make the code publicly available at http://ditto.cs.uni-freiburg.de. △ Less

Submitted 22 March, 2024; originally announced March 2024.

Comments: 8 pages, 4 figures, 3 tables, submitted to IROS 2024

arXiv:2403.14305 [pdf, other]

Bayesian Optimization for Sample-Efficient Policy Improvement in Robotic Manipulation

Authors: Adrian Röfer, Iman Nematollahi, Tim Welschehold, Wolfram Burgard, Abhinav Valada

Abstract: Sample efficient learning of manipulation skills poses a major challenge in robotics. While recent approaches demonstrate impressive advances in the type of task that can be addressed and the sensing modalities that can be incorporated, they still require large amounts of training data. Especially with regard to learning actions on robots in the real world, this poses a major problem due to the hi… ▽ More Sample efficient learning of manipulation skills poses a major challenge in robotics. While recent approaches demonstrate impressive advances in the type of task that can be addressed and the sensing modalities that can be incorporated, they still require large amounts of training data. Especially with regard to learning actions on robots in the real world, this poses a major problem due to the high costs associated with both demonstrations and real-world robot interactions. To address this challenge, we introduce BOpt-GMM, a hybrid approach that combines imitation learning with own experience collection. We first learn a skill model as a dynamical system encoded in a Gaussian Mixture Model from a few demonstrations. We then improve this model with Bayesian optimization building on a small number of autonomous skill executions in a sparse reward setting. We demonstrate the sample efficiency of our approach on multiple complex manipulation skills in both simulations and real-world experiments. Furthermore, we make the code and pre-trained models publicly available at http://bopt-gmm. cs.uni-freiburg.de. △ Less

Submitted 21 March, 2024; originally announced March 2024.

Comments: 7 pages, 5 figures, 2 tables, submitted to IROS2024

arXiv:2403.08605 [pdf, other]

Language-Grounded Dynamic Scene Graphs for Interactive Object Search with Mobile Manipulation

Authors: Daniel Honerkamp, Martin Büchner, Fabien Despinoy, Tim Welschehold, Abhinav Valada

Abstract: To fully leverage the capabilities of mobile manipulation robots, it is imperative that they are able to autonomously execute long-horizon tasks in large unexplored environments. While large language models (LLMs) have shown emergent reasoning skills on arbitrary tasks, existing work primarily concentrates on explored environments, typically focusing on either navigation or manipulation tasks in i… ▽ More To fully leverage the capabilities of mobile manipulation robots, it is imperative that they are able to autonomously execute long-horizon tasks in large unexplored environments. While large language models (LLMs) have shown emergent reasoning skills on arbitrary tasks, existing work primarily concentrates on explored environments, typically focusing on either navigation or manipulation tasks in isolation. In this work, we propose MoMa-LLM, a novel approach that grounds language models within structured representations derived from open-vocabulary scene graphs, dynamically updated as the environment is explored. We tightly interleave these representations with an object-centric action space. Given object detections, the resulting approach is zero-shot, open-vocabulary, and readily extendable to a spectrum of mobile manipulation and household robotic tasks. We demonstrate the effectiveness of MoMa-LLM in a novel semantic interactive search task in large realistic indoor environments. In extensive experiments in both simulation and the real world, we show substantially improved search efficiency compared to conventional baselines and state-of-the-art approaches, as well as its applicability to more abstract tasks. We make the code publicly available at http://moma-llm.cs.uni-freiburg.de. △ Less

Submitted 11 June, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: Project website: http://moma-llm.cs.uni-freiburg.de

arXiv:2312.08240 [pdf, other]

CenterGrasp: Object-Aware Implicit Representation Learning for Simultaneous Shape Reconstruction and 6-DoF Grasp Estimation

Authors: Eugenio Chisari, Nick Heppert, Tim Welschehold, Wolfram Burgard, Abhinav Valada

Abstract: Reliable object gras** is a crucial capability for autonomous robots. However, many existing gras** approaches focus on general clutter removal without explicitly modeling objects and thus only relying on the visible local geometry. We introduce CenterGrasp, a novel framework that combines object awareness and holistic gras**. CenterGrasp learns a general object prior by encoding shapes and… ▽ More Reliable object gras** is a crucial capability for autonomous robots. However, many existing gras** approaches focus on general clutter removal without explicitly modeling objects and thus only relying on the visible local geometry. We introduce CenterGrasp, a novel framework that combines object awareness and holistic gras**. CenterGrasp learns a general object prior by encoding shapes and valid grasps in a continuous latent space. It consists of an RGB-D image encoder that leverages recent advances to detect objects and infer their pose and latent code, and a decoder to predict shape and grasps for each object in the scene. We perform extensive experiments on simulated as well as real-world cluttered scenes and demonstrate strong scene reconstruction and 6-DoF grasp-pose estimation performance. Compared to the state of the art, CenterGrasp achieves an improvement of 38.5 mm in shape reconstruction and 33 percentage points on average in grasp success. We make the code and trained models publicly available at http://centergrasp.cs.uni-freiburg.de. △ Less

Submitted 5 April, 2024; v1 submitted 13 December, 2023; originally announced December 2023.

Comments: Accepted at RA-L. Video, code and models available at http://centergrasp.cs.uni-freiburg.de

arXiv:2310.15059 [pdf, other]

Robot Skill Generalization via Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models

Authors: Iman Nematollahi, Kirill Yankov, Wolfram Burgard, Tim Welschehold

Abstract: A long-standing challenge for a robotic manipulation system operating in real-world scenarios is adapting and generalizing its acquired motor skills to unseen environments. We tackle this challenge employing hybrid skill models that integrate imitation and reinforcement paradigms, to explore how the learning and adaptation of a skill, along with its core grounding in the scene through a learned ke… ▽ More A long-standing challenge for a robotic manipulation system operating in real-world scenarios is adapting and generalizing its acquired motor skills to unseen environments. We tackle this challenge employing hybrid skill models that integrate imitation and reinforcement paradigms, to explore how the learning and adaptation of a skill, along with its core grounding in the scene through a learned keypoint, can facilitate such generalization. To that end, we develop Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models (KIS-GMM) approach that learns to predict the reference of a dynamical system within the scene as a 3D keypoint, leveraging visual observations obtained by the robot's physical interactions during skill learning. Through conducting comprehensive evaluations in both simulated and real-world environments, we show that our method enables a robot to gain a significant zero-shot generalization to novel environments and to refine skills in the target environments faster than learning from scratch. Importantly, this is achieved without the need for new ground truth data. Moreover, our method effectively copes with scene displacements. △ Less

Submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted at the International Symposium on Experimental Robotics (ISER) 2023. Videos at http://kis-gmm.cs.uni-freiburg.de/

arXiv:2307.06125 [pdf, other]

doi 10.1109/LRA.2023.3329619

Learning Hierarchical Interactive Multi-Object Search for Mobile Manipulation

Authors: Fabian Schmalstieg, Daniel Honerkamp, Tim Welschehold, Abhinav Valada

Abstract: Existing object-search approaches enable robots to search through free pathways, however, robots operating in unstructured human-centered environments frequently also have to manipulate the environment to their needs. In this work, we introduce a novel interactive multi-object search task in which a robot has to open doors to navigate rooms and search inside cabinets and drawers to find target obj… ▽ More Existing object-search approaches enable robots to search through free pathways, however, robots operating in unstructured human-centered environments frequently also have to manipulate the environment to their needs. In this work, we introduce a novel interactive multi-object search task in which a robot has to open doors to navigate rooms and search inside cabinets and drawers to find target objects. These new challenges require combining manipulation and navigation skills in unexplored environments. We present HIMOS, a hierarchical reinforcement learning approach that learns to compose exploration, navigation, and manipulation skills. To achieve this, we design an abstract high-level action space around a semantic map memory and leverage the explored environment as instance navigation points. We perform extensive experiments in simulation and the real world that demonstrate that, with accurate perception, the decision making of HIMOS effectively transfers to new environments in a zero-shot manner. It shows robustness to unseen subpolicies, failures in their execution, and different robot kinematics. These capabilities open the door to a wide range of downstream tasks across embodied AI and real-world use cases. △ Less

Submitted 19 October, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

Comments: 11 pages, 6 figures, Accepted for publication in RA-L. Code and Models: http://himos.cs.uni-freiburg.de/

arXiv:2305.04718 [pdf, other]

doi 10.1109/LRA.2023.3313917

The Treachery of Images: Bayesian Scene Keypoints for Deep Policy Learning in Robotic Manipulation

Authors: Jan Ole von Hartz, Eugenio Chisari, Tim Welschehold, Wolfram Burgard, Joschka Boedecker, Abhinav Valada

Abstract: In policy learning for robotic manipulation, sample efficiency is of paramount importance. Thus, learning and extracting more compact representations from camera observations is a promising avenue. However, current methods often assume full observability of the scene and struggle with scale invariance. In many tasks and settings, this assumption does not hold as objects in the scene are often occl… ▽ More In policy learning for robotic manipulation, sample efficiency is of paramount importance. Thus, learning and extracting more compact representations from camera observations is a promising avenue. However, current methods often assume full observability of the scene and struggle with scale invariance. In many tasks and settings, this assumption does not hold as objects in the scene are often occluded or lie outside the field of view of the camera, rendering the camera observation ambiguous with regard to their location. To tackle this problem, we present BASK, a Bayesian approach to tracking scale-invariant keypoints over time. Our approach successfully resolves inherent ambiguities in images, enabling keypoint tracking on symmetrical objects and occluded and out-of-view objects. We employ our method to learn challenging multi-object robot manipulation tasks from wrist camera observations and demonstrate superior utility for policy learning compared to other representation learning techniques. Furthermore, we show outstanding robustness towards disturbances such as clutter, occlusions, and noisy depth measurements, as well as generalization to unseen objects both in simulation and real-world robotic experiments. △ Less

Submitted 20 September, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 11, pp. 6931-6938, Nov. 2023

arXiv:2303.11756 [pdf, other]

Improving Deep Dynamics Models for Autonomous Vehicles with Multimodal Latent Map** of Surfaces

Authors: Johan Vertens, Nicolai Dorka, Tim Welschehold, Michael Thompson, Wolfram Burgard

Abstract: The safe deployment of autonomous vehicles relies on their ability to effectively react to environmental changes. This can require maneuvering on varying surfaces which is still a difficult problem, especially for slippery terrains. To address this issue we propose a new approach that learns a surface-aware dynamics model by conditioning it on a latent variable vector storing surface information a… ▽ More The safe deployment of autonomous vehicles relies on their ability to effectively react to environmental changes. This can require maneuvering on varying surfaces which is still a difficult problem, especially for slippery terrains. To address this issue we propose a new approach that learns a surface-aware dynamics model by conditioning it on a latent variable vector storing surface information about the current location. A latent mapper is trained to update these latent variables during inference from multiple modalities on every traversal of the corresponding locations and stores them in a map. By training everything end-to-end with the loss of the dynamics model, we enforce the latent mapper to learn an update rule for the latent map that is useful for the subsequent dynamics model. We implement and evaluate our approach on a real miniature electric car. The results show that the latent map is updated to allow more accurate predictions of the dynamics model compared to a model without this information. We further show that by using this model, the driving performance can be improved on varying and challenging surfaces. △ Less

Submitted 21 March, 2023; originally announced March 2023.

arXiv:2303.10144 [pdf, other]

Dynamic Update-to-Data Ratio: Minimizing World Model Overfitting

Authors: Nicolai Dorka, Tim Welschehold, Wolfram Burgard

Abstract: Early stop** based on the validation set performance is a popular approach to find the right balance between under- and overfitting in the context of supervised learning. However, in reinforcement learning, even for supervised sub-problems such as world model learning, early stop** is not applicable as the dataset is continually evolving. As a solution, we propose a new general method that dyn… ▽ More Early stop** based on the validation set performance is a popular approach to find the right balance between under- and overfitting in the context of supervised learning. However, in reinforcement learning, even for supervised sub-problems such as world model learning, early stop** is not applicable as the dataset is continually evolving. As a solution, we propose a new general method that dynamically adjusts the update to data (UTD) ratio during training based on under- and overfitting detection on a small subset of the continuously collected experience not used for training. We apply our method to DreamerV2, a state-of-the-art model-based reinforcement learning algorithm, and evaluate it on the DeepMind Control Suite and the Atari $100$k benchmark. The results demonstrate that one can better balance under- and overestimation by adjusting the UTD ratio with our approach compared to the default setting in DreamerV2 and that it is competitive with an extensive hyperparameter search which is not feasible for many applications. Our method eliminates the need to set the UTD hyperparameter by hand and even leads to a higher robustness with regard to other learning-related hyperparameters further reducing the amount of necessary tuning. △ Less

Submitted 17 March, 2023; originally announced March 2023.

Comments: ICLR 2023

arXiv:2206.08737 [pdf, other]

doi 10.1109/TRO.2023.3284346

N$^2$M$^2$: Learning Navigation for Arbitrary Mobile Manipulation Motions in Unseen and Dynamic Environments

Authors: Daniel Honerkamp, Tim Welschehold, Abhinav Valada

Abstract: Despite its importance in both industrial and service robotics, mobile manipulation remains a significant challenge as it requires a seamless integration of end-effector trajectory generation with navigation skills as well as reasoning over long-horizons. Existing methods struggle to control the large configuration space, and to navigate dynamic and unknown environments. In previous work, we propo… ▽ More Despite its importance in both industrial and service robotics, mobile manipulation remains a significant challenge as it requires a seamless integration of end-effector trajectory generation with navigation skills as well as reasoning over long-horizons. Existing methods struggle to control the large configuration space, and to navigate dynamic and unknown environments. In previous work, we proposed to decompose mobile manipulation tasks into a simplified motion generator for the end-effector in task space and a trained reinforcement learning agent for the mobile base to account for kinematic feasibility of the motion. In this work, we introduce Neural Navigation for Mobile Manipulation (N$^2$M$^2$) which extends this decomposition to complex obstacle environments and enables it to tackle a broad range of tasks in real world settings. The resulting approach can perform unseen, long-horizon tasks in unexplored environments while instantly reacting to dynamic obstacles and environmental changes. At the same time, it provides a simple way to define new mobile manipulation tasks. We demonstrate the capabilities of our proposed approach in extensive simulation and real-world experiments on multiple kinematically diverse mobile manipulators. Code and videos are publicly available at http://mobile-rl.cs.uni-freiburg.de. △ Less

Submitted 29 June, 2023; v1 submitted 17 June, 2022; originally announced June 2022.

Comments: Project website: http://mobile-rl.cs.uni-freiburg.de; Accepted at T-RO

ACM Class: I.2.9; I.2.6

arXiv:2205.11384 [pdf, other]

Learning Long-Horizon Robot Exploration Strategies for Multi-Object Search in Continuous Action Spaces

Authors: Fabian Schmalstieg, Daniel Honerkamp, Tim Welschehold, Abhinav Valada

Abstract: Recent advances in vision-based navigation and exploration have shown impressive capabilities in photorealistic indoor environments. However, these methods still struggle with long-horizon tasks and require large amounts of data to generalize to unseen environments. In this work, we present a novel reinforcement learning approach for multi-object search that combines short-term and long-term reaso… ▽ More Recent advances in vision-based navigation and exploration have shown impressive capabilities in photorealistic indoor environments. However, these methods still struggle with long-horizon tasks and require large amounts of data to generalize to unseen environments. In this work, we present a novel reinforcement learning approach for multi-object search that combines short-term and long-term reasoning in a single model while avoiding the complexities arising from hierarchical structures. In contrast to existing multi-object search methods that act in granular discrete action spaces, our approach achieves exceptional performance in continuous action spaces. We perform extensive experiments and show that it generalizes to unseen apartment environments with limited data. Furthermore, we demonstrate zero-shot transfer of the learned policies to an office environment in real world experiments. △ Less

Submitted 24 May, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

Journal ref: Proceedings of the International Symposium on Robotics Research (ISRR), 2022

arXiv:2205.08316 [pdf, other]

Self-Supervised Learning of Multi-Object Keypoints for Robotic Manipulation

Authors: Jan Ole von Hartz, Eugenio Chisari, Tim Welschehold, Abhinav Valada

Abstract: In recent years, policy learning methods using either reinforcement or imitation have made significant progress. However, both techniques still suffer from being computationally expensive and requiring large amounts of training data. This problem is especially prevalent in real-world robotic manipulation tasks, where access to ground truth scene features is not available and policies are instead l… ▽ More In recent years, policy learning methods using either reinforcement or imitation have made significant progress. However, both techniques still suffer from being computationally expensive and requiring large amounts of training data. This problem is especially prevalent in real-world robotic manipulation tasks, where access to ground truth scene features is not available and policies are instead learned from raw camera observations. In this paper, we demonstrate the efficacy of learning image keypoints via the Dense Correspondence pretext task for downstream policy learning. Extending prior work to challenging multi-object scenes, we show that our model can be trained to deal with important problems in representation learning, primarily scale-invariance and occlusion. We evaluate our approach on diverse robot manipulation tasks, compare it to other visual representation learning approaches, and demonstrate its flexibility and effectiveness for sample-efficient policy learning. △ Less

Submitted 11 October, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

Comments: Presented at IEEE ICRA 2022 Workshop 'Reinforcement Learning for Contact-Rich Manipulation'

arXiv:2202.02654 [pdf, other]

Doing Right by Not Doing Wrong in Human-Robot Collaboration

Authors: Laura Londoño, Adrian Röfer, Tim Welschehold, Abhinav Valada

Abstract: As robotic systems become more and more capable of assisting humans in their everyday lives, we must consider the opportunities for these artificial agents to make their human collaborators feel unsafe or to treat them unfairly. Robots can exhibit antisocial behavior causing physical harm to people or reproduce unfair behavior replicating and even amplifying historical and societal biases which ar… ▽ More As robotic systems become more and more capable of assisting humans in their everyday lives, we must consider the opportunities for these artificial agents to make their human collaborators feel unsafe or to treat them unfairly. Robots can exhibit antisocial behavior causing physical harm to people or reproduce unfair behavior replicating and even amplifying historical and societal biases which are detrimental to humans they interact with. In this paper, we discuss these issues considering sociable robotic manipulation and fair robotic decision making. We propose a novel approach to learning fair and sociable behavior, not by reproducing positive behavior, but rather by avoiding negative behavior. In this study, we highlight the importance of incorporating sociability in robot manipulation, as well as the need to consider fairness in human-robot interactions. △ Less

Submitted 5 February, 2022; originally announced February 2022.

arXiv:2111.14843 [pdf, other]

Catch Me If You Hear Me: Audio-Visual Navigation in Complex Unmapped Environments with Moving Sounds

Authors: Abdelrahman Younes, Daniel Honerkamp, Tim Welschehold, Abhinav Valada

Abstract: Audio-visual navigation combines sight and hearing to navigate to a sound-emitting source in an unmapped environment. While recent approaches have demonstrated the benefits of audio input to detect and find the goal, they focus on clean and static sound sources and struggle to generalize to unheard sounds. In this work, we propose the novel dynamic audio-visual navigation benchmark which requires… ▽ More Audio-visual navigation combines sight and hearing to navigate to a sound-emitting source in an unmapped environment. While recent approaches have demonstrated the benefits of audio input to detect and find the goal, they focus on clean and static sound sources and struggle to generalize to unheard sounds. In this work, we propose the novel dynamic audio-visual navigation benchmark which requires catching a moving sound source in an environment with noisy and distracting sounds, posing a range of new challenges. We introduce a reinforcement learning approach that learns a robust navigation policy for these complex settings. To achieve this, we propose an architecture that fuses audio-visual information in the spatial feature space to learn correlations of geometric information inherent in both local maps and audio signals. We demonstrate that our approach consistently outperforms the current state-of-the-art by a large margin across all tasks of moving sounds, unheard sounds, and noisy environments, on two challenging 3D scanned real-world environments, namely Matterport3D and Replica. The benchmark is available at http://dav-nav.cs.uni-freiburg.de. △ Less

Submitted 3 January, 2023; v1 submitted 29 November, 2021; originally announced November 2021.

Comments: This paper has been accepted for publication at IEEE ROBOTICS AND AUTOMATION LETTERS

arXiv:2111.13129 [pdf, other]

Robot Skill Adaptation via Soft Actor-Critic Gaussian Mixture Models

Authors: Iman Nematollahi, Erick Rosete-Beas, Adrian Röfer, Tim Welschehold, Abhinav Valada, Wolfram Burgard

Abstract: A core challenge for an autonomous agent acting in the real world is to adapt its repertoire of skills to cope with its noisy perception and dynamics. To scale learning of skills to long-horizon tasks, robots should be able to learn and later refine their skills in a structured manner through trajectories rather than making instantaneous decisions individually at each time step. To this end, we pr… ▽ More A core challenge for an autonomous agent acting in the real world is to adapt its repertoire of skills to cope with its noisy perception and dynamics. To scale learning of skills to long-horizon tasks, robots should be able to learn and later refine their skills in a structured manner through trajectories rather than making instantaneous decisions individually at each time step. To this end, we propose the Soft Actor-Critic Gaussian Mixture Model (SAC-GMM), a novel hybrid approach that learns robot skills through a dynamical system and adapts the learned skills in their own trajectory distribution space through interactions with the environment. Our approach combines classical robotics techniques of learning from demonstration with the deep reinforcement learning framework and exploits their complementary nature. We show that our method utilizes sensors solely available during the execution of preliminarily learned skills to extract relevant features that lead to faster skill refinement. Extensive evaluations in both simulation and real-world environments demonstrate the effectiveness of our method in refining robot skills by leveraging physical interactions, high-dimensional sensory data, and sparse task completion rewards. Videos, code, and pre-trained models are available at http://sac-gmm.cs.uni-freiburg.de. △ Less

Submitted 19 September, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

Comments: Accepted at the 2022 IEEE International Conference on Robotics and Automation (ICRA)

arXiv:2111.12673 [pdf, other]

Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Authors: Nicolai Dorka, Tim Welschehold, Joschka Boedecker, Wolfram Burgard

Abstract: Accurate value estimates are important for off-policy reinforcement learning. Algorithms based on temporal difference learning typically are prone to an over- or underestimation bias building up over time. In this paper, we propose a general method called Adaptively Calibrated Critics (ACC) that uses the most recent high variance but unbiased on-policy rollouts to alleviate the bias of the low var… ▽ More Accurate value estimates are important for off-policy reinforcement learning. Algorithms based on temporal difference learning typically are prone to an over- or underestimation bias building up over time. In this paper, we propose a general method called Adaptively Calibrated Critics (ACC) that uses the most recent high variance but unbiased on-policy rollouts to alleviate the bias of the low variance temporal difference targets. We apply ACC to Truncated Quantile Critics, which is an algorithm for continuous control that allows regulation of the bias with a hyperparameter tuned per environment. The resulting algorithm adaptively adjusts the parameter during training rendering hyperparameter search unnecessary and sets a new state of the art on the OpenAI gym continuous control benchmark among all algorithms that do not tune hyperparameters for each environment. ACC further achieves improved results on different tasks from the Meta-World robot benchmark. Additionally, we demonstrate the generality of ACC by applying it to TD3 and showing an improved performance also in this setting. △ Less

Submitted 21 October, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

Comments: Submitted to RA-L

arXiv:2110.03316 [pdf, other]

Correct Me if I am Wrong: Interactive Learning for Robotic Manipulation

Authors: Eugenio Chisari, Tim Welschehold, Joschka Boedecker, Wolfram Burgard, Abhinav Valada

Abstract: Learning to solve complex manipulation tasks from visual observations is a dominant challenge for real-world robot learning. Although deep reinforcement learning algorithms have recently demonstrated impressive results in this context, they still require an impractical amount of time-consuming trial-and-error iterations. In this work, we consider the promising alternative paradigm of interactive l… ▽ More Learning to solve complex manipulation tasks from visual observations is a dominant challenge for real-world robot learning. Although deep reinforcement learning algorithms have recently demonstrated impressive results in this context, they still require an impractical amount of time-consuming trial-and-error iterations. In this work, we consider the promising alternative paradigm of interactive learning in which a human teacher provides feedback to the policy during execution, as opposed to imitation learning where a pre-collected dataset of perfect demonstrations is used. Our proposed CEILing (Corrective and Evaluative Interactive Learning) framework combines both corrective and evaluative feedback from the teacher to train a stochastic policy in an asynchronous manner, and employs a dedicated mechanism to trade off human corrections with the robot's own experience. We present results obtained with our framework in extensive simulation and real-world experiments to demonstrate that CEILing can effectively solve complex robot manipulation tasks directly from raw images in less than one hour of real-world training. △ Less

Submitted 19 January, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

Comments: Accepted for publication in RA-L. Video, code and models available at http://ceiling.cs.uni-freiburg.de/

arXiv:2106.06369 [pdf, other]

Courteous Behavior of Automated Vehicles at Unsignalized Intersections via Reinforcement Learning

Authors: Shengchao Yan, Tim Welschehold, Daniel Büscher, Wolfram Burgard

Abstract: The transition from today's mostly human-driven traffic to a purely automated one will be a gradual evolution, with the effect that we will likely experience mixed traffic in the near future. Connected and automated vehicles can benefit human-driven ones and the whole traffic system in different ways, for example by improving collision avoidance and reducing traffic waves. Many studies have been c… ▽ More The transition from today's mostly human-driven traffic to a purely automated one will be a gradual evolution, with the effect that we will likely experience mixed traffic in the near future. Connected and automated vehicles can benefit human-driven ones and the whole traffic system in different ways, for example by improving collision avoidance and reducing traffic waves. Many studies have been carried out to improve intersection management, a significant bottleneck in traffic, with intelligent traffic signals or exclusively automated vehicles. However, the problem of how to improve mixed traffic at unsignalized intersections has received less attention. In this paper, we propose a novel approach to optimizing traffic flow at intersections in mixed traffic situations using deep reinforcement learning. Our reinforcement learning agent learns a policy for a centralized controller to let connected autonomous vehicles at unsignalized intersections give up their right of way and yield to other vehicles to optimize traffic flow. We implemented our approach and tested it in the traffic simulator SUMO based on simulated and real traffic data. The experimental evaluation demonstrates that our method significantly improves traffic flow through unsignalized intersections in mixed traffic settings and also provides better performance on a wide range of traffic situations compared to the state-of-the-art traffic signal controller for the corresponding signalized intersection. △ Less

Submitted 11 June, 2021; originally announced June 2021.

arXiv:2101.05325 [pdf, ps, other]

Learning Kinematic Feasibility for Mobile Manipulation through Deep Reinforcement Learning

Authors: Daniel Honerkamp, Tim Welschehold, Abhinav Valada

Abstract: Mobile manipulation tasks remain one of the critical challenges for the widespread adoption of autonomous robots in both service and industrial scenarios. While planning approaches are good at generating feasible whole-body robot trajectories, they struggle with dynamic environments as well as the incorporation of constraints given by the task and the environment. On the other hand, dynamic motion… ▽ More Mobile manipulation tasks remain one of the critical challenges for the widespread adoption of autonomous robots in both service and industrial scenarios. While planning approaches are good at generating feasible whole-body robot trajectories, they struggle with dynamic environments as well as the incorporation of constraints given by the task and the environment. On the other hand, dynamic motion models in the action space struggle with generating kinematically feasible trajectories for mobile manipulation actions. We propose a deep reinforcement learning approach to learn feasible dynamic motions for a mobile base while the end-effector follows a trajectory in task space generated by an arbitrary system to fulfill the task at hand. This modular formulation has several benefits: it enables us to readily transform a broad range of end-effector motions into mobile applications, it allows us to use the kinematic feasibility of the end-effector trajectory as a dense reward signal and its modular formulation allows it to generalise to unseen end-effector motions at test time. We demonstrate the capabilities of our approach on multiple mobile robot platforms with different kinematic abilities and different types of wheeled platforms in extensive simulated as well as real-world experiments. △ Less

Submitted 18 June, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

Comments: Accepted for publication in RA-L. Code and Models: http://rl.uni-freiburg.de/research/kinematic-feasibility-rl

Journal ref: IEEE Robotics and Automation Letters (RA-L), vol. 6, no. 4, pp. 6289-6296, 2021

arXiv:1908.10184 [pdf, other]

Combined Task and Action Learning from Human Demonstrations for Mobile Manipulation Applications

Authors: Tim Welschehold, Nichola Abdo, Christian Dornhege, Wolfram Burgard

Abstract: Learning from demonstrations is a promising paradigm for transferring knowledge to robots. However, learning mobile manipulation tasks directly from a human teacher is a complex problem as it requires learning models of both the overall task goal and of the underlying actions. Additionally, learning from a small number of demonstrations often introduces ambiguity with respect to the intention of t… ▽ More Learning from demonstrations is a promising paradigm for transferring knowledge to robots. However, learning mobile manipulation tasks directly from a human teacher is a complex problem as it requires learning models of both the overall task goal and of the underlying actions. Additionally, learning from a small number of demonstrations often introduces ambiguity with respect to the intention of the teacher, making it challenging to commit to one model for generalizing the task to new settings. In this paper, we present an approach to learning flexible mobile manipulation action models and task goal representations from teacher demonstrations. Our action models enable the robot to consider different likely outcomes of each action and to generate feasible trajectories for achieving them. Accordingly, we leverage a probabilistic framework based on Monte Carlo tree search to compute sequences of feasible actions imitating the teacher intention in new settings without requiring the teacher to specify an explicit goal state. We demonstrate the effectiveness of our approach in complex tasks carried out in real-world settings. △ Less

Submitted 25 August, 2019; originally announced August 2019.

Comments: Accepted for publication at Iros 2019

arXiv:1803.02622 [pdf, other]

3D Human Pose Estimation in RGBD Images for Robotic Task Learning

Authors: Christian Zimmermann, Tim Welschehold, Christian Dornhege, Wolfram Burgard, Thomas Brox

Abstract: We propose an approach to estimate 3D human pose in real world units from a single RGBD image and show that it exceeds performance of monocular 3D pose estimation approaches from color as well as pose estimation exclusively from depth. Our approach builds on robust human keypoint detectors for color images and incorporates depth for lifting into 3D. We combine the system with our learning from dem… ▽ More We propose an approach to estimate 3D human pose in real world units from a single RGBD image and show that it exceeds performance of monocular 3D pose estimation approaches from color as well as pose estimation exclusively from depth. Our approach builds on robust human keypoint detectors for color images and incorporates depth for lifting into 3D. We combine the system with our learning from demonstration framework to instruct a service robot without the need of markers. Experiments in real world settings demonstrate that our approach enables a PR2 robot to imitate manipulation actions observed from a human teacher. △ Less

Submitted 13 March, 2018; v1 submitted 7 March, 2018; originally announced March 2018.

Comments: Accepted to ICRA 2018. Video and Code (ROS node) are available: http://lmb.informatik.uni-freiburg.de/projects/rgbd-pose3d/

Showing 1–21 of 21 results for author: Welschehold, T