
Showing 1–50 of 167 results for author: Behnke, S

  1. arXiv:2405.19921  [pdf, other]

    cs.CV cs.RO

    MCDS-VSS: Moving Camera Dynamic Scene Video Semantic Segmentation by Filtering with Self-Supervised Geometry and Motion

    Authors: Angel Villar-Corrales, Moritz Austermann, Sven Behnke

    Abstract: Autonomous systems, such as self-driving cars, rely on reliable semantic environment perception for decision making. Despite great advances in video semantic segmentation, existing approaches ignore important inductive biases and lack structured and interpretable internal representations. In this work, we propose MCDS-VSS, a structured filter model that learns in a self-supervised manner to estima…

    Submitted 30 May, 2024; originally announced May 2024.

  2. arXiv:2404.09736  [pdf, other]

    cs.CV

    FSRT: Facial Scene Representation Transformer for Face Reenactment from Factorized Appearance, Head-pose, and Facial Expression Features

    Authors: Andre Rochow, Max Schwarz, Sven Behnke

    Abstract: The task of face reenactment is to transfer the head motion and facial expressions from a driving video to the appearance of a source image, which may be of a different person (cross-reenactment). Most existing methods are CNN-based and estimate optical flow from the source image to the current driving frame, which is then inpainted and refined to produce the output animation. We propose a transfo…

    Submitted 10 June, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  3. arXiv:2404.06277  [pdf, other]

    cs.CV

    Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping

    Authors: Anas Gouda, Max Schwarz, Christopher Reining, Sven Behnke, Alice Kirchheim

    Abstract: Foundation models are a strong trend in deep learning and computer vision. These models serve as a base for applications as they require minor or no further fine-tuning by developers to integrate into their applications. Foundation models for zero-shot object segmentation such as Segment Anything (SAM) output segmentation masks from images without any further object information. When they are foll…

    Submitted 9 April, 2024; originally announced April 2024.

  4. arXiv:2403.15306  [pdf, other]

    cs.RO

    HortiBot: An Adaptive Multi-Arm System for Robotic Horticulture of Sweet Peppers

    Authors: Christian Lenz, Rohit Menon, Michael Schreiber, Melvin Paul Jacob, Sven Behnke, Maren Bennewitz

    Abstract: Horticultural tasks such as pruning and selective harvesting are labor intensive and horticultural staff are hard to find. Automating these tasks is challenging due to the semi-structured greenhouse workspaces, changing environmental conditions such as lighting, dense plant growth with many occlusions, and the need for gentle manipulation of non-rigid plant organs. In this work, we present the thr…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: Submitted to International Conference on Intelligent Robots and Systems (IROS) 2024. C. Lenz and R. Menon contributed equally

  5. arXiv:2403.10187  [pdf, other]

    cs.RO cs.AI cs.LG

    Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects

    Authors: Malte Mosbach, Sven Behnke

    Abstract: Interactive grasping from clutter, akin to human dexterity, is one of the longest-standing problems in robot learning. Challenges stem from the intricacies of visual perception, the demand for precise motor skills, and the complex interplay between the two. In this work, we present Teacher-Augmented Policy Gradient (TAPG), a novel two-stage learning framework that synergizes reinforcement learning…

    Submitted 15 March, 2024; originally announced March 2024.

  6. arXiv:2403.09309  [pdf, other]

    cs.RO

    MOTPose: Multi-object 6D Pose Estimation for Dynamic Video Sequences using Attention-based Temporal Fusion

    Authors: Arul Selvam Periyasamy, Sven Behnke

    Abstract: Cluttered bin-picking environments are challenging for pose estimation models. Despite the impressive progress enabled by deep learning, single-view RGB pose estimation models perform poorly in cluttered dynamic environments. Imbuing the rich temporal information contained in the video of scenes has the potential to enhance models' ability to deal with the adverse effects of occlusion and the dynam…

    Submitted 14 March, 2024; originally announced March 2024.

  7. arXiv:2403.08885  [pdf, other]

    cs.CV cs.AI cs.RO

    SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net

    Authors: Helin Cao, Sven Behnke

    Abstract: We introduce SLCF-Net, a novel approach for the Semantic Scene Completion (SSC) task that sequentially fuses LiDAR and camera data. It jointly estimates missing geometry and semantics in a scene from sequences of RGB images and sparse LiDAR measurements. The images are semantically segmented by a pre-trained 2D U-Net and a dense depth prior is estimated from a depth-conditioned pipeline fueled by…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 2024 IEEE International Conference on Robotics and Automation (ICRA2024), Yokohama, Japan, May 2024

  8. arXiv:2402.19305  [pdf, other]

    cs.CV

    HyenaPixel: Global Image Context with Convolutions

    Authors: Julian Spravil, Sebastian Houben, Sven Behnke

    Abstract: In computer vision, a larger effective receptive field (ERF) is associated with better performance. While attention natively supports global context, its quadratic complexity limits its applicability to tasks that benefit from high-resolution input. In this work, we extend Hyena, a convolution-based attention replacement, from causal sequences to bidirectional data and two-dimensional image space.…

    Submitted 23 May, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

  9. arXiv:2402.01287  [pdf, other]

    cs.CV cs.LG cs.NE

    Spiking CenterNet: A Distillation-boosted Spiking Neural Network for Object Detection

    Authors: Lennard Bodden, Franziska Schwaiger, Duc Bach Ha, Lars Kreuzberg, Sven Behnke

    Abstract: In the era of AI at the edge, self-driving cars, and climate change, the need for energy-efficient, small, embedded AI is growing. Spiking Neural Networks (SNNs) are a promising approach to address this challenge, with their event-driven information flow and sparse activations. We propose Spiking CenterNet for object detection on event data. It combines an SNN CenterNet adaptation with an efficien…

    Submitted 6 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 8 pages, 5 figures. Accepted at IJCNN 2024

  10. arXiv:2401.05909  [pdf, other]

    cs.RO

    RoboCup 2023 Humanoid AdultSize Winner NimbRo: NimbRoNet3 Visual Perception and Responsive Gait with Waveform In-walk Kicks

    Authors: Dmytro Pavlichenko, Grzegorz Ficht, Angel Villar-Corrales, Luis Denninger, Julia Brocker, Tim Sinen, Michael Schreiber, Sven Behnke

    Abstract: The RoboCup Humanoid League holds annual soccer robot world championships towards the long-term objective of winning against the FIFA world champions by 2050. The participating teams continuously improve their systems. This paper presents the upgrades to our humanoid soccer system, leading our team NimbRo to win the Soccer Tournament in the Humanoid AdultSize League at RoboCup 2023 in Bordeaux, Fr…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: Accepted for: RoboCup 2023: Robot World Cup XXVI, LNCS, Springer, to appear 2024

  11. arXiv:2401.05290  [pdf, other]

    cs.RO cs.HC

    Analysis and Perspectives on the ANA Avatar XPRIZE Competition

    Authors: Kris Hauser, Eleanor Watson, Joonbum Bae, Josh Bankston, Sven Behnke, Bill Borgia, Manuel G. Catalano, Stefano Dafarra, Jan B. F. van Erp, Thomas Ferris, Jeremy Fishel, Guy Hoffman, Serena Ivaldi, Fumio Kanehiro, Abderrahmane Kheddar, Gaelle Lannuzel, Jacqueline Ford Morie, Patrick Naughton, Steve NGuyen, Paul Oh, Taskin Padir, Jim Pippine, Jaeheung Park, Daniele Pucci, Jean Vaz, et al. (3 additional authors not shown)

    Abstract: The ANA Avatar XPRIZE was a four-year competition to develop a robotic "avatar" system to allow a human operator to sense, communicate, and act in a remote environment as though physically present. The competition featured a unique requirement that judges would operate the avatars after less than one hour of training on the human-machine interfaces, and avatar systems were judged on both objective…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 26 pages, preprint of article appearing in International Journal of Social Robotics

  12. arXiv:2312.11019  [pdf, other]

    cs.RO

    Centroidal State Estimation and Control for Hardware-constrained Humanoid Robots

    Authors: Grzegorz Ficht, Sven Behnke

    Abstract: We introduce novel methods for state estimation, feedforward and feedback control, which specifically target humanoid robots with hardware limitations. Our method combines a five-mass model with approximate dynamics of each mass. It enables acquiring an accurate assessment of the centroidal state and Center of Pressure, even when direct forms of force or contact sensing are unavailable. Upon this,…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: IEEE-RAS 22nd International Conference on Humanoid Robots (Humanoids), Austin, USA, December 2023

  13. arXiv:2312.09750  [pdf, other]

    cs.CV cs.RO

    Attention-Based VR Facial Animation with Visual Mouth Camera Guidance for Immersive Telepresence Avatars

    Authors: Andre Rochow, Max Schwarz, Sven Behnke

    Abstract: Facial animation in virtual reality environments is essential for applications that necessitate clear visibility of the user's face and the ability to convey emotional signals. In our scenario, we animate the face of an operator who controls a robotic Avatar system. The use of facial animation is particularly valuable when the perception of interacting with a specific individual, rather than just…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: Published in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2023

  14. arXiv:2312.08268  [pdf, other]

    cs.CV

    Efficient Multi-Object Pose Estimation using Multi-Resolution Deformable Attention and Query Aggregation

    Authors: Arul Selvam Periyasamy, Vladimir Tsaturyan, Sven Behnke

    Abstract: Object pose estimation is a long-standing problem in computer vision. Recently, attention-based vision transformer models have achieved state-of-the-art results in many computer vision applications. Exploiting the permutation-invariant nature of the attention mechanism, a family of vision transformer models formulate multi-object pose estimation as a set prediction problem. However, existing visio…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: IEEE International Conference on Robotic Computing (IRC), Laguna Hills, USA, December 2023

  15. Field Robot for High-throughput and High-resolution 3D Plant Phenotyping

    Authors: Felix Esser, Radu Alexandru Rosu, André Cornelißen, Lasse Klingbeil, Heiner Kuhlmann, Sven Behnke

    Abstract: With the need to feed a growing world population, the efficiency of crop production is of paramount importance. To support breeding and field management, various characteristics of the plant phenotype need to be measured -- a time-consuming process when performed manually. We present a robotic platform equipped with multiple laser and camera sensors for high-throughput, high-resolution in-field pl…

    Submitted 5 November, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted for IEEE Robotics and Automation Magazine

  16. arXiv:2309.15562  [pdf, other]

    cs.CV

    Learning from SAM: Harnessing a Foundation Model for Sim2Real Adaptation by Regularization

    Authors: Mayara E. Bonani, Max Schwarz, Sven Behnke

    Abstract: Domain adaptation is especially important for robotics applications, where target domain training data is usually scarce and annotations are costly to obtain. We present a method for self-supervised domain adaptation for the scenario where annotated source domain data (e.g. from synthetic generation) is available, but the target domain data is completely unannotated. Our method targets the semanti…

    Submitted 10 May, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

  17. arXiv:2308.12238  [pdf, other]

    cs.RO

    NimbRo wins ANA Avatar XPRIZE Immersive Telepresence Competition: Human-Centric Evaluation and Lessons Learned

    Authors: Christian Lenz, Max Schwarz, Andre Rochow, Bastian Pätzold, Raphael Memmesheimer, Michael Schreiber, Sven Behnke

    Abstract: Robotic avatar systems can enable immersive telepresence with locomotion, manipulation, and communication capabilities. We present such an avatar system, based on the key components of immersive 3D visualization and transparent force-feedback telemanipulation. Our avatar robot features an anthropomorphic upper body with dexterous hands. The remote human operator drives the arms and fingers through…

    Submitted 28 August, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: C. Lenz and M. Schwarz contributed equally. Accepted for International Journal of Social Robotics (SORO), Springer, to appear 2023

  18. arXiv:2308.07878  [pdf]

    cs.RO cs.AI cs.HC

    The $10 Million ANA Avatar XPRIZE Competition Advanced Immersive Telepresence Systems

    Authors: Sven Behnke, Julie A. Adams, David Locke

    Abstract: The $10M ANA Avatar XPRIZE aimed to create avatar systems that can transport human presence to remote locations in real time. The participants of this multi-year competition developed robotic systems that allow operators to see, hear, and interact with a remote environment in a way that feels as if they are truly there. On the other hand, people in the remote environment were given the impression…

    Submitted 15 August, 2023; originally announced August 2023.

    Comments: Extended version of article accepted for competitions column

    Journal ref: IEEE Robotics and Automation Magazine, 2023

  19. arXiv:2307.16752  [pdf, other]

    cs.RO

    Deep Reinforcement Learning of Dexterous Pre-grasp Manipulation for Human-like Functional Categorical Grasping

    Authors: Dmytro Pavlichenko, Sven Behnke

    Abstract: Many objects such as tools and household items can be used only if grasped in a very specific way - grasped functionally. Often, a direct functional grasp is not possible, though. We propose a method for learning a dexterous pre-grasp manipulation policy to achieve human-like functional grasps using deep reinforcement learning. We introduce a dense multi-component reward function that enables lear…

    Submitted 31 July, 2023; originally announced July 2023.

    Comments: Accepted for IEEE 19th International Conference on Automation Science and Engineering (CASE), Auckland, New Zealand, August 2023

  20. arXiv:2307.16499  [pdf, other]

    cs.RO cs.LG

    Learning Generalizable Tool Use with Non-rigid Grasp-pose Registration

    Authors: Malte Mosbach, Sven Behnke

    Abstract: Tool use, a hallmark feature of human intelligence, remains a challenging problem in robotics due to the complex contacts and high-dimensional action space. In this work, we present a novel method to enable reinforcement learning of tool use behaviors. Our approach provides a scalable way to learn the operation of tools in a new category using only a single demonstration. To this end, we propose a ne…

    Submitted 1 August, 2023; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: Accepted for publication at IEEE CASE 2023

  21. arXiv:2307.12292  [pdf, other]

    cs.RO

    Quadrupedal Footstep Planning using Learned Motion Models of a Black-Box Controller

    Authors: Ilyass Taouil, Giulio Turrisi, Daniel Schleich, Victor Barasuol, Claudio Semini, Sven Behnke

    Abstract: Legged robots are increasingly entering new domains and applications, including search and rescue, inspection, and logistics. However, for such systems to be valuable in real-world scenarios, they must be able to autonomously and robustly navigate irregular terrains. In many cases, robots that are sold on the market do not provide such abilities, being able to perform only blind locomotion. Furthe…

    Submitted 23 July, 2023; originally announced July 2023.

  22. arXiv:2307.11550  [pdf, other]

    cs.CV

    YOLOPose V2: Understanding and Improving Transformer-based 6D Pose Estimation

    Authors: Arul Selvam Periyasamy, Arash Amini, Vladimir Tsaturyan, Sven Behnke

    Abstract: 6D object pose estimation is a crucial prerequisite for autonomous robot manipulation applications. The state-of-the-art models for pose estimation are convolutional neural network (CNN)-based. Lately, Transformers, an architecture originally proposed for natural language processing, is achieving state-of-the-art results in many computer vision tasks as well. Equipped with the multi-head self-atte…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: Robotics and Autonomous Systems Journal, Elsevier, to appear 2023. arXiv admin note: substantial text overlap with arXiv:2205.02536

  23. VR Facial Animation for Immersive Telepresence Avatars

    Authors: Andre Rochow, Max Schwarz, Michael Schreiber, Sven Behnke

    Abstract: VR Facial Animation is necessary in applications requiring clear view of the face, even though a VR headset is worn. In our case, we aim to animate the face of an operator who is controlling our robotic avatar system. We propose a real-time capable pipeline with very fast adaptation for specific operators. In a quick enrollment step, we capture a sequence of source images from the operator without…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: Published in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022

  24. arXiv:2303.07186  [pdf, other]

    cs.RO

    Audio-based Roughness Sensing and Tactile Feedback for Haptic Perception in Telepresence

    Authors: Bastian Pätzold, Andre Rochow, Michael Schreiber, Raphael Memmesheimer, Christian Lenz, Max Schwarz, Sven Behnke

    Abstract: Haptic perception is highly important for immersive teleoperation of robots, especially for accomplishing manipulation tasks. We propose a low-cost haptic sensing and rendering system, which is capable of detecting and displaying surface roughness. As the robot fingertip moves across a surface of interest, two microphones capture sound coupled directly through the fingertip and through the air, re…

    Submitted 16 October, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: IEEE International Conference on Systems, Man, and Cybernetics (SMC), Honolulu, Hawaii, USA, October 2023

  25. arXiv:2303.03797  [pdf, other]

    cs.RO cs.CV

    External Camera-based Mobile Robot Pose Estimation for Collaborative Perception with Smart Edge Sensors

    Authors: Simon Bultmann, Raphael Memmesheimer, Sven Behnke

    Abstract: We present an approach for estimating a mobile robot's pose w.r.t. the allocentric coordinates of a network of static cameras using multi-view RGB images. The images are processed online, locally on smart edge sensors by deep neural networks to detect the robot and estimate 2D keypoints defined at distinctive positions of the 3D robot model. Robot keypoint detections are synchronized and fused on…

    Submitted 7 March, 2023; originally announced March 2023.

    Comments: Accepted for ICRA 2023, 7 pages, 8 figures

  26. arXiv:2303.03297  [pdf, other]

    cs.RO

    Robust Immersive Telepresence and Mobile Telemanipulation: NimbRo wins ANA Avatar XPRIZE Finals

    Authors: Max Schwarz, Christian Lenz, Raphael Memmesheimer, Bastian Pätzold, Andre Rochow, Michael Schreiber, Sven Behnke

    Abstract: Robotic avatar systems promise to bridge distances and reduce the need for travel. We present the updated NimbRo avatar system, winner of the $5M grand prize at the international ANA Avatar XPRIZE competition, which required participants to build intuitive and immersive robotic telepresence systems that could be operated by briefly trained operators. We describe key improvements for the finals, co…

    Submitted 6 December, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: Accepted for IEEE-RAS International Conference on Humanoid Robots (HUMANOIDS) 2023. M. Schwarz and C. Lenz contributed equally

  27. arXiv:2302.11850  [pdf, other]

    cs.CV

    Object-Centric Video Prediction via Decoupling of Object Dynamics and Interactions

    Authors: Angel Villar-Corrales, Ismail Wahdan, Sven Behnke

    Abstract: We propose a novel framework for the task of object-centric video prediction, i.e., extracting the compositional structure of a video sequence, as well as modeling object dynamics and interactions from visual observations in order to predict the future object states, from which we can then generate subsequent video frames. With the goal of learning meaningful spatio-temporal object representation…

    Submitted 31 July, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: Accepted for publication at IEEE International Conference on Image Processing (ICIP) 2023

  28. arXiv:2302.02956  [pdf, other]

    cs.RO

    RoboCup 2022 AdultSize Winner NimbRo: Upgraded Perception, Capture Steps Gait and Phase-based In-walk Kicks

    Authors: Dmytro Pavlichenko, Grzegorz Ficht, Arash Amini, Mojtaba Hosseini, Raphael Memmesheimer, Angel Villar-Corrales, Stefan M. Schulz, Marcell Missura, Maren Bennewitz, Sven Behnke

    Abstract: Beating the human world champions by 2050 is an ambitious goal of the Humanoid League that provides a strong incentive for RoboCup teams to further improve and develop their systems. In this paper, we present upgrades of our system which enabled our team NimbRo to win the Soccer Tournament, the Drop-in Games, and the Technical Challenges in the Humanoid AdultSize League of RoboCup 2022. Strong per…

    Submitted 7 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Journal ref: In: RoboCup 2022: Robot World Cup XXV. LNCS 13561, Springer, May 2023

  29. Rendering the Directional TSDF for Tracking and Multi-Sensor Registration with Point-To-Plane Scale ICP

    Authors: Malte Splietker, Sven Behnke

    Abstract: Dense real-time tracking and mapping from RGB-D images is an important tool for many robotic applications, such as navigation and manipulation. The recently presented Directional Truncated Signed Distance Function (DTSDF) is an augmentation of the regular TSDF that shows potential for more coherent maps and improved tracking performance. In this work, we present methods for rendering depth- and co…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: Published in Robotics and Autonomous Systems, 2023. arXiv admin note: substantial text overlap with arXiv:2108.08115

  30. Bimanual Telemanipulation with Force and Haptic Feedback through an Anthropomorphic Avatar System

    Authors: Christian Lenz, Sven Behnke

    Abstract: Robotic teleoperation is a key technology for a wide variety of applications. It allows sending robots instead of humans in remote, possibly dangerous locations while still using the human brain with its enormous knowledge and creativity, especially for solving unexpected problems. A main challenge in teleoperation consists of providing enough feedback to the human operator for situation awareness…

    Submitted 2 January, 2023; originally announced January 2023.

    Comments: Published in Robotics and Autonomous Systems, 2022 (https://doi.org/10.1016/j.robot.2022.104338). arXiv admin note: substantial text overlap with arXiv:2109.13382

  31. arXiv:2212.09354  [pdf, other]

    cs.RO

    Lessons from Robot-Assisted Disaster Response Deployments by the German Rescue Robotics Center Task Force

    Authors: Hartmut Surmann, Ivana Kruijff-Korbayova, Kevin Daun, Marius Schnaubelt, Oskar von Stryk, Manuel Patchou, Stefan Boecker, Christian Wietfeld, Jan Quenzel, Daniel Schleich, Sven Behnke, Robert Grafe, Nils Heidemann, Dominik Slomma

    Abstract: Earthquakes, fire, and floods often cause structural collapses of buildings. The inspection of damaged buildings poses a high risk for emergency forces or is even impossible, though. We present three recent selected missions of the Robotics Task Force of the German Rescue Robotics Center, where both ground and aerial robots were used to explore destroyed buildings. We describe and reflect the miss…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 8 pages, 9 figures, PrePrint, Technical Report, German Rescue Robotic Center

  32. arXiv:2212.02126  [pdf, other]

    cs.RO cs.AI cs.LG

    Accelerating Interactive Human-like Manipulation Learning with GPU-based Simulation and High-quality Demonstrations

    Authors: Malte Mosbach, Kara Moraw, Sven Behnke

    Abstract: Dexterous manipulation with anthropomorphic robot hands remains a challenging problem in robotics because of the high-dimensional state and action spaces and complex contacts. Nevertheless, skillful closed-loop manipulation is required to enable humanoid robots to operate in unstructured real-world environments. Reinforcement learning (RL) has traditionally imposed enormous interaction data requir…

    Submitted 5 December, 2022; originally announced December 2022.

    Journal ref: 2022 IEEE-RAS 21st International Conference on Humanoid Robots (Humanoids) 435-441

  33. arXiv:2211.12562  [pdf, other]

    cs.CV

    PermutoSDF: Fast Multi-View Reconstruction with Implicit Surfaces using Permutohedral Lattices

    Authors: Radu Alexandru Rosu, Sven Behnke

    Abstract: Neural radiance-density field methods have become increasingly popular for the task of novel-view rendering. Their recent extension to hash-based positional encoding ensures fast training and inference with visually pleasing results. However, density-based methods struggle with recovering accurate surface geometry. Hybrid methods alleviate this issue by optimizing the density based on an underlyin…

    Submitted 28 March, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: Accepted to CVPR 2023

  34. arXiv:2211.11394  [pdf, other]

    cs.CV cs.RO

    Learning Implicit Probability Distribution Functions for Symmetric Orientation Estimation from RGB Images Without Pose Labels

    Authors: Arul Selvam Periyasamy, Luis Denninger, Sven Behnke

    Abstract: Object pose estimation is a necessary prerequisite for autonomous robotic manipulation, but the presence of symmetry increases the complexity of the pose estimation task. Existing methods for object pose estimation output a single 6D pose. Thus, they lack the ability to reason about symmetries. Lately, modeling object orientation as a non-parametric probability distribution on the SO(3) manifold b…

    Submitted 21 November, 2022; originally announced November 2022.

  35. arXiv:2211.11390  [pdf, other]

    cs.RO

    State Estimation for Hybrid Locomotion of Driving-Stepping Quadrupeds

    Authors: Mojtaba Hosseini, Diego Rodriguez, Sven Behnke

    Abstract: Fast and versatile locomotion can be achieved with wheeled quadruped robots that drive quickly on flat terrain, but are also able to overcome challenging terrain by adapting their body pose and by making steps. In this paper, we present a state estimation approach for four-legged robots with non-steerable wheels that enables hybrid driving-stepping locomotion capabilities. We formulate a Kalman Fi…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted final version. IEEE International Robotic Computing (IRC), Naples, Italy, December 2022

  36. arXiv:2211.11354  [pdf, other]

    cs.CV cs.RO

    Object-level 3D Semantic Mapping using a Network of Smart Edge Sensors

    Authors: Julian Hau, Simon Bultmann, Sven Behnke

    Abstract: Autonomous robots that interact with their environment require a detailed semantic scene model. For this, volumetric semantic maps are frequently used. The scene understanding can further be improved by including object-level information in the map. In this work, we extend a multi-view 3D semantic mapping system consisting of a network of distributed smart edge sensors with object-level informatio…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: 9 pages, 12 figures, 6th IEEE International Conference on Robotic Computing (IRC), Naples, Italy, December 2022

  37. arXiv:2211.10957  [pdf, other]

    cs.RO cs.AI cs.LG

    Efficient Representations of Object Geometry for Reinforcement Learning of Interactive Grasping Policies

    Authors: Malte Mosbach, Sven Behnke

    Abstract: Grasping objects of different shapes and sizes - a foundational, effortless skill for humans - remains a challenging task in robotics. Although model-based approaches can predict stable grasp configurations for known object models, they struggle to generalize to novel objects and often operate in a non-interactive open-loop manner. In this work, we present a reinforcement learning framework that l…

    Submitted 20 November, 2022; originally announced November 2022.

  38. Real-Time Multi-Modal Semantic Fusion on Unmanned Aerial Vehicles with Label Propagation for Cross-Domain Adaptation

    Authors: Simon Bultmann, Jan Quenzel, Sven Behnke

    Abstract: Unmanned aerial vehicles (UAVs) equipped with multiple complementary sensors have tremendous potential for fast autonomous or remote-controlled semantic scene analysis, e.g., for disaster examination. Here, we propose a UAV system for real-time semantic inference and fusion of multiple sensor modalities. Semantic segmentation of LiDAR scans and RGB images, as well as object detection on RGB and th…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: 35 pages, 18 figures, Accepted for Robotics and Autonomous Systems, Elsevier. arXiv admin note: substantial text overlap with arXiv:2108.06608

  39. arXiv:2209.07393  [pdf, other]

    cs.CV cs.RO

    Online Marker-free Extrinsic Camera Calibration using Person Keypoint Detections

    Authors: Bastian Pätzold, Simon Bultmann, Sven Behnke

    Abstract: Calibration of multi-camera systems, i.e. determining the relative poses between the cameras, is a prerequisite for many tasks in computer vision and robotics. Camera calibration is typically achieved using offline methods that use checkerboard calibration targets. These methods, however, often are cumbersome and lengthy, considering that a new calibration is required each time any camera pose cha…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: DAGM German Conference on Pattern Recognition (GCPR), Konstanz, September 2022

  40. arXiv:2208.05873  [pdf, other]

    cs.RO

    Predictive Angular Potential Field-based Obstacle Avoidance for Dynamic UAV Flights

    Authors: Daniel Schleich, Sven Behnke

    Abstract: In recent years, unmanned aerial vehicles (UAVs) are used for numerous inspection and video capture tasks. Manually controlling UAVs in the vicinity of obstacles is challenging, however, and poses a high risk of collisions. Even for autonomous flight, global navigation planning might be too slow to react to newly perceived obstacles. Disturbances such as wind might lead to deviations from the plan…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: Accepted for IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, to appear October 2022

  41. arXiv:2208.04757  [pdf, other]

    cs.RO

    Direct Centroidal Control for Balanced Humanoid Locomotion

    Authors: Grzegorz Ficht, Sven Behnke

    Abstract: We present an integrated approach to locomotion and balancing of humanoid robots based on direct centroidal control. Our method uses a five-mass description of a humanoid. It generates whole-body motions from desired foot trajectories and centroidal parameters of the robot. A set of simplified models is used to formulate general and intuitive control laws, which are then applied in real-time for e…

    Submitted 9 August, 2022; originally announced August 2022.

    Comments: 25th International Conference Series on Climbing and Walking Robots (CLAWAR), Ponta Delgada, Azores, Portugal, September 2022

  42. arXiv:2207.14067  [pdf, other]

    cs.CV cs.GR

    Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images

    Authors: Radu Alexandru Rosu, Shunsuke Saito, Ziyan Wang, Chenglei Wu, Sven Behnke, Giljoo Nam

    Abstract: We present Neural Strands, a novel learning framework for modeling accurate hair geometry and appearance from multi-view image inputs. The learned hair model can be rendered in real-time from any viewpoint with high-fidelity view-dependent effects. Our model achieves intuitive shape and style control unlike volumetric counterparts. To enable these properties, we propose a novel hair representation…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: ECCV 2022. Project page: https://radualexandru.github.io/neural_strands/

  43. arXiv:2206.00930  [pdf, other]

    cs.CV cs.RO

    Predicting Physical Object Properties from Video

    Authors: Martin Link, Max Schwarz, Sven Behnke

    Abstract: We present a novel approach to estimating physical properties of objects from video. Our approach consists of a physics engine and a correction estimator. Starting from the initial observed state, object behavior is simulated forward in time. Based on the simulated and observed behavior, the correction estimator then determines refined physical parameters for each object. The method can be iterate…

    Submitted 2 June, 2022; originally announced June 2022.

    Comments: accepted for International Joint Conference on Neural Networks (IJCNN) 2022

  44. ConvPoseCNN2: Prediction and Refinement of Dense 6D Object Poses

    Authors: Arul Selvam Periyasamy, Catherine Capellen, Max Schwarz, Sven Behnke

    Abstract: Object pose estimation is a key perceptual capability in robotics. We propose a fully-convolutional extension of the PoseCNN method, which densely predicts object translations and orientations. This has several advantages such as improving the spatial resolution of the orientation predictions -- useful in highly-cluttered arrangements, significant reduction in parameters by avoiding full connectiv…

    Submitted 23 May, 2022; originally announced May 2022.

    Journal ref: Communications in Computer and Information Science (CCIS), vol. 1474, pp. 353-371, Springer, 2022

  45. arXiv:2205.02536  [pdf, other]

    cs.CV

    YOLOPose: Transformer-based Multi-Object 6D Pose Estimation using Keypoint Regression

    Authors: Arash Amini, Arul Selvam Periyasamy, Sven Behnke

    Abstract: 6D object pose estimation is a crucial prerequisite for autonomous robot manipulation applications. The state-of-the-art models for pose estimation are convolutional neural network (CNN)-based. Lately, Transformers, an architecture originally proposed for natural language processing, is achieving state-of-the-art results in many computer vision tasks as well. Equipped with the multi-head self-atte…

    Submitted 5 May, 2022; originally announced May 2022.

  46. arXiv:2205.02202  [pdf, other]

    cs.RO

    Two-step Planning of Dynamic UAV Trajectories using Iterative $δ$-Spaces

    Authors: Sebastian Schräder, Daniel Schleich, Sven Behnke

    Abstract: UAV trajectory planning is often done in a two-step approach, where a low-dimensional path is refined to a dynamic trajectory. The resulting trajectories are only locally optimal, however. On the other hand, direct planning in higher-dimensional state spaces generates globally optimal solutions but is time-consuming and thus infeasible for time-constrained applications. To address this issue, we p…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted for 17th International Conference on Intelligent Autonomous Systems (IAS), Zagreb, Croatia, to appear June 2022

  47. arXiv:2205.01460  [pdf, other]

    cs.CV cs.RO

    3D Semantic Scene Perception using Distributed Smart Edge Sensors

    Authors: Simon Bultmann, Sven Behnke

    Abstract: We present a system for 3D semantic scene perception consisting of a network of distributed smart edge sensors. The sensor nodes are based on an embedded CNN inference accelerator and RGB-D and thermal cameras. Efficient vision CNN models for object detection, semantic segmentation, and human pose estimation run on-device in real time. 2D human keypoint estimations, augmented with the RGB-D depth…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: 17th International Conference on Intelligent Autonomous Systems (IAS), Zagreb, Croatia, June 2022

  48. arXiv:2203.15469  [pdf, other]

    cs.CV cs.LG

    Abstract Flow for Temporal Semantic Segmentation on the Permutohedral Lattice

    Authors: Peer Schütt, Radu Alexandru Rosu, Sven Behnke

    Abstract: Semantic segmentation is a core ability required by autonomous agents, as being able to distinguish which parts of the scene belong to which object class is crucial for navigation and interaction with the environment. Approaches which use only one time-step of data cannot distinguish between moving objects nor can they benefit from temporal integration. In this work, we extend a backbone LatticeNe…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: Accepted IEEE International Conference on Robotics and Automation (ICRA) 2022, Code available at https://github.com/AIS-Bonn/temporal_latticenet

  49. Synthetic-to-Real Domain Adaptation using Contrastive Unpaired Translation

    Authors: Benedikt T. Imbusch, Max Schwarz, Sven Behnke

    Abstract: The usefulness of deep learning models in robotics is largely dependent on the availability of training data. Manual annotation of training data is often infeasible. Synthetic data is a viable alternative, but suffers from domain gap. We propose a multi-step method to obtain training data without manual annotation effort: From 3D object meshes, we generate images using a modern synthesis pipeline.…

    Submitted 28 June, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Journal ref: 2022 IEEE 18th International Conference on Automation Science and Engineering (CASE), 2022, pp. 595-602

  50. arXiv:2203.09303  [pdf, other]

    cs.CV cs.RO

    MSPred: Video Prediction at Multiple Spatio-Temporal Scales with Hierarchical Recurrent Networks

    Authors: Angel Villar-Corrales, Ani Karapetyan, Andreas Boltres, Sven Behnke

    Abstract: Autonomous systems not only need to understand their current environment, but should also be able to predict future actions conditioned on past states, for instance based on captured camera frames. However, existing models mainly focus on forecasting future video frames for short time-horizons, hence being of limited use for long-term action planning. We propose Multi-Scale Hierarchical Prediction…

    Submitted 9 November, 2022; v1 submitted 17 March, 2022; originally announced March 2022.