Search | arXiv e-print repository

doi 10.1109/MRA.2023.3321402

Field Robot for High-throughput and High-resolution 3D Plant Phenotyping

Authors: Felix Esser, Radu Alexandru Rosu, André Cornelißen, Lasse Klingbeil, Heiner Kuhlmann, Sven Behnke

Abstract: With the need to feed a growing world population, the efficiency of crop production is of paramount importance. To support breeding and field management, various characteristics of the plant phenotype need to be measured -- a time-consuming process when performed manually. We present a robotic platform equipped with multiple laser and camera sensors for high-throughput, high-resolution in-field plant scanning. We create digital twins of the plants through 3D reconstruction. This allows the estimation of phenotypic traits such as leaf area, leaf angle, and plant height. We validate our system on a real field, where we reconstruct accurate point clouds and meshes of sugar beet, soybean, and maize.

Submitted 5 November, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

Comments: Accepted for IEEE Robotics and Automation Magazine

arXiv:2211.12562 [pdf, other]

PermutoSDF: Fast Multi-View Reconstruction with Implicit Surfaces using Permutohedral Lattices

Authors: Radu Alexandru Rosu, Sven Behnke

Abstract: Neural radiance-density field methods have become increasingly popular for the task of novel-view rendering. Their recent extension to hash-based positional encoding ensures fast training and inference with visually pleasing results. However, density-based methods struggle with recovering accurate surface geometry. Hybrid methods alleviate this issue by optimizing the density based on an underlying SDF. However, current SDF methods are overly smooth and miss fine geometric details. In this work, we combine the strengths of these two lines of work in a novel hash-based implicit surface representation. We propose improvements to the two areas by replacing the voxel hash encoding with a permutohedral lattice which optimizes faster, especially for higher dimensions. We additionally propose a regularization scheme which is crucial for recovering high-frequency geometric detail. We evaluate our method on multiple datasets and show that we can recover geometric detail at the level of pores and wrinkles while using only RGB images for supervision. Furthermore, using sphere tracing we can render novel views at 30 fps on an RTX 3090. Code is publicly available at: https://radualexandru.github.io/permuto_sdf

Submitted 28 March, 2023; v1 submitted 22 November, 2022; originally announced November 2022.

Comments: Accepted to CVPR 2023

arXiv:2207.14067 [pdf, other]

Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images

Authors: Radu Alexandru Rosu, Shunsuke Saito, Ziyan Wang, Chenglei Wu, Sven Behnke, Giljoo Nam

Abstract: We present Neural Strands, a novel learning framework for modeling accurate hair geometry and appearance from multi-view image inputs. The learned hair model can be rendered in real-time from any viewpoint with high-fidelity view-dependent effects. Our model achieves intuitive shape and style control unlike volumetric counterparts. To enable these properties, we propose a novel hair representation based on a neural scalp texture that encodes the geometry and appearance of individual strands at each texel location. Furthermore, we introduce a novel neural rendering framework based on rasterization of the learned hair strands. Our neural rendering is strand-accurate and anti-aliased, making the rendering view-consistent and photorealistic. Combining appearance with a multi-view geometric prior, we enable, for the first time, the joint learning of appearance and explicit hair geometry from a multi-view setup. We demonstrate the efficacy of our approach in terms of fidelity and efficiency for various hairstyles.

Submitted 28 July, 2022; originally announced July 2022.

Comments: ECCV 2022. Project page: https://radualexandru.github.io/neural_strands/

arXiv:2203.15469 [pdf, other]

Abstract Flow for Temporal Semantic Segmentation on the Permutohedral Lattice

Authors: Peer Schütt, Radu Alexandru Rosu, Sven Behnke

Abstract: Semantic segmentation is a core ability required by autonomous agents, as being able to distinguish which parts of the scene belong to which object class is crucial for navigation and interaction with the environment. Approaches which use only one time-step of data cannot distinguish between moving objects nor can they benefit from temporal integration. In this work, we extend a backbone LatticeNet to process temporal point cloud data. Additionally, we take inspiration from optical flow methods and propose a new module called Abstract Flow which allows the network to match parts of the scene with similar abstract features and gather the information temporally. We obtain state-of-the-art results on the SemanticKITTI dataset that contains LiDAR scans from real urban environments. We share the PyTorch implementation of TemporalLatticeNet at https://github.com/AIS-Bonn/temporal_latticenet .

Submitted 29 March, 2022; originally announced March 2022.

Comments: Accepted for IEEE International Conference on Robotics and Automation (ICRA) 2022. Code available at https://github.com/AIS-Bonn/temporal_latticenet

arXiv:2201.03844 [pdf, other]

Target Chase, Wall Building, and Fire Fighting: Autonomous UAVs of Team NimbRo at MBZIRC 2020

Authors: Marius Beul, Max Schwarz, Jan Quenzel, Malte Splietker, Simon Bultmann, Daniel Schleich, Andre Rochow, Dmytro Pavlichenko, Radu Alexandru Rosu, Patrick Lowin, Bruno Scheider, Michael Schreiber, Finn Süberkrüb, Sven Behnke

Abstract: The Mohamed Bin Zayed International Robotics Challenge (MBZIRC) 2020 posed diverse challenges for unmanned aerial vehicles (UAVs). We present our four tailored UAVs, specifically developed for individual aerial-robot tasks of MBZIRC, including custom hardware- and software components. In Challenge 1, a target UAV is pursued using a high-efficiency, onboard object detection pipeline to capture a ball from the target UAV. A second UAV uses a similar detection method to find and pop balloons scattered throughout the arena. For Challenge 2, we demonstrate a larger UAV capable of autonomous aerial manipulation: Bricks are found and tracked from camera images. Subsequently, they are approached, picked, transported, and placed on a wall. Finally, in Challenge 3, our UAV autonomously finds fires using LiDAR and thermal cameras. It extinguishes the fires with an onboard fire extinguisher. While every robot features task-specific subsystems, all UAVs rely on a standard software stack developed for this particular and future competitions. We present our mostly open-source software solutions, including tools for system configuration, monitoring, robust wireless communication, high-level control, and agile trajectory generation. For solving the MBZIRC 2020 tasks, we advanced the state of the art in multiple research areas like machine vision and trajectory generation. We present our scientific contributions that constitute the foundation for our algorithms and systems and analyze the results from the MBZIRC competition 2020 in Abu Dhabi, where our systems reached second place in the Grand Challenge. Furthermore, we discuss lessons learned from our participation in this complex robotic challenge.

Submitted 11 January, 2022; originally announced January 2022.

Comments: Accepted for Field Robotics, to appear 2022

arXiv:2108.03917 [pdf, other]

LatticeNet: Fast Spatio-Temporal Point Cloud Segmentation Using Permutohedral Lattices

Authors: Radu Alexandru Rosu, Peer Schütt, Jan Quenzel, Sven Behnke

Abstract: Deep convolutional neural networks (CNNs) have shown outstanding performance in the task of semantically segmenting images. Applying the same methods on 3D data still poses challenges due to the heavy memory requirements and the lack of structured data. Here, we propose LatticeNet, a novel approach for 3D semantic segmentation, which takes raw point clouds as input. A PointNet describes the local geometry which we embed into a sparse permutohedral lattice. The lattice allows for fast convolutions while keeping a low memory footprint. Further, we introduce DeformSlice, a novel learned data-dependent interpolation for projecting lattice features back onto the point cloud. We present results of 3D segmentation on multiple datasets where our method achieves state-of-the-art performance. We also extend and evaluate our network for instance and dynamic object segmentation.

Submitted 9 August, 2021; originally announced August 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:1912.05905

arXiv:2108.03880 [pdf, other]

NeuralMVS: Bridging Multi-View Stereo and Novel View Synthesis

Authors: Radu Alexandru Rosu, Sven Behnke

Abstract: Multi-View Stereo (MVS) is a core task in 3D computer vision. With the surge of novel deep learning methods, learned MVS has surpassed the accuracy of classical approaches, but still relies on building a memory intensive dense cost volume. Novel View Synthesis (NVS) is a parallel line of research and has recently seen an increase in popularity with Neural Radiance Field (NeRF) models, which optimize a per scene radiance field. However, NeRF methods do not generalize to novel scenes and are slow to train and test. We propose to bridge the gap between these two methodologies with a novel network that can recover 3D scene geometry as a distance function, together with high-resolution color images. Our method uses only a sparse set of images as input and can generalize well to novel scenes. Additionally, we propose a coarse-to-fine sphere tracing approach in order to significantly increase speed. We show on various datasets that our method reaches comparable accuracy to per-scene optimized methods while being able to generalize and running significantly faster. We provide the source code at https://github.com/AIS-Bonn/neural_mvs

Submitted 15 June, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

Comments: Accepted for International Joint Conference on Neural Networks (IJCNN) 2022. Code available at https://github.com/AIS-Bonn/neural_mvs

arXiv:2012.03325 [pdf, other]

EasyPBR: A Lightweight Physically-Based Renderer

Authors: Radu Alexandru Rosu, Sven Behnke

Abstract: Modern rendering libraries provide unprecedented realism, producing real-time photorealistic 3D graphics on commodity hardware. Visual fidelity, however, comes at the cost of increased complexity and difficulty of usage, with many rendering parameters requiring a deep understanding of the pipeline. We propose EasyPBR as an alternative rendering library that strikes a balance between ease-of-use and visual quality. EasyPBR consists of a deferred renderer that implements recent state-of-the-art approaches in physically based rendering. It offers an easy-to-use Python and C++ interface that allows high-quality images to be created in only a few lines of code or directly through a graphical user interface. The user can choose between fully controlling the rendering pipeline or letting EasyPBR automatically infer the best parameters based on the current scene composition. The EasyPBR library can help the community to more easily leverage the power of current GPUs to create realistic images. These can then be used as synthetic data for deep learning or for creating animations for academic purposes.

Submitted 6 December, 2020; originally announced December 2020.

arXiv:2010.14913 [pdf, other]

Visually Guided Balloon Popping with an Autonomous MAV at MBZIRC 2020

Authors: Marius Beul, Simon Bultmann, Andre Rochow, Radu Alexandru Rosu, Daniel Schleich, Malte Splietker, Sven Behnke

Abstract: Visually guided control of micro aerial vehicles (MAV) demands robust real-time perception, fast trajectory generation, and a capable flight platform. We present a fully autonomous MAV that is able to pop balloons, relying only on onboard sensing and computing. The system is evaluated with real robot experiments during the Mohamed Bin Zayed International Robotics Challenge (MBZIRC) 2020 where it showed its resilience and speed. In all three competition runs we were able to pop all five balloons in less than two minutes flight time with a single MAV.

Submitted 28 October, 2020; originally announced October 2020.

Comments: Accepted for IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Abu Dhabi, UAE, 2020

arXiv:2004.04090 [pdf, other]

Beyond Photometric Consistency: Gradient-based Dissimilarity for Improving Visual Odometry and Stereo Matching

Authors: Jan Quenzel, Radu Alexandru Rosu, Thomas Läbe, Cyrill Stachniss, Sven Behnke

Abstract: Pose estimation and map building are central ingredients of autonomous robots and typically rely on the registration of sensor data. In this paper, we investigate a new metric for registering images that builds upon the idea of the photometric error. Our approach combines a gradient orientation-based metric with a magnitude-dependent scaling term. We integrate both into stereo estimation as well as visual odometry systems and show clear benefits for typical disparity and direct image registration tasks when using our proposed metric. Our experimental evaluation indicates that our metric leads to more robust and more accurate estimates of the scene depth as well as camera trajectory. Thus, the metric improves camera pose estimation and in turn the mapping capabilities of mobile robots. We believe that a series of existing visual odometry and visual SLAM systems can benefit from the findings reported in this paper.

Submitted 8 April, 2020; originally announced April 2020.

Comments: Accepted for International Conference on Robotics and Automation (ICRA), 2020

arXiv:1912.06354 [pdf, other]

Bonn Activity Maps: Dataset Description

Authors: Julian Tanke, Oh-Hun Kwon, Patrick Stotko, Radu Alexandru Rosu, Michael Weinmann, Hassan Errami, Sven Behnke, Maren Bennewitz, Reinhard Klein, Andreas Weber, Angela Yao, Juergen Gall

Abstract: The key prerequisite for accessing the huge potential of current machine learning techniques is the availability of large databases that capture the complex relations of interest. Previous datasets are focused on either 3D scene representations with semantic information, tracking of multiple persons and recognition of their actions, or activity recognition of a single person in captured 3D environments. We present Bonn Activity Maps, a large-scale dataset for human tracking, activity recognition and anticipation of multiple persons. Our dataset comprises four different scenes that have been recorded by time-synchronized cameras each only capturing the scene partially, the reconstructed 3D models with semantic annotations, motion trajectories for individual people including 3D human poses as well as human activity annotations. We utilize the annotations to generate activity likelihoods on the 3D models called activity maps.

Submitted 13 December, 2019; originally announced December 2019.

arXiv:1912.05905 [pdf, other]

LatticeNet: Fast Point Cloud Segmentation Using Permutohedral Lattices

Authors: Radu Alexandru Rosu, Peer Schütt, Jan Quenzel, Sven Behnke

Abstract: Deep convolutional neural networks (CNNs) have shown outstanding performance in the task of semantically segmenting images. However, applying the same methods on 3D data still poses challenges due to the heavy memory requirements and the lack of structured data. Here, we propose LatticeNet, a novel approach for 3D semantic segmentation, which takes as input raw point clouds. A PointNet describes the local geometry which we embed into a sparse permutohedral lattice. The lattice allows for fast convolutions while keeping a low memory footprint. Further, we introduce DeformSlice, a novel learned data-dependent interpolation for projecting lattice features back onto the point cloud. We present results of 3D segmentation on various datasets where our method achieves state-of-the-art performance.

Submitted 16 August, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

arXiv:1906.07029 [pdf, other]

doi 10.1007/s11263-019-01187-z

Semi-Supervised Semantic Mapping through Label Propagation with Semantic Texture Meshes

Authors: Radu Alexandru Rosu, Jan Quenzel, Sven Behnke

Abstract: Scene understanding is an important capability for robots acting in unstructured environments. While most SLAM approaches provide a geometrical representation of the scene, a semantic map is necessary for more complex interactions with the surroundings. Current methods treat the semantic map as part of the geometry which limits scalability and accuracy. We propose to represent the semantic map as a geometrical mesh and a semantic texture coupled at independent resolution. The key idea is that in many environments the geometry can be greatly simplified without losing fidelity, while semantic information can be stored at a higher resolution, independent of the mesh. We construct a mesh from depth sensors to represent the scene geometry and fuse information into the semantic texture from segmentations of individual RGB views of the scene. Making the semantics persistent in a global mesh enables us to enforce temporal and spatial consistency of the individual view predictions. For this, we propose an efficient method of establishing consensus between individual segmentations by iteratively retraining semantic segmentation with the information stored within the map and using the retrained segmentation to re-fuse the semantics. We demonstrate the accuracy and scalability of our approach by reconstructing semantic maps of scenes from NYUv2 and a scene spanning large buildings.

Submitted 17 June, 2019; originally announced June 2019.

Comments: This is a pre-print of an article published in International Journal of Computer Vision (IJCV, 2019)

arXiv:1811.05471 [pdf, other]

doi 10.1002/rob.21817

Team NimbRo at MBZIRC 2017: Fast Landing on a Moving Target and Treasure Hunting with a Team of MAVs

Authors: Marius Beul, Matthias Nieuwenhuisen, Jan Quenzel, Radu Alexandru Rosu, Jannis Horn, Dmytro Pavlichenko, Sebastian Houben, Sven Behnke

Abstract: The Mohamed Bin Zayed International Robotics Challenge (MBZIRC) 2017 has defined ambitious new benchmarks to advance the state-of-the-art in autonomous operation of ground-based and flying robots. This article covers our approaches to solve the two challenges that involved micro aerial vehicles (MAV). Challenge 1 required reliable target perception, fast trajectory planning, and stable control of an MAV in order to land on a moving vehicle. Challenge 3 demanded a team of MAVs to perform a search and transportation task, coined "Treasure Hunt", which required mission planning and multi-robot coordination as well as adaptive control to account for the additional object weight. We describe our base MAV setup and the challenge-specific extensions, cover the camera-based perception, explain control and trajectory-planning in detail, and elaborate on mission planning and team coordination. We evaluated our systems in simulation as well as with real-robot experiments during the competition in Abu Dhabi. With our system, we, as part of the larger team NimbRo, won the MBZIRC Grand Challenge and achieved a third place in both subchallenges involving flying robots.

Submitted 2 January, 2019; v1 submitted 13 November, 2018; originally announced November 2018.

Journal ref: Journal of Field Robotics (JFR) 36(1):204-229, Wiley, January 2019

Showing 1–14 of 14 results for author: Rosu, R A