Search | arXiv e-print repository

Multi-Resolution Elevation Map** and Safe Landing Site Detection with Applications to Planetary Rotorcraft

Authors: Pascal Schoppmann, Pedro F. Proença, Jeff Delaune, Michael Pantic, Timo Hinzmann, Larry Matthies, Roland Siegwart, Roland Brockers

Abstract: In this paper, we propose a resource-efficient approach to provide an autonomous UAV with an on-board perception method to detect safe, hazard-free landing sites during flights over complex 3D terrain. We aggregate 3D measurements acquired from a sequence of monocular images by a Structure-from-Motion approach into a local, robot-centric, multi-resolution elevation map of the overflown terrain, wh… ▽ More In this paper, we propose a resource-efficient approach to provide an autonomous UAV with an on-board perception method to detect safe, hazard-free landing sites during flights over complex 3D terrain. We aggregate 3D measurements acquired from a sequence of monocular images by a Structure-from-Motion approach into a local, robot-centric, multi-resolution elevation map of the overflown terrain, which fuses depth measurements according to their lateral surface resolution (pixel-footprint) in a probabilistic framework based on the concept of dynamic Level of Detail. Map aggregation only requires depth maps and the associated poses, which are obtained from an onboard Visual Odometry algorithm. An efficient landing site detection method then exploits the features of the underlying multi-resolution map to detect safe landing sites based on slope, roughness, and quality of the reconstructed terrain surface. The evaluation of the performance of the map** and landing site detection modules are analyzed independently and jointly in simulated and real-world experiments in order to establish the efficacy of the proposed approach. △ Less

Submitted 11 November, 2021; originally announced November 2021.

Comments: 8 pages, 12 figures. Accepted at IROS 2021

arXiv:2103.16528 [pdf, other]

SD-6DoF-ICLK: Sparse and Deep Inverse Compositional Lucas-Kanade Algorithm on SE(3)

Authors: Timo Hinzmann, Roland Siegwart

Abstract: This paper introduces SD-6DoF-ICLK, a learning-based Inverse Compositional Lucas-Kanade (ICLK) pipeline that uses sparse depth information to optimize the relative pose that best aligns two images on SE(3). To compute this six Degrees-of-Freedom (DoF) relative transformation, the proposed formulation requires only sparse depth information in one of the images, which is often the only available dep… ▽ More This paper introduces SD-6DoF-ICLK, a learning-based Inverse Compositional Lucas-Kanade (ICLK) pipeline that uses sparse depth information to optimize the relative pose that best aligns two images on SE(3). To compute this six Degrees-of-Freedom (DoF) relative transformation, the proposed formulation requires only sparse depth information in one of the images, which is often the only available depth source in visual-inertial odometry or Simultaneous Localization and Map** (SLAM) pipelines. In an optional subsequent step, the framework further refines feature locations and the relative pose using individual feature alignment and bundle adjustment for pose and structure re-alignment. The resulting sparse point correspondences with subpixel-accuracy and refined relative pose can be used for depth map generation, or the image alignment module can be embedded in an odometry or map** framework. Experiments with rendered imagery show that the forward SD-6DoF-ICLK runs at 145 ms per image pair with a resolution of 752 x 480 pixels each, and vastly outperforms the classical, sparse 6DoF-ICLK algorithm, making it the ideal framework for robust image alignment under severe conditions. △ Less

Submitted 30 March, 2021; originally announced March 2021.

Comments: Initial submission; 7 pages, 3 figures, 2 tables

arXiv:2008.04619 [pdf, other]

Deep UAV Localization with Reference View Rendering

Authors: Timo Hinzmann, Roland Siegwart

Abstract: This paper presents a framework for the localization of Unmanned Aerial Vehicles (UAVs) in unstructured environments with the help of deep learning. A real-time rendering engine is introduced that generates optical and depth images given a six Degrees-of-Freedom (DoF) camera pose, camera model, geo-referenced orthoimage, and elevation map. The rendering engine is embedded into a learning-based six… ▽ More This paper presents a framework for the localization of Unmanned Aerial Vehicles (UAVs) in unstructured environments with the help of deep learning. A real-time rendering engine is introduced that generates optical and depth images given a six Degrees-of-Freedom (DoF) camera pose, camera model, geo-referenced orthoimage, and elevation map. The rendering engine is embedded into a learning-based six-DoF Inverse Compositional Lucas-Kanade (ICLK) algorithm that is able to robustly align the rendered and real-world image taken by the UAV. To learn the alignment under environmental changes, the architecture is trained using maps spanning multiple years at high resolution. The evaluation shows that the deep 6DoF-ICLK algorithm outperforms its non-trainable counterparts by a large margin. To further support the research in this field, the real-time rendering engine and accompanying datasets are released along with this publication. △ Less

Submitted 11 August, 2020; originally announced August 2020.

Comments: Initial submission; 15 pages, 3 figures, 3 tables

arXiv:2008.04197 [pdf, other]

Deep Learning-based Human Detection for UAVs with Optical and Infrared Cameras: System and Experiments

Authors: Timo Hinzmann, Tobias Stegemann, Cesar Cadena, Roland Siegwart

Abstract: In this paper, we present our deep learning-based human detection system that uses optical (RGB) and long-wave infrared (LWIR) cameras to detect, track, localize, and re-identify humans from UAVs flying at high altitude. In each spectrum, a customized RetinaNet network with ResNet backbone provides human detections which are subsequently fused to minimize the overall false detection rate. We show… ▽ More In this paper, we present our deep learning-based human detection system that uses optical (RGB) and long-wave infrared (LWIR) cameras to detect, track, localize, and re-identify humans from UAVs flying at high altitude. In each spectrum, a customized RetinaNet network with ResNet backbone provides human detections which are subsequently fused to minimize the overall false detection rate. We show that by optimizing the bounding box anchors and augmenting the image resolution the number of missed detections from high altitudes can be decreased by over 20 percent. Our proposed network is compared to different RetinaNet and YOLO variants, and to a classical optical-infrared human detection framework that uses hand-crafted features. Furthermore, along with the publication of this paper, we release a collection of annotated optical-infrared datasets recorded with different UAVs during search-and-rescue field tests and the source code of the implemented annotation tool. △ Less

Submitted 10 August, 2020; originally announced August 2020.

Comments: Initial submission; 21 pages, 16 figures, 6 tables

arXiv:1908.08891 [pdf, other]

Flexible Trinocular: Non-rigid Multi-Camera-IMU Dense Reconstruction for UAV Navigation and Map**

Authors: Timo Hinzmann, Cesar Cadena, Juan Nieto, Roland Siegwart

Abstract: In this paper, we propose a visual-inertial framework able to efficiently estimate the camera poses of a non-rigid trinocular baseline for long-range depth estimation on-board a fast moving aerial platform. The estimation of the time-varying baseline is based on relative inertial measurements, a photometric relative pose optimizer, and a probabilistic wing model fused in an efficient Extended Kalm… ▽ More In this paper, we propose a visual-inertial framework able to efficiently estimate the camera poses of a non-rigid trinocular baseline for long-range depth estimation on-board a fast moving aerial platform. The estimation of the time-varying baseline is based on relative inertial measurements, a photometric relative pose optimizer, and a probabilistic wing model fused in an efficient Extended Kalman Filter (EKF) formulation. The estimated depth measurements can be integrated into a geo-referenced global map to render a reconstruction of the environment useful for local replanning algorithms. Based on extensive real-world experiments we describe the challenges and solutions for obtaining the probabilistic wing model, reliable relative inertial measurements, and vision-based relative pose updates and demonstrate the computational efficiency and robustness of the overall system under challenging conditions. △ Less

Submitted 23 August, 2019; originally announced August 2019.

Comments: Preprint of a paper presented at the IEEE International Conference on Intelligent Robots and Systems (IROS) 2019

arXiv:1803.03932 [pdf, other]

Cubic Range Error Model for Stereo Vision with Illuminators

Authors: Marius Huber, Timo Hinzmann, Roland Siegwart, Larry H. Matthies

Abstract: Use of low-cost depth sensors, such as a stereo camera setup with illuminators, is of particular interest for numerous applications ranging from robotics and transportation to mixed and augmented reality. The ability to quantify noise is crucial for these applications, e.g., when the sensor is used for map generation or to develop a sensor scheduling policy in a multi-sensor setup. Range error mod… ▽ More Use of low-cost depth sensors, such as a stereo camera setup with illuminators, is of particular interest for numerous applications ranging from robotics and transportation to mixed and augmented reality. The ability to quantify noise is crucial for these applications, e.g., when the sensor is used for map generation or to develop a sensor scheduling policy in a multi-sensor setup. Range error models provide uncertainty estimates and help weigh the data correctly in instances where range measurements are taken from different vantage points or with different sensors. The weighing is important to fuse range data into a map in a meaningful way, i.e., the high confidence data is relied on most heavily. Such a model is derived in this work. We show that the range error for stereo systems with integrated illuminators is cubic and validate the proposed model experimentally with an off-the-shelf structured light stereo system. The experiments confirm the validity of the model and simplify the application of this type of sensor in robotics. The proposed error model is relevant to any stereo system with low ambient light where the main light source is located at the camera system. Among others, this is the case for structured light stereo systems and night stereo systems with headlights. In this work, we propose that the range error is cubic in range for stereo systems with integrated illuminators. Experimental validation with an off-the-shelf structured light stereo system shows that the exponent is between 2.4 and 2.6. The deviation is attributed to our model considering only shot noise. △ Less

Submitted 11 March, 2018; originally announced March 2018.

Comments: 6 pages, to be published at ICRA 2018

arXiv:1802.09043 [pdf, other]

Free LSD: Prior-Free Visual Landing Site Detection for Autonomous Planes

Authors: Timo Hinzmann, Thomas Stastny, Cesar Cadena, Roland Siegwart, Igor Gilitschenski

Abstract: Full autonomy for fixed-wing unmanned aerial vehicles (UAVs) requires the capability to autonomously detect potential landing sites in unknown and unstructured terrain, allowing for self-governed mission completion or handling of emergency situations. In this work, we propose a perception system addressing this challenge by detecting landing sites based on their texture and geometric shape without… ▽ More Full autonomy for fixed-wing unmanned aerial vehicles (UAVs) requires the capability to autonomously detect potential landing sites in unknown and unstructured terrain, allowing for self-governed mission completion or handling of emergency situations. In this work, we propose a perception system addressing this challenge by detecting landing sites based on their texture and geometric shape without using any prior knowledge about the environment. The proposed method considers hazards within the landing region such as terrain roughness and slope, surrounding obstacles that obscure the landing approach path, and the local wind field that is estimated by the on-board EKF. The latter enables applicability of the proposed method on small-scale autonomous planes without landing gear. A safe approach path is computed based on the UAV dynamics, expected state estimation and actuator uncertainty, and the on-board computed elevation map. The proposed framework has been successfully tested on photo-realistic synthetic datasets and in challenging real-world environments. △ Less

Submitted 25 February, 2018; originally announced February 2018.

Comments: Accepted for publication in IEEE International Conference on Robotics and Automation (ICRA), 2018, Brisbane and IEEE Robotics and Automation Letters (RA-L), 2018

arXiv:1712.06837 [pdf, other]

Flexible Stereo: Constrained, Non-rigid, Wide-baseline Stereo Vision for Fixed-wing Aerial Platforms

Authors: Timo Hinzmann, Tim Taubner, Roland Siegwart

Abstract: This paper proposes a computationally efficient method to estimate the time-varying relative pose between two visual-inertial sensor rigs mounted on the flexible wings of a fixed-wing unmanned aerial vehicle (UAV). The estimated relative poses are used to generate highly accurate depth maps in real-time and can be employed for obstacle avoidance in low-altitude flights or landing maneuvers. The ap… ▽ More This paper proposes a computationally efficient method to estimate the time-varying relative pose between two visual-inertial sensor rigs mounted on the flexible wings of a fixed-wing unmanned aerial vehicle (UAV). The estimated relative poses are used to generate highly accurate depth maps in real-time and can be employed for obstacle avoidance in low-altitude flights or landing maneuvers. The approach is structured as follows: Initially, a wing model is identified by fitting a probability density function to measured deviations from the nominal relative baseline transformation. At run-time, the prior knowledge about the wing model is fused in an Extended Kalman filter~(EKF) together with relative pose measurements obtained from solving a relative perspective N-point problem (PNP), and the linear accelerations and angular velocities measured by the two inertial measurement units (IMU) which are rigidly attached to the cameras. Results obtained from extensive synthetic experiments demonstrate that our proposed framework is able to estimate highly accurate baseline transformations and depth maps. △ Less

Submitted 26 February, 2018; v1 submitted 19 December, 2017; originally announced December 2017.

Comments: Accepted for publication in IEEE International Conference on Robotics and Automation (ICRA), 2018, Brisbane

arXiv:1610.03548 [pdf, other]

doi 10.1109/ICRA.2017.7989362

Visual Place Recognition with Probabilistic Vertex Voting

Authors: Mathias Gehrig, Elena Stumm, Timo Hinzmann, Roland Siegwart

Abstract: We propose a novel scoring concept for visual place recognition based on nearest neighbor descriptor voting and demonstrate how the algorithm naturally emerges from the problem formulation. Based on the observation that the number of votes for matching places can be evaluated using a binomial distribution model, loop closures can be detected with high precision. By casting the problem into a proba… ▽ More We propose a novel scoring concept for visual place recognition based on nearest neighbor descriptor voting and demonstrate how the algorithm naturally emerges from the problem formulation. Based on the observation that the number of votes for matching places can be evaluated using a binomial distribution model, loop closures can be detected with high precision. By casting the problem into a probabilistic framework, we not only remove the need for commonly employed heuristic parameters but also provide a powerful score to classify matching and non-matching places. We present methods for both a 2D-2D pose-graph vertex matching and a 2D-3D landmark matching based on the above scoring. The approach maintains accuracy while being efficient enough for online application through the use of compact (low dimensional) descriptors and fast nearest neighbor retrieval techniques. The proposed methods are evaluated on several challenging datasets in varied environments, showing state-of-the-art results with high precision and high recall. △ Less

Submitted 7 June, 2018; v1 submitted 11 October, 2016; originally announced October 2016.

Comments: 8 pages

Journal ref: 2017 IEEE International Conference on Robotics and Automation (ICRA)

Showing 1–9 of 9 results for author: Hinzmann, T