-
Semidefinite Relaxations for Robust Multiview Triangulation
Authors:
Linus Härenstam-Nielsen,
Niclas Zeller,
Daniel Cremers
Abstract:
We propose an approach based on convex relaxations for certifiably optimal robust multiview triangulation. To this end, we extend existing relaxation approaches to non-robust multiview triangulation by incorporating a truncated least squares cost function. We propose two formulations, one based on epipolar constraints and one based on fractional reprojection constraints. The first is lower dimensional and remains tight under moderate noise and outlier levels, while the second is higher dimensional and therefore slower but remains tight even under extreme noise and outlier levels. We demonstrate through extensive experiments that the proposed approaches allow us to compute provably optimal reconstructions even under significant noise and a large percentage of outliers.
Submitted 5 April, 2023; v1 submitted 26 January, 2023;
originally announced January 2023.
-
4Seasons: Benchmarking Visual SLAM and Long-Term Localization for Autonomous Driving in Challenging Conditions
Authors:
Patrick Wenzel,
Nan Yang,
Rui Wang,
Niclas Zeller,
Daniel Cremers
Abstract:
In this paper, we present a novel visual SLAM and long-term localization benchmark for autonomous driving in challenging conditions based on the large-scale 4Seasons dataset. The proposed benchmark provides drastic appearance variations caused by seasonal changes and diverse weather and illumination conditions. While significant progress has been made in advancing visual SLAM on small-scale datasets with similar conditions, there is still a lack of unified benchmarks representative of real-world scenarios for autonomous driving. We introduce a new unified benchmark for jointly evaluating visual odometry, global place recognition, and map-based visual localization performance which is crucial to successfully enable autonomous driving in any condition. The data has been collected for more than one year, resulting in more than 300 km of recordings in nine different environments ranging from a multi-level parking garage to urban (including tunnels) to countryside and highway. We provide globally consistent reference poses with up to centimeter-level accuracy obtained from the fusion of direct stereo-inertial odometry with RTK GNSS. We evaluate the performance of several state-of-the-art visual odometry and visual localization baseline approaches on the benchmark and analyze their properties. The experimental results provide new insights into current approaches and show promising potential for future research. Our benchmark and evaluation protocols will be available at https://www.4seasons-dataset.com/.
Submitted 31 December, 2022;
originally announced January 2023.
-
Vision-based Large-scale 3D Semantic Mapping for Autonomous Driving Applications
Authors:
Qing Cheng,
Niclas Zeller,
Daniel Cremers
Abstract:
In this paper, we present a complete pipeline for 3D semantic mapping solely based on a stereo camera system. The pipeline comprises a direct sparse visual odometry front-end as well as a back-end for global optimization including GNSS integration, and semantic 3D point cloud labeling. We propose a simple but effective temporal voting scheme which improves the quality and consistency of the 3D point labels. Qualitative and quantitative evaluations of our pipeline are performed on the KITTI-360 dataset. The results show the effectiveness of our proposed voting scheme and the capability of our pipeline for efficient large-scale 3D semantic mapping. The large-scale mapping capabilities of our pipeline are furthermore demonstrated by presenting a very large-scale semantic map covering 8000 km of roads generated from data collected by a fleet of vehicles.
Submitted 2 March, 2022;
originally announced March 2022.
-
TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo
Authors:
Lukas Koestler,
Nan Yang,
Niclas Zeller,
Daniel Cremers
Abstract:
In this paper, we present TANDEM, a real-time monocular tracking and dense mapping framework. For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of keyframes. To increase the robustness, we propose a novel tracking front-end that performs dense direct image alignment using depth maps rendered from a global model that is built incrementally from dense depth predictions. To predict the dense depth maps, we propose Cascade View-Aggregation MVSNet (CVA-MVSNet) that utilizes the entire active keyframe window by hierarchically constructing 3D cost volumes with adaptive view aggregation to balance the different stereo baselines between the keyframes. Finally, the predicted depth maps are fused into a consistent global map represented as a truncated signed distance function (TSDF) voxel grid. Our experimental results show that TANDEM outperforms other state-of-the-art traditional and learning-based monocular visual odometry (VO) methods in terms of camera tracking. Moreover, TANDEM shows state-of-the-art real-time 3D reconstruction performance.
Submitted 14 November, 2021;
originally announced November 2021.
-
Tight Integration of Feature-based Relocalization in Monocular Direct Visual Odometry
Authors:
Mariia Gladkova,
Rui Wang,
Niclas Zeller,
Daniel Cremers
Abstract:
In this paper we propose a framework for integrating map-based relocalization into online direct visual odometry. To achieve map-based relocalization for direct methods, we integrate image features into Direct Sparse Odometry (DSO) and rely on feature matching to associate online visual odometry (VO) with a previously built map. The integration of the relocalization poses is threefold. Firstly, they are incorporated as pose priors in the direct image alignment of the front-end tracking. Secondly, they are tightly integrated into the back-end bundle adjustment. Thirdly, an online fusion module is further proposed to combine relative VO poses and global relocalization poses in a pose graph to estimate keyframe-wise smooth and globally accurate poses. We evaluate our method on two multi-weather datasets showing the benefits of integrating different handcrafted and learned features and demonstrating promising improvements on camera tracking accuracy.
Submitted 29 March, 2021; v1 submitted 1 February, 2021;
originally announced February 2021.
-
MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
Authors:
Felix Wimbauer,
Nan Yang,
Lukas von Stumberg,
Niclas Zeller,
Daniel Cremers
Abstract:
In this paper, we propose MonoRec, a semi-supervised monocular dense reconstruction architecture that predicts depth maps from a single moving camera in dynamic environments. MonoRec is based on a multi-view stereo setting which encodes the information of multiple consecutive images in a cost volume. To deal with dynamic objects in the scene, we introduce a MaskModule that predicts moving object masks by leveraging the photometric inconsistencies encoded in the cost volumes. Unlike other multi-view stereo methods, MonoRec is able to reconstruct both static and moving objects by leveraging the predicted masks. Furthermore, we present a novel multi-stage training scheme with a semi-supervised loss formulation that does not require LiDAR depth values. We carefully evaluate MonoRec on the KITTI dataset and show that it achieves state-of-the-art performance compared to both multi-view and single-view methods. With the model trained on KITTI, we further demonstrate that MonoRec is able to generalize well to both the Oxford RobotCar dataset and the more challenging TUM-Mono dataset recorded by a handheld camera. Code and related materials will be available at https://vision.in.tum.de/research/monorec.
Submitted 6 May, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
4Seasons: A Cross-Season Dataset for Multi-Weather SLAM in Autonomous Driving
Authors:
Patrick Wenzel,
Rui Wang,
Nan Yang,
Qing Cheng,
Qadeer Khan,
Lukas von Stumberg,
Niclas Zeller,
Daniel Cremers
Abstract:
We present a novel dataset covering seasonal and challenging perceptual conditions for autonomous driving. Among other things, it enables research on visual odometry, global place recognition, and map-based re-localization tracking. The data was collected in different scenarios and under a wide variety of weather conditions and illuminations, including day and night. This resulted in more than 350 km of recordings in nine different environments ranging from a multi-level parking garage to urban (including tunnels) to countryside and highway. We provide globally consistent reference poses with up to centimeter accuracy obtained from the fusion of direct stereo visual-inertial odometry with RTK-GNSS. The full dataset is available at https://www.4seasons-dataset.com.
Submitted 14 October, 2020; v1 submitted 14 September, 2020;
originally announced September 2020.
-
A Synchronized Stereo and Plenoptic Visual Odometry Dataset
Authors:
Niclas Zeller,
Franz Quint,
Uwe Stilla
Abstract:
We present a new dataset to evaluate monocular, stereo, and plenoptic camera based visual odometry algorithms. The dataset comprises a set of synchronized image sequences recorded by a micro lens array (MLA) based plenoptic camera and a stereo camera system. For this, the stereo cameras and the plenoptic camera were assembled on a common hand-held platform. All sequences are recorded in a very large loop, where beginning and end show the same scene. Therefore, the tracking accuracy of a visual odometry algorithm can be measured from the drift between beginning and end of the sequence. For both the plenoptic camera and the stereo system, we supply full intrinsic camera models as well as vignetting data. The dataset consists of 11 sequences which were recorded in challenging indoor and outdoor scenarios. We present, by way of example, the results achieved by state-of-the-art algorithms.
Submitted 26 August, 2018; v1 submitted 24 July, 2018;
originally announced July 2018.