-
Online Mutual Adaptation of Deep Depth Prediction and Visual SLAM
Authors:
Shing Yan Loo,
Moein Shakeri,
Sai Hong Tang,
Syamsiah Mashohor,
Hong Zhang
Abstract:
Accurate depth prediction by a convolutional neural network (CNN) remains a major challenge for its wide use in practical visual simultaneous localization and mapping (SLAM) applications, such as enhanced camera tracking and dense mapping. This paper sets out to answer the following question: Can we tune a depth prediction CNN with the help of a visual SLAM algorithm, even if the CNN is not trained for the current operating environment, in order to benefit the SLAM performance? To this end, we propose a novel online adaptation framework consisting of two complementary processes: a SLAM algorithm that is used to generate keyframes to fine-tune the depth prediction and another algorithm that uses the online adapted depth to improve map quality. Once the potentially noisy map points are removed, we perform global photometric bundle adjustment (BA) to improve the overall SLAM performance. Experimental results on both benchmark datasets and a real robot in our own experimental environments show that our proposed method improves the overall SLAM accuracy. While regularization has been shown to be effective in multi-task classification problems, we present experimental results and an ablation study to show the effectiveness of regularization in preventing catastrophic forgetting in the online adaptation of depth prediction, a single-task regression problem. In addition, we compare our online adaptation framework against the state-of-the-art pre-trained depth prediction CNNs to show that our online adapted depth prediction CNN outperforms the depth prediction CNNs that have been trained on a large collection of datasets.
Submitted 1 February, 2022; v1 submitted 7 November, 2021;
originally announced November 2021.
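The role of regularization in the online adaptation described in the abstract above can be illustrated with a minimal sketch. The abstract does not say which regularizer is used, so the quadratic penalty toward the pre-trained weights, the per-parameter `importance` weights, and every name below (`regularized_step`, `lam`, `lr`) are assumptions for illustration, not the authors' implementation.

```python
from typing import List


def regularized_step(
    params: List[float],
    anchor: List[float],
    task_grad: List[float],
    importance: List[float],
    lr: float = 0.1,
    lam: float = 1.0,
) -> List[float]:
    """One gradient step on: task_loss + 0.5 * lam * sum(f_i * (p_i - a_i)^2).

    params     : current (online adapted) weights
    anchor     : pre-trained weights the penalty pulls toward (assumed)
    task_grad  : gradient of the adaptation loss w.r.t. each weight
    importance : hypothetical per-weight importance f_i >= 0
    """
    updated = []
    for p, a, g, f in zip(params, anchor, task_grad, importance):
        # Gradient of the quadratic penalty for this weight.
        reg_grad = lam * f * (p - a)
        updated.append(p - lr * (g + reg_grad))
    return updated
```

With a zero task gradient, a weight with zero importance stays where it is, while an important weight is pulled back toward its pre-trained value. That pull is what limits forgetting in this simplified picture.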
-
DeepRelativeFusion: Dense Monocular SLAM using Single-Image Relative Depth Prediction
Authors:
Shing Yan Loo,
Syamsiah Mashohor,
Sai Hong Tang,
Hong Zhang
Abstract:
In this paper, we propose a dense monocular SLAM system, named DeepRelativeFusion, that is capable of recovering a globally consistent 3D structure. To this end, we use a visual SLAM algorithm to reliably recover the camera poses and semi-dense depth maps of the keyframes, and then use relative depth prediction to densify the semi-dense depth maps and refine the keyframe pose-graph. To improve the semi-dense depth maps, we propose an adaptive filtering scheme, which is a structure-preserving weighted average smoothing filter that takes into account the pixel intensity and depth of the neighbouring pixels, yielding a substantial gain in reconstruction accuracy after densification. To perform densification, we introduce two incremental improvements upon the energy minimization framework proposed by DeepFusion: (1) an improved cost function, and (2) the use of single-image relative depth prediction. After densification, we update the keyframes with two-view consistent optimized semi-dense and dense depth maps to improve pose-graph optimization, providing a feedback loop to refine the keyframe poses for accurate scene reconstruction. Our system outperforms the state-of-the-art dense SLAM systems quantitatively in dense reconstruction accuracy by a large margin.
Submitted 9 July, 2021; v1 submitted 7 June, 2020;
originally announced June 2020.
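The adaptive filtering idea in the abstract above (a weighted average over neighbouring pixels, with weights that depend on intensity and depth) can be sketched as a bilateral-style filter on a semi-dense depth map. The abstract does not give the weight function, so the Gaussian weights, the parameters `sigma_i` and `sigma_d`, the use of `None` for pixels without depth, and the name `adaptive_smooth` are all assumptions, not the paper's exact filter.

```python
import math
from typing import List, Optional

DepthMap = List[List[Optional[float]]]


def adaptive_smooth(
    depth: DepthMap,
    intensity: List[List[float]],
    radius: int = 1,
    sigma_i: float = 10.0,
    sigma_d: float = 0.5,
) -> DepthMap:
    """Weighted-average smoothing of a semi-dense depth map.

    Pixels without depth are None and stay None. Neighbours that differ
    strongly in intensity or depth get a near-zero weight, so depth
    discontinuities are not blurred.
    """
    h, w = len(depth), len(depth[0])
    out: DepthMap = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d0 = depth[y][x]
            if d0 is None:
                continue
            num = 0.0
            den = 0.0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    d = depth[ny][nx]
                    if d is None:
                        continue
                    di = intensity[ny][nx] - intensity[y][x]
                    dd = d - d0
                    wgt = math.exp(
                        -(di * di) / (2.0 * sigma_i ** 2)
                        - (dd * dd) / (2.0 * sigma_d ** 2)
                    )
                    num += wgt * d
                    den += wgt
            # den >= 1 because the centre pixel always has weight 1.
            out[y][x] = num / den
    return out
```

Because the centre pixel always contributes with weight 1, the normalizer never reaches zero, and isolated depth values pass through unchanged.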
-
CNN-SVO: Improving the Mapping in Semi-Direct Visual Odometry Using Single-Image Depth Prediction
Authors:
Shing Yan Loo,
Ali Jahani Amiri,
Syamsiah Mashohor,
Sai Hong Tang,
Hong Zhang
Abstract:
Reliable feature correspondence between frames is a critical step in visual odometry (VO) and visual simultaneous localization and mapping (V-SLAM) algorithms. In comparison with existing VO and V-SLAM algorithms, semi-direct visual odometry (SVO) has two main advantages that lead to state-of-the-art frame rate camera motion estimation: direct pixel correspondence and an efficient implementation of a probabilistic mapping method. This paper improves the SVO mapping by initializing the mean and the variance of the depth at a feature location according to the depth prediction from a single-image depth prediction network. By significantly reducing the depth uncertainty of the initialized map point (i.e., small variance centred about the depth prediction), the benefits are twofold: reliable feature correspondence between views and fast convergence to the true depth in order to create new map points. We evaluate our method with two outdoor datasets: the KITTI dataset and the Oxford Robotcar dataset. The experimental results indicate that the improved SVO mapping results in increased robustness and camera tracking accuracy.
Submitted 1 October, 2018;
originally announced October 2018.
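The depth initialization described in the CNN-SVO abstract above (mean set to the predicted depth, with a small variance around it) can be sketched with a simplified Gaussian depth estimate. This is an illustration under stated assumptions: the purely Gaussian fusion update, the relative-uncertainty parameter `rel_sigma`, and the names `DepthEstimate`, `init_from_prediction`, and `fuse` are hypothetical and are not taken from the paper or from the SVO code base.

```python
from dataclasses import dataclass


@dataclass
class DepthEstimate:
    mean: float      # current depth estimate at the feature location
    variance: float  # uncertainty of that estimate


def init_from_prediction(predicted_depth: float, rel_sigma: float = 0.1) -> DepthEstimate:
    """Centre the estimate on the CNN prediction with a small variance.

    rel_sigma is an assumed standard deviation expressed as a fraction
    of the predicted depth.
    """
    sigma = rel_sigma * predicted_depth
    return DepthEstimate(mean=predicted_depth, variance=sigma * sigma)


def fuse(est: DepthEstimate, measured_depth: float, measurement_variance: float) -> DepthEstimate:
    """Fuse one triangulated depth measurement (product of two Gaussians)."""
    gain = est.variance / (est.variance + measurement_variance)
    mean = est.mean + gain * (measured_depth - est.mean)
    variance = (1.0 - gain) * est.variance
    return DepthEstimate(mean=mean, variance=variance)
```

Starting from a tight prior narrows the range of plausible depths for the feature, and each fused measurement shrinks the variance further. In this simplified model, that is the mechanism behind the faster convergence the abstract refers to.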