-
The Revisiting Problem in Simultaneous Localization and Map**: A Survey on Visual Loop Closure Detection
Authors:
Konstantinos A. Tsintotas,
Loukas Bampis,
Antonios Gasteratos
Abstract:
Where am I? This is one of the most critical questions that any intelligent system should answer to decide whether it navigates to a previously visited area. This problem has long been acknowledged for its challenging nature in simultaneous localization and map** (SLAM), wherein the robot needs to correctly associate the incoming sensory data to the database allowing consistent map generation. T…
▽ More
Where am I? This is one of the most critical questions that any intelligent system should answer to decide whether it navigates to a previously visited area. This problem has long been acknowledged for its challenging nature in simultaneous localization and map** (SLAM), wherein the robot needs to correctly associate the incoming sensory data to the database allowing consistent map generation. The significant advances in computer vision achieved over the last 20 years, the increased computational power, and the growing demand for long-term exploration contributed to efficiently performing such a complex task with inexpensive perception sensors. In this article, visual loop closure detection, which formulates a solution based solely on appearance input data, is surveyed. We start by briefly introducing place recognition and SLAM concepts in robotics. Then, we describe a loop closure detection system's structure, covering an extensive collection of topics, including the feature extraction, the environment representation, the decision-making step, and the evaluation process. We conclude by discussing open and new research challenges, particularly concerning the robustness in dynamic environments, the computational complexity, and scalability in long-term operations. The article aims to serve as a tutorial and a position paper for newcomers to visual loop closure detection.
△ Less
Submitted 9 November, 2022; v1 submitted 27 April, 2022;
originally announced April 2022.
-
Real-Time Monocular Human Depth Estimation and Segmentation on Embedded Systems
Authors:
Shan An,
Fangru Zhou,
Mei Yang,
Haogang Zhu,
Changhong Fu,
Konstantinos A. Tsintotas
Abstract:
Estimating a scene's depth to achieve collision avoidance against moving pedestrians is a crucial and fundamental problem in the robotic field. This paper proposes a novel, low complexity network architecture for fast and accurate human depth estimation and segmentation in indoor environments, aiming to applications for resource-constrained platforms (including battery-powered aerial, micro-aerial…
▽ More
Estimating a scene's depth to achieve collision avoidance against moving pedestrians is a crucial and fundamental problem in the robotic field. This paper proposes a novel, low complexity network architecture for fast and accurate human depth estimation and segmentation in indoor environments, aiming to applications for resource-constrained platforms (including battery-powered aerial, micro-aerial, and ground vehicles) with a monocular camera being the primary perception module. Following the encoder-decoder structure, the proposed framework consists of two branches, one for depth prediction and another for semantic segmentation. Moreover, network structure optimization is employed to improve its forward inference speed. Exhaustive experiments on three self-generated datasets prove our pipeline's capability to execute in real-time, achieving higher frame rates than contemporary state-of-the-art frameworks (114.6 frames per second on an NVIDIA Jetson Nano GPU with TensorRT) while maintaining comparable accuracy.
△ Less
Submitted 23 August, 2021;
originally announced August 2021.
-
Fast Monocular Hand Pose Estimation on Embedded Systems
Authors:
Shan An,
Xiajie Zhang,
Dong Wei,
Haogang Zhu,
Jianyu Yang,
Konstantinos A. Tsintotas
Abstract:
Hand pose estimation is a fundamental task in many human-robot interaction-related applications. However, previous approaches suffer from unsatisfying hand landmark predictions in real-world scenes and high computation burden. This paper proposes a fast and accurate framework for hand pose estimation, dubbed as "FastHand". Using a lightweight encoder-decoder network architecture, FastHand fulfills…
▽ More
Hand pose estimation is a fundamental task in many human-robot interaction-related applications. However, previous approaches suffer from unsatisfying hand landmark predictions in real-world scenes and high computation burden. This paper proposes a fast and accurate framework for hand pose estimation, dubbed as "FastHand". Using a lightweight encoder-decoder network architecture, FastHand fulfills the requirements of practical applications running on embedded devices. The encoder consists of deep layers with a small number of parameters, while the decoder makes use of spatial location information to obtain more accurate results. The evaluation took place on two publicly available datasets demonstrating the improved performance of the proposed pipeline compared to other state-of-the-art approaches. FastHand offers high accuracy scores while reaching a speed of 25 frames per second on an NVIDIA Jetson TX2 graphics processing unit.
△ Less
Submitted 11 October, 2021; v1 submitted 13 February, 2021;
originally announced February 2021.
-
Fast and Incremental Loop Closure Detection with Deep Features and Proximity Graphs
Authors:
Shan An,
Haogang Zhu,
Dong Wei,
Konstantinos A. Tsintotas,
Antonios Gasteratos
Abstract:
In recent years, the robotics community has extensively examined methods concerning the place recognition task within the scope of simultaneous localization and map** applications.This article proposes an appearance-based loop closure detection pipeline named ``FILD++" (Fast and Incremental Loop closure Detection).First, the system is fed by consecutive images and, via passing them twice through…
▽ More
In recent years, the robotics community has extensively examined methods concerning the place recognition task within the scope of simultaneous localization and map** applications.This article proposes an appearance-based loop closure detection pipeline named ``FILD++" (Fast and Incremental Loop closure Detection).First, the system is fed by consecutive images and, via passing them twice through a single convolutional neural network, global and local deep features are extracted.Subsequently, a hierarchical navigable small-world graph incrementally constructs a visual database representing the robot's traversed path based on the computed global features.Finally, a query image, grabbed each time step, is set to retrieve similar locations on the traversed route.An image-to-image pairing follows, which exploits local features to evaluate the spatial information. Thus, in the proposed article, we propose a single network for global and local feature extraction in contrast to our previous work (FILD), while an exhaustive search for the verification process is adopted over the generated deep local features avoiding the utilization of hash codes. Exhaustive experiments on eleven publicly available datasets exhibit the system's high performance (achieving the highest recall score on eight of them) and low execution times (22.05 ms on average in New College, which is the largest one containing 52480 images) compared to other state-of-the-art approaches.
△ Less
Submitted 2 January, 2022; v1 submitted 28 September, 2020;
originally announced October 2020.