Search | arXiv e-print repository

arXiv:2007.14812 [pdf, other]

What My Motion tells me about Your Pose: A Self-Supervised Monocular 3D Vehicle Detector

Authors: Cédric Picron, Punarjay Chakravarty, Tom Roussel, Tinne Tuytelaars

Abstract: The estimation of the orientation of an observed vehicle relative to an Autonomous Vehicle (AV) from monocular camera data is an important building block in estimating its 6 DoF pose. Current Deep Learning based solutions for placing a 3D bounding box around this observed vehicle are data hungry and do not generalize well. In this paper, we demonstrate the use of monocular visual odometry for the… ▽ More The estimation of the orientation of an observed vehicle relative to an Autonomous Vehicle (AV) from monocular camera data is an important building block in estimating its 6 DoF pose. Current Deep Learning based solutions for placing a 3D bounding box around this observed vehicle are data hungry and do not generalize well. In this paper, we demonstrate the use of monocular visual odometry for the self-supervised fine-tuning of a model for orientation estimation pre-trained on a reference domain. Specifically, while transitioning from a virtual dataset (vKITTI) to nuScenes, we recover up to 70% of the performance of a fully supervised method. We subsequently demonstrate an optimization-based monocular 3D bounding box detector built on top of the self-supervised vehicle orientation estimator without the requirement of expensive labeled data. This allows 3D vehicle detection algorithms to be self-trained from large amounts of monocular camera data from existing commercial vehicle fleets. △ Less

Submitted 24 March, 2021; v1 submitted 29 July, 2020; originally announced July 2020.

Comments: ICRA 2021 (presentation)

arXiv:2002.01210 [pdf, other]

Deep-Geometric 6 DoF Localization from a Single Image in Topo-metric Maps

Authors: Tom Roussel, Punarjay Chakravarty, Gaurav Pandey, Tinne Tuytelaars, Luc Van Eycken

Abstract: We describe a Deep-Geometric Localizer that is able to estimate the full 6 Degree of Freedom (DoF) global pose of the camera from a single image in a previously mapped environment. Our map is a topo-metric one, with discrete topological nodes whose 6 DoF poses are known. Each topo-node in our map also comprises of a set of points, whose 2D features and 3D locations are stored as part of the mappin… ▽ More We describe a Deep-Geometric Localizer that is able to estimate the full 6 Degree of Freedom (DoF) global pose of the camera from a single image in a previously mapped environment. Our map is a topo-metric one, with discrete topological nodes whose 6 DoF poses are known. Each topo-node in our map also comprises of a set of points, whose 2D features and 3D locations are stored as part of the map** process. For the map** phase, we utilise a stereo camera and a regular stereo visual SLAM pipeline. During the localization phase, we take a single camera image, localize it to a topological node using Deep Learning, and use a geometric algorithm (PnP) on the matched 2D features (and their 3D positions in the topo map) to determine the full 6 DoF globally consistent pose of the camera. Our method divorces the map** and the localization algorithms and sensors (stereo and mono), and allows accurate 6 DoF pose estimation in a previously mapped environment using a single camera. With potential VR/AR and localization applications in single camera devices such as mobile phones and drones, our hybrid algorithm compares favourably with the fully Deep-Learning based Pose-Net that regresses pose from a single image in simulated as well as real environments. △ Less

Submitted 4 February, 2020; originally announced February 2020.

arXiv:1902.02086 [pdf, other]

GEN-SLAM: Generative Modeling for Monocular Simultaneous Localization and Map**

Authors: Punarjay Chakravarty, Praveen Narayanan, Tom Roussel

Abstract: We present a Deep Learning based system for the twin tasks of localization and obstacle avoidance essential to any mobile robot. Our system learns from conventional geometric SLAM, and outputs, using a single camera, the topological pose of the camera in an environment, and the depth map of obstacles around it. We use a CNN to localize in a topological map, and a conditional VAE to output depth fo… ▽ More We present a Deep Learning based system for the twin tasks of localization and obstacle avoidance essential to any mobile robot. Our system learns from conventional geometric SLAM, and outputs, using a single camera, the topological pose of the camera in an environment, and the depth map of obstacles around it. We use a CNN to localize in a topological map, and a conditional VAE to output depth for a camera image, conditional on this topological location estimation. We demonstrate the effectiveness of our monocular localization and depth estimation system on simulated and real datasets. △ Less

Submitted 6 February, 2019; originally announced February 2019.

Comments: Accepted for ICRA 2019

Showing 1–3 of 3 results for author: Roussel, T