Is Mapping Necessary for Realistic PointGoal Navigation?

Partsey, Ruslan; Wijmans, Erik; Yokoyama, Naoki; Dobosevych, Oles; Batra, Dhruv; Maksymets, Oleksandr

Abstract: Can an autonomous agent navigate in a new environment without building an explicit map?
For the task of PointGoal navigation ('Go to $\Delta x$, $\Delta y$') under idealized settings (no RGB-D and actuation noise, perfect GPS+Compass), the answer is a clear 'yes': map-less neural models composed of task-agnostic components (CNNs and RNNs) trained with large-scale reinforcement learning achieve 100% Success on a standard dataset (Gibson). However, for PointNav in a realistic setting (RGB-D and actuation noise, no GPS+Compass), this is an open question, and it is the one we tackle in this paper. The strongest published result for this task is 71.7% Success.
First, we identify the main (perhaps, only) cause of the drop in performance: the absence of GPS+Compass. An agent with perfect GPS+Compass faced with RGB-D sensing and actuation noise achieves 99.8% Success (Gibson-v2 val). This suggests that (to paraphrase a meme) robust visual odometry is all we need for realistic PointNav; if we can achieve that, we can ignore the sensing and actuation noise.
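To illustrate why visual odometry can stand in for GPS+Compass, the following is a minimal sketch of how an agent could keep its goal estimate current from per-step egomotion alone. It is not the paper's implementation: the function name `update_goal`, the 2D simplification, and the frame convention (x forward, y left, counterclockwise rotation positive) are assumptions made for illustration.

```python
import math


def update_goal(goal_xy, delta_xy, delta_theta):
    """Re-express the goal in the agent's new egocentric frame.

    goal_xy:     goal position in the agent's previous frame (x forward, y left).
    delta_xy:    estimated translation of the agent, in the previous frame.
    delta_theta: estimated rotation of the agent in radians (counterclockwise).

    All conventions here are illustrative assumptions, not the paper's.
    """
    gx, gy = goal_xy
    dx, dy = delta_xy
    # Shift the origin to the agent's new position.
    rx, ry = gx - dx, gy - dy
    # Rotate by -delta_theta to express the vector in the new heading.
    c, s = math.cos(delta_theta), math.sin(delta_theta)
    return (c * rx + s * ry, -s * rx + c * ry)


# Goal starts 1 m straight ahead; the agent steps 0.25 m forward, then turns left 90 degrees.
goal = (1.0, 0.0)
goal = update_goal(goal, (0.25, 0.0), 0.0)
goal = update_goal(goal, (0.0, 0.0), math.pi / 2)
```

In this sketch every odometry error is carried into all later goal estimates, which is one reason the accuracy of the egomotion estimate matters so much in the GPS-free setting.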
With that as our operating hypothesis, we scale the dataset and model size, and develop human-annotation-free data-augmentation techniques to train models for visual odometry. We advance the state of the art on the Habitat Realistic PointNav Challenge from 71% to 94% Success (+23, 31% relative) and 53% to 74% SPL (+21, 40% relative). While our approach does not saturate or 'solve' this dataset, this strong improvement combined with promising zero-shot sim2real transfer (to a LoCoBot) provides evidence consistent with the hypothesis that explicit mapping may not be necessary for navigation, even in a realistic setting.
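The abstract does not define SPL. It is commonly read as Success weighted by Path Length, which averages, over episodes, the success indicator times the ratio of the shortest-path length to the longer of the path actually taken and the shortest path. The sketch below assumes that standard definition; the function name `spl` and the tuple layout are hypothetical, and the episode values are made up for illustration.

```python
def spl(episodes):
    """Success weighted by Path Length over a list of episodes.

    Each episode is (success, shortest_path_length, agent_path_length).
    The tuple layout is an assumption made for this sketch.
    """
    total = 0.0
    for success, shortest, taken in episodes:
        if success:
            # Efficiency is 1.0 for an optimal path and shrinks as the path lengthens.
            total += shortest / max(taken, shortest)
    return total / len(episodes)


# Illustrative episodes: an optimal success, a success with a 2x detour, and a failure.
example = [(True, 10.0, 10.0), (True, 10.0, 20.0), (False, 10.0, 5.0)]
score = spl(example)
```

Because failed episodes contribute zero and detours are penalized, SPL can never exceed Success, which matches the reported pair of 94% Success and 74% SPL.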

Comments:	Corrected typos in the Abstract
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2206.00997 [cs.CV]
	(or arXiv:2206.00997v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2206.00997
