Learning Depth from Monocular Videos Using Synthetic Data: A Temporally-Consistent Domain Adaptation Approach

Mou, Yipeng; Gong, Mingming; Fu, Huan; Batmanghelich, Kayhan; Zhang, Kun; Tao, Dacheng

arXiv:1907.06882 (cs)

[Submitted on 16 Jul 2019 (v1), last revised 26 Nov 2019 (this version, v2)]

Abstract: The majority of state-of-the-art monocular depth estimation methods are supervised learning approaches. The success of such approaches depends heavily on high-quality depth labels, which are expensive to obtain. Some recent methods try to learn depth networks by leveraging unsupervised cues from monocular videos, which are easier to acquire but less reliable. In this paper, we propose to resolve this dilemma by transferring knowledge from synthetic videos with easily obtainable ground-truth depth labels. Because of the style difference between synthetic and real images, we propose a temporally-consistent domain adaptation (TCDA) approach that simultaneously explores labels in the synthetic domain and temporal constraints in the videos to improve style transfer and depth prediction. Furthermore, we make use of the ground-truth optical flow and pose information in the synthetic data to learn moving-mask and pose-prediction networks. The learned moving masks can filter out moving regions that produce erroneous temporal constraints, and the estimated poses provide better initializations for estimating temporal constraints. Experimental results demonstrate the effectiveness of our method and performance comparable to the state of the art.
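The abstract states that learned moving masks filter out regions that would produce erroneous temporal constraints. As a rough illustration of that idea only, the sketch below computes a photometric consistency error between a target frame and a neighbouring frame that is assumed to have been warped into the target view already, ignoring pixels flagged as moving. The function name `masked_photometric_loss`, the L1 error, and the mask convention (1 = moving) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def masked_photometric_loss(target, warped, moving_mask, eps=1e-8):
    """Mean absolute photometric error over pixels not flagged as moving.

    Hypothetical helper for illustration; not the paper's loss.

    target, warped: float arrays of shape (H, W, C). `warped` stands for a
        neighbouring frame already warped into the target view.
    moving_mask: array of shape (H, W); 1 marks a moving pixel, 0 a static one.
    """
    target = np.asarray(target, dtype=np.float64)
    warped = np.asarray(warped, dtype=np.float64)
    # Weight of 1 for static pixels, 0 for moving ones.
    static = 1.0 - np.asarray(moving_mask, dtype=np.float64)
    # Per-pixel L1 error, averaged over colour channels: shape (H, W).
    per_pixel = np.abs(target - warped).mean(axis=-1)
    # Average over static pixels only; eps guards against an all-moving mask.
    return float((per_pixel * static).sum() / (static.sum() + eps))
```

With this convention, a pixel whose appearance changes only because an object moved contributes nothing to the error once the mask marks it as moving, so it cannot push the depth or pose estimates in the wrong direction.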

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1907.06882 [cs.CV]
	(or arXiv:1907.06882v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1907.06882

Submission history

From: Yipeng Mou
[v1] Tue, 16 Jul 2019 08:05:26 UTC (5,017 KB)
[v2] Tue, 26 Nov 2019 18:04:56 UTC (4,698 KB)
