Stochastic Video Prediction with Structure and Motion

Akan, Adil Kaan; Safadoust, Sadra; Güney, Fatma

arXiv:2203.10528 (cs)

[Submitted on 20 Mar 2022 (v1), last revised 29 Apr 2022 (this version, v2)]

Abstract: While stochastic video prediction models enable future prediction under uncertainty, they mostly fail to model the complex dynamics of real-world scenes. For example, they cannot provide reliable predictions for scenes with a moving camera and independently moving foreground objects in driving scenarios. The existing methods fail to fully capture the dynamics of the structured world by only focusing on changes in pixels. In this paper, we assume that there is an underlying process creating observations in a video and propose to factorize it into static and dynamic components. We model the static part based on the scene structure and the ego-motion of the vehicle, and the dynamic part based on the remaining motion of the dynamic objects. By learning separate distributions of changes in foreground and background, we can decompose the scene into static and dynamic parts and separately model the change in each. Our experiments demonstrate that disentangling structure and motion helps stochastic video prediction, leading to better future predictions in complex driving scenarios on two real-world driving datasets, KITTI and Cityscapes.
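
The static/dynamic split described in the abstract can be illustrated with standard multi-view geometry. The sketch below is not the authors' implementation: it only shows how a depth map and an ego-motion estimate induce a flow field for the static scene, and how a residual flow for moving objects might be added on top. The function names `rigid_flow` and `compose_flow`, the pinhole intrinsics `K`, and the additive combination of the two flows are all assumptions made for illustration; the learned stochastic latent variables of the paper are omitted entirely.

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """Flow induced by camera motion alone, assuming a fully static scene.

    depth: (H, W) depth map of the current frame
    K:     (3, 3) pinhole camera intrinsics (assumed known)
    R, t:  rotation (3, 3) and translation (3,) taking points from the
           current camera frame to the next camera frame
    Returns a (2, H, W) array of (du, dv) pixel displacements.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    # Homogeneous pixel coordinates, shape (3, H*W)
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1)
    # Back-project each pixel to a 3D point using its depth
    points = (np.linalg.inv(K) @ pix) * depth.reshape(1, -1)
    # Apply the ego-motion and project into the next frame
    moved = R @ points + t.reshape(3, 1)
    proj = K @ moved
    uv_next = proj[:2] / proj[2:3]
    return (uv_next - pix[:2]).reshape(2, h, w)

def compose_flow(static_flow, residual_flow):
    """Hypothetical combination: static flow plus residual object motion."""
    return static_flow + residual_flow

# Toy example: constant depth of 10 and a unit camera translation along x.
K = np.array([[100.0, 0.0, 2.0],
              [0.0, 100.0, 1.5],
              [0.0, 0.0, 1.0]])
depth = np.full((3, 4), 10.0)
static = rigid_flow(depth, K, np.eye(3), np.array([1.0, 0.0, 0.0]))
residual = np.zeros_like(static)
residual[0, 1, 1] = 2.5  # one independently moving pixel
total = compose_flow(static, residual)
```

In this toy setup every static pixel shifts horizontally by `fx * tx / depth` pixels, so the background motion is explained entirely by structure and ego-motion, and only the single moving pixel needs a residual term.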

Comments:	Under review at TPAMI
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2203.10528 [cs.CV]
	(or arXiv:2203.10528v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.10528

Submission history

From: Adil Kaan Akan
[v1] Sun, 20 Mar 2022 11:29:46 UTC (9,118 KB)
[v2] Fri, 29 Apr 2022 09:06:49 UTC (9,990 KB)
