RELATE: Physically Plausible Multi-Object Scene Synthesis Using Structured Latent Spaces

Ehrhardt, Sebastien; Groth, Oliver; Monszpart, Aron; Engelcke, Martin; Posner, Ingmar; Mitra, Niloy; Vedaldi, Andrea

arXiv:2007.01272 (cs)

[Submitted on 2 Jul 2020 (v1), last revised 9 Nov 2020 (this version, v2)]

Abstract: We present RELATE, a model that learns to generate physically plausible scenes and videos of multiple interacting objects. Similar to other generative approaches, RELATE is trained end-to-end on raw, unlabeled data. RELATE combines an object-centric GAN formulation with a model that explicitly accounts for correlations between individual objects. This allows the model to generate realistic scenes and videos from a physically interpretable parameterization. Furthermore, we show that modeling the object correlation is necessary to learn to disentangle object positions and identity. We find that RELATE is also amenable to physically realistic scene editing and that it significantly outperforms prior art in object-centric scene generation in both synthetic (CLEVR, ShapeStacks) and real-world data (cars). In addition, in contrast to state-of-the-art methods in object-centric generative modeling, RELATE also extends naturally to dynamic scenes and generates videos of high visual fidelity. Source code, datasets and more results are available at this http URL.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2007.01272 [cs.CV]
	(or arXiv:2007.01272v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2007.01272

Submission history

From: Sébastien Ehrhardt
[v1] Thu, 2 Jul 2020 17:27:27 UTC (6,025 KB)
[v2] Mon, 9 Nov 2020 18:03:58 UTC (42,486 KB)
