AutoDecoding Latent 3D Diffusion Models

Ntavelis, Evangelos; Siarohin, Aliaksandr; Olszewski, Kyle; Wang, Chaoyang; Van Gool, Luc; Tulyakov, Sergey


arXiv:2307.05445 (cs)

[Submitted on 7 Jul 2023]

Abstract: We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core. The 3D autodecoder framework embeds properties learned from the target dataset in the latent space, which can then be decoded into a volumetric representation for rendering view-consistent appearance and geometry. We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations to learn a 3D diffusion from 2D images or monocular videos of rigid or articulated objects. Our approach is flexible enough to use either existing camera supervision or no camera information at all, instead efficiently learning it during training. Our evaluations demonstrate that our generation results outperform state-of-the-art alternatives on various benchmark datasets and metrics, including multi-view image datasets of synthetic objects, real in-the-wild videos of moving people, and a large-scale, real video dataset of static objects.
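
The abstract mentions robust normalization and de-normalization of the latent features but does not spell out the operations. One common way to build such a pair is to center by the median and scale by the interquartile range (IQR), which is less sensitive to outliers than mean and standard deviation. The sketch below is an illustration under that assumption, not the authors' implementation; the function names and the toy values are hypothetical.

```python
import statistics

def fit_robust_stats(values):
    """Return (median, iqr) for a flat list of latent feature values.

    Assumption: median/IQR statistics stand in for the paper's
    "robust normalization"; the abstract does not give the formula.
    """
    med = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        iqr = 1.0  # avoid division by zero for constant features
    return med, iqr

def normalize(values, med, iqr):
    """Map raw latents to a range better suited to a diffusion model."""
    return [(v - med) / iqr for v in values]

def denormalize(values, med, iqr):
    """Invert normalize() so sampled latents can be passed to the decoder."""
    return [v * iqr + med for v in values]

# Toy latent values with one outlier (hypothetical data).
latents = [0.0, 1.0, 2.0, 3.0, 4.0, 100.0]
med, iqr = fit_robust_stats(latents)
normed = normalize(latents, med, iqr)
restored = denormalize(normed, med, iqr)
```

In this toy example the outlier 100.0 barely affects the statistics, so the bulk of the values land in a narrow range around zero, and `denormalize` recovers the original values up to floating-point error.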

Comments:	Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2307.05445 [cs.CV]
	(or arXiv:2307.05445v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2307.05445

Submission history

From: Evangelos Ntavelis
[v1] Fri, 7 Jul 2023 17:59:14 UTC (2,712 KB)
