Showing 1–50 of 76 results for author: Zollhofer, M

  1. arXiv:2403.01807  [pdf, other]

    cs.CV

    ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

    Authors: Lukas Höllein, Aljaž Božič, Norman Müller, David Novotny, Hung-Yu Tseng, Christian Richardt, Michael Zollhöfer, Matthias Nießner

    Abstract: 3D asset generation is getting massive amounts of attention, inspired by the recent success of text-guided 2D content creation. Existing text-to-3D methods use pretrained text-to-image diffusion models in an optimization problem or fine-tune them on synthetic data, which often results in non-photorealistic 3D objects without backgrounds. In this paper, we present a method that leverages pretrained…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024, project page: https://lukashoel.github.io/ViewDiff/, video: https://www.youtube.com/watch?v=SdjoCqHzMMk, code: https://github.com/facebookresearch/ViewDiff

  2. arXiv:2401.12977  [pdf, other]

    cs.CV cs.GR

    IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images

    Authors: Zhi-Hao Lin, Jia-Bin Huang, Zhengqin Li, Zhao Dong, Christian Richardt, Tuotuo Li, Michael Zollhöfer, Johannes Kopf, Shenlong Wang, Changil Kim

    Abstract: While numerous 3D reconstruction and novel-view synthesis methods allow for photorealistic rendering of a scene from multi-view images easily captured with consumer cameras, they bake illumination in their representations and fall short of supporting advanced applications like material editing, relighting, and virtual object insertion. The reconstruction of physically based material properties and…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Project Website: https://irisldr.github.io/

  3. arXiv:2401.05334  [pdf, other]

    cs.CV cs.GR

    URHand: Universal Relightable Hands

    Authors: Zhaoxi Chen, Gyeongsik Moon, Kaiwen Guo, Chen Cao, Stanislav Pidhorskyi, Tomas Simon, Rohan Joshi, Yuan Dong, Yichen Xu, Bernardo Pires, He Wen, Lucas Evans, Bo Peng, Julia Buffalini, Autumn Trimble, Kevyn McPhail, Melissa Schoeller, Shoou-I Yu, Javier Romero, Michael Zollhöfer, Yaser Sheikh, Ziwei Liu, Shunsuke Saito

    Abstract: Existing photorealistic relightable hand models require extensive identity-specific observations in different views, poses, and illuminations, and face challenges in generalizing to natural illuminations and novel identities. To bridge this gap, we present URHand, the first universal relightable hand model that generalizes across viewpoints, poses, illuminations, and identities. Our model allows f…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: Project Page https://frozenburning.github.io/projects/urhand/

  4. arXiv:2312.13102  [pdf, other]

    cs.CV

    SpecNeRF: Gaussian Directional Encoding for Specular Reflections

    Authors: Li Ma, Vasu Agrawal, Haithem Turki, Changil Kim, Chen Gao, Pedro Sander, Michael Zollhöfer, Christian Richardt

    Abstract: Neural radiance fields have achieved remarkable performance in modeling the appearance of 3D scenes. However, existing approaches still struggle with the view-dependent appearance of glossy surfaces, especially under complex lighting of indoor environments. Unlike existing methods, which typically assume distant lighting like an environment map, we propose a learnable Gaussian directional encoding…

    Submitted 16 May, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Accepted to CVPR2024 as Highlight, Project page: https://limacv.github.io/SpecNeRF_web/

  5. arXiv:2312.08679  [pdf, other]

    cs.CV cs.AI cs.GR

    A Local Appearance Model for Volumetric Capture of Diverse Hairstyle

    Authors: Ziyan Wang, Giljoo Nam, Aljaz Bozic, Chen Cao, Jason Saragih, Michael Zollhoefer, Jessica Hodgins

    Abstract: Hair plays a significant role in personal identity and appearance, making it an essential component of high-quality, photorealistic avatars. Existing approaches either focus on modeling the facial region only or rely on personalized models, limiting their generalizability and scalability. In this paper, we present a novel method for creating high-fidelity avatars with diverse hairstyles. Our metho…

    Submitted 14 December, 2023; originally announced December 2023.

  6. arXiv:2312.03160  [pdf, other]

    cs.CV cs.GR cs.LG

    HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces

    Authors: Haithem Turki, Vasu Agrawal, Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder, Deva Ramanan, Michael Zollhöfer, Christian Richardt

    Abstract: Neural radiance fields provide state-of-the-art view synthesis quality but tend to be slow to render. One reason is that they make use of volume rendering, thus requiring many samples (and model queries) per ray at render time. Although this representation is flexible and easy to optimize, most real-world objects can be modeled more efficiently with surfaces instead of volumes, requiring far fewer…

    Submitted 27 March, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 Project page: https://haithemturki.com/hybrid-nerf/

  7. arXiv:2312.00252  [pdf, other]

    cs.CV cs.GR cs.LG

    PyNeRF: Pyramidal Neural Radiance Fields

    Authors: Haithem Turki, Michael Zollhöfer, Christian Richardt, Deva Ramanan

    Abstract: Neural Radiance Fields (NeRFs) can be dramatically accelerated by spatial grid representations. However, they do not explicitly reason about scale and so introduce aliasing artifacts when reconstructing scenes captured at different camera distances. Mip-NeRF and its extensions propose scale-aware renderers that project volumetric frustums rather than point samples but such approaches rely on posit…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: Neurips 2023 Project page: https://haithemturki.com/pynerf/

  8. arXiv:2311.08581  [pdf, other]

    cs.CV

    Drivable 3D Gaussian Avatars

    Authors: Wojciech Zielonka, Timur Bagautdinov, Shunsuke Saito, Michael Zollhöfer, Justus Thies, Javier Romero

    Abstract: We present Drivable 3D Gaussian Avatars (D3GA), the first 3D controllable model for human bodies rendered with Gaussian splats. Current photorealistic drivable avatars require either accurate 3D registrations during training, dense input images during testing, or both. The ones based on neural radiance fields also tend to be prohibitively slow for telepresence applications. This work uses the rece…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: Website: https://zielon.github.io/d3ga/

  9. VR-NeRF: High-Fidelity Virtualized Walkable Spaces

    Authors: Linning Xu, Vasu Agrawal, William Laney, Tony Garcia, Aayush Bansal, Changil Kim, Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder, Aljaž Božič, Dahua Lin, Michael Zollhöfer, Christian Richardt

    Abstract: We present an end-to-end system for the high-fidelity capture, model reconstruction, and real-time rendering of walkable spaces in virtual reality using neural radiance fields. To this end, we designed and built a custom multi-camera rig to densely capture walkable spaces in high fidelity and with multi-view high dynamic range images in unprecedented quality and density. We extend instant neural g…

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: SIGGRAPH Asia 2023; Project page: https://vr-nerf.github.io

  10. Diffusion Posterior Illumination for Ambiguity-aware Inverse Rendering

    Authors: Linjie Lyu, Ayush Tewari, Marc Habermann, Shunsuke Saito, Michael Zollhöfer, Thomas Leimkühler, Christian Theobalt

    Abstract: Inverse rendering, the process of inferring scene properties from images, is a challenging inverse problem. The task is ill-posed, as many different scene configurations can give rise to the same image. Most existing solutions incorporate priors into the inverse-rendering pipeline to encourage plausible solutions, but they do not consider the inherent ambiguities and the multi-modal distribution o…

    Submitted 30 September, 2023; originally announced October 2023.

    Comments: SIGGRAPH Asia 2023

  11. arXiv:2308.08258  [pdf, other]

    cs.CV cs.GR

    SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes

    Authors: Edith Tretschk, Vladislav Golyanik, Michael Zollhoefer, Aljaz Bozic, Christoph Lassner, Christian Theobalt

    Abstract: Existing methods for the 4D reconstruction of general, non-rigidly deforming objects focus on novel-view synthesis and neglect correspondences. However, time consistency enables advanced downstream tasks like 3D editing, motion analysis, or virtual-asset creation. We propose SceNeRFlow to reconstruct a general, non-rigid scene in a time-consistent manner. Our dynamic-NeRF method takes multi-view R…

    Submitted 16 August, 2023; originally announced August 2023.

    Comments: Project page: https://vcai.mpi-inf.mpg.de/projects/scenerflow/

  12. arXiv:2301.02238  [pdf, other]

    cs.CV

    HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling

    Authors: Benjamin Attal, Jia-Bin Huang, Christian Richardt, Michael Zollhoefer, Johannes Kopf, Matthew O'Toole, Changil Kim

    Abstract: Volumetric scene representations enable photorealistic view synthesis for static scenes and form the basis of several existing 6-DoF video techniques. However, the volume rendering procedures that drive these representations necessitate careful trade-offs in terms of quality, rendering speed, and memory efficiency. In particular, existing methods fail to simultaneously achieve real-time performanc…

    Submitted 29 May, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

    Comments: Project page: https://hyperreel.github.io/

  13. arXiv:2212.00613  [pdf, other]

    cs.CV cs.GR

    NeuWigs: A Neural Dynamic Model for Volumetric Hair Capture and Animation

    Authors: Ziyan Wang, Giljoo Nam, Tuur Stuyck, Stephen Lombardi, Chen Cao, Jason Saragih, Michael Zollhoefer, Jessica Hodgins, Christoph Lassner

    Abstract: The capture and animation of human hair are two of the major challenges in the creation of realistic avatars for the virtual reality. Both problems are highly challenging, because hair has complex geometry and appearance, as well as exhibits challenging motion. In this paper, we present a two-stage approach that models hair independently from the head to address these challenges in a data-driven m…

    Submitted 11 October, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

  14. arXiv:2210.12003  [pdf, other]

    cs.CV

    HDHumans: A Hybrid Approach for High-fidelity Digital Humans

    Authors: Marc Habermann, Lingjie Liu, Weipeng Xu, Gerard Pons-Moll, Michael Zollhoefer, Christian Theobalt

    Abstract: Photo-real digital human avatars are of enormous importance in graphics, as they enable immersive communication over the globe, improve gaming and entertainment experiences, and can be particularly beneficial for AR and VR settings. However, current avatar generation approaches either fall short in high-fidelity novel view synthesis, generalization to novel motions, reproduction of loose clothing,…

    Submitted 14 July, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  15. arXiv:2207.10663  [pdf, other]

    cs.CV cs.GR

    Neural Pixel Composition: 3D-4D View Synthesis from Multi-Views

    Authors: Aayush Bansal, Michael Zollhoefer

    Abstract: We present Neural Pixel Composition (NPC), a novel approach for continuous 3D-4D view synthesis given only a discrete set of multi-view observations as input. Existing state-of-the-art approaches require dense multi-view supervision and an extensive computational budget. The proposed formulation reliably operates on sparse and wide-baseline multi-view imagery and can be trained efficiently within…

    Submitted 21 July, 2022; originally announced July 2022.

    Comments: A technical report on 3D-4D view synthesis (40 pages, 22 figures and 18 tables). High-resolution version of paper: http://www.aayushbansal.xyz/npc/npc_hi-res.pdf. Project page (containing video results): http://www.aayushbansal.xyz/npc/

  16. arXiv:2207.10312  [pdf, other]

    cs.CV cs.GR

    AdaNeRF: Adaptive Sampling for Real-time Rendering of Neural Radiance Fields

    Authors: Andreas Kurz, Thomas Neff, Zhaoyang Lv, Michael Zollhöfer, Markus Steinberger

    Abstract: Novel view synthesis has recently been revolutionized by learning neural radiance fields directly from sparse observations. However, rendering images with this new paradigm is slow due to the fact that an accurate quadrature of the volume rendering equation requires a large number of samples for each ray. Previous work has mainly focused on speeding up the network evaluations that are associated w…

    Submitted 28 July, 2022; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: ECCV 2022. Project page: https://thomasneff.github.io/adanerf

  17. arXiv:2206.08929  [pdf, other]

    cs.CV cs.AI

    TAVA: Template-free Animatable Volumetric Actors

    Authors: Ruilong Li, Julian Tanke, Minh Vo, Michael Zollhofer, Jurgen Gall, Angjoo Kanazawa, Christoph Lassner

    Abstract: Coordinate-based volumetric representations have the potential to generate photo-realistic virtual avatars from images. However, virtual avatars also need to be controllable even to a novel pose that may not have been observed. Traditional techniques, such as LBS, provide such a function; yet it usually requires a hand-designed body template, 3D scan data, and limited appearance models. On the oth…

    Submitted 20 June, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: Code: https://github.com/facebookresearch/tava; Project Website: https://www.liruilong.cn/projects/tava/

  18. arXiv:2205.08525  [pdf, other]

    cs.CV

    Self-supervised Neural Articulated Shape and Appearance Models

    Authors: Fangyin Wei, Rohan Chabra, Lingni Ma, Christoph Lassner, Michael Zollhöfer, Szymon Rusinkiewicz, Chris Sweeney, Richard Newcombe, Mira Slavcheva

    Abstract: Learning geometry, motion, and appearance priors of object classes is important for the solution of a large variety of computer vision problems. While the majority of approaches has focused on static objects, dynamic objects, especially with controllable articulation, are less explored. We propose a novel approach for learning a representation of the geometry, appearance, and motion of a class of…

    Submitted 17 May, 2022; originally announced May 2022.

    Comments: 15 pages. CVPR 2022. Project page available at https://weify627.github.io/nasam/

  19. arXiv:2205.04992  [pdf, other]

    cs.CV

    KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints

    Authors: Marko Mihajlovic, Aayush Bansal, Michael Zollhoefer, Siyu Tang, Shunsuke Saito

    Abstract: Image-based volumetric humans using pixel-aligned features promise generalization to unseen poses and identities. Prior work leverages global spatial encodings and multi-view geometric consistency to reduce spatial ambiguity. However, global encodings often suffer from overfitting to the distribution of the training data, and it is difficult to learn multi-view consistent reconstruction from spars…

    Submitted 21 July, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: To appear at ECCV 2022. The project page is available at https://markomih.github.io/KeypointNeRF

  20. arXiv:2204.06184  [pdf, other]

    cs.CV

    COAP: Compositional Articulated Occupancy of People

    Authors: Marko Mihajlovic, Shunsuke Saito, Aayush Bansal, Michael Zollhoefer, Siyu Tang

    Abstract: We present a novel neural implicit representation for articulated human bodies. Compared to explicit template meshes, neural implicit body representations provide an efficient mechanism for modeling interactions with the environment, which is essential for human motion reconstruction and synthesis in 3D scenes. However, existing neural implicit bodies suffer from either poor generalization on high…

    Submitted 13 April, 2022; originally announced April 2022.

    Comments: To appear at CVPR 2022. The project page is available at https://neuralbodies.github.io/COAP/index.html

  21. arXiv:2204.02296  [pdf, other]

    cs.RO cs.CV

    iSDF: Real-Time Neural Signed Distance Fields for Robot Perception

    Authors: Joseph Ortiz, Alexander Clegg, Jing Dong, Edgar Sucar, David Novotny, Michael Zollhoefer, Mustafa Mukadam

    Abstract: We present iSDF, a continual learning system for real-time signed distance field (SDF) reconstruction. Given a stream of posed depth images from a moving camera, it trains a randomly initialised neural network to map input 3D coordinate to approximate signed distance. The model is self-supervised by minimising a loss that bounds the predicted signed distance using the distance to the closest sampl…

    Submitted 4 May, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Published in Robotics: Science and Systems (RSS) 2022. Project page: https://joeaortiz.github.io/iSDF/

  22. arXiv:2204.00161  [pdf, other]

    cs.HC cs.AI cs.CV

    Mutual Scene Synthesis for Mixed Reality Telepresence

    Authors: Mohammad Keshavarzi, Michael Zollhoefer, Allen Y. Yang, Patrick Peluse, Luisa Caldas

    Abstract: Remote telepresence via next-generation mixed reality platforms can provide higher levels of immersion for computer-mediated communications, allowing participants to engage in a wide spectrum of activities, previously not possible in 2D screen-based communication methods. However, as mixed reality experiences are limited to the local physical surrounding of each user, finding a common virtual grou…

    Submitted 31 March, 2022; originally announced April 2022.

    Comments: 11 pages

  23. arXiv:2203.13817  [pdf, other]

    cs.CV cs.GR

    AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling

    Authors: Ziqian Bai, Timur Bagautdinov, Javier Romero, Michael Zollhöfer, Ping Tan, Shunsuke Saito

    Abstract: Neural fields such as implicit surfaces have recently enabled avatar modeling from raw scans without explicit temporal correspondences. In this work, we exploit autoregressive modeling to further extend this notion to capture dynamic effects, such as soft-tissue deformations. Although autoregressive models are naturally capable of handling dynamics, it is non-trivial to apply them to implicit repr…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: Project page: https://zqbai-jeremy.github.io/autoavatar

  24. arXiv:2112.06904  [pdf, other]

    cs.CV cs.GR

    HVH: Learning a Hybrid Neural Volumetric Representation for Dynamic Hair Performance Capture

    Authors: Ziyan Wang, Giljoo Nam, Tuur Stuyck, Stephen Lombardi, Michael Zollhoefer, Jessica Hodgins, Christoph Lassner

    Abstract: Capturing and rendering life-like hair is particularly challenging due to its fine geometric structure, the complex physical interaction and its non-trivial visual appearance. Yet, hair is a critical component for believable avatars. In this paper, we address the aforementioned problems: 1) we use a novel, volumetric hair representation that is composed of thousands of primitives. Each primitive c…

    Submitted 19 December, 2021; v1 submitted 13 December, 2021; originally announced December 2021.

  25. arXiv:2112.01523  [pdf, other]

    cs.CV

    Learning Neural Light Fields with Ray-Space Embedding Networks

    Authors: Benjamin Attal, Jia-Bin Huang, Michael Zollhoefer, Johannes Kopf, Changil Kim

    Abstract: Neural radiance fields (NeRFs) produce state-of-the-art view synthesis results. However, they are slow to render, requiring hundreds of network evaluations per pixel to approximate a volume rendering integral. Baking NeRFs into explicit data structures enables efficient rendering, but results in a large increase in memory footprint and, in many cases, a quality reduction. In this paper, we propose…

    Submitted 10 May, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: CVPR 2022 camera ready revision. Major changes include: 1. Additional comparison to NeX on Stanford, RealFF, Shiny datasets 2. Experiment on 360 degree lego bulldozer scene in the appendix, using Pluecker parameterization 3. Moving student-teacher results to the appendix 4. Clarity edits -- in particular, making it clear that our Stanford evaluation *does not* use subdivision

  26. arXiv:2111.10563  [pdf, other]

    cs.CV

    A Deeper Look into DeepCap

    Authors: Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

    Abstract: Human performance capture is a highly important computer vision problem with many applications in movie production and virtual/augmented reality. Many previous performance capture approaches either required expensive multi-view setups or did not recover dense space-time coherent geometry with frame-to-frame correspondences. We propose a novel deep learning approach for monocular dense human perfor…

    Submitted 20 November, 2021; originally announced November 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2003.08325

  27. arXiv:2111.05849  [pdf, other]

    cs.GR cs.CV

    Advances in Neural Rendering

    Authors: Ayush Tewari, Justus Thies, Ben Mildenhall, Pratul Srinivasan, Edgar Tretschk, Yifan Wang, Christoph Lassner, Vincent Sitzmann, Ricardo Martin-Brualla, Stephen Lombardi, Tomas Simon, Christian Theobalt, Matthias Niessner, Jonathan T. Barron, Gordon Wetzstein, Michael Zollhoefer, Vladislav Golyanik

    Abstract: Synthesizing photo-realistic images and videos is at the heart of computer graphics and has been the focus of decades of research. Traditionally, synthetic images of a scene are generated using rendering algorithms such as rasterization or ray tracing, which take specifically defined representations of geometry and material properties as input. Collectively, these inputs define the actual scene an…

    Submitted 30 March, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

    Comments: 33 pages, 14 figures, 5 tables; State of the Art Report at EUROGRAPHICS 2022

  28. arXiv:2107.02407  [pdf, other]

    cs.CV

    NRST: Non-rigid Surface Tracking from Monocular Video

    Authors: Marc Habermann, Weipeng Xu, Helge Rhodin, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

    Abstract: We propose an efficient method for non-rigid surface tracking from monocular RGB videos. Given a video and a template mesh, our algorithm sequentially registers the template non-rigidly to each frame. We formulate the per-frame registration as an optimization problem that includes a novel texture term specifically tailored towards tracking objects with uniform texture but fine-scale structure, suc…

    Submitted 12 July, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

  29. Real-time Deep Dynamic Characters

    Authors: Marc Habermann, Lingjie Liu, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

    Abstract: We propose a deep videorealistic 3D human character model displaying highly realistic shape, motion, and dynamic appearance learned in a new weakly supervised way from multi-view imagery. In contrast to previous work, our controllable 3D character displays dynamics, e.g., the swing of the skirt, dependent on skeletal body motion in an efficient data-driven way, without requiring complex physics si…

    Submitted 31 August, 2021; v1 submitted 4 May, 2021; originally announced May 2021.

    Journal ref: ACM Transactions on Graphics (SIGGRAPH 2021)

  30. arXiv:2104.08223  [pdf, other]

    cs.CV

    MeshTalk: 3D Face Animation from Speech using Cross-Modality Disentanglement

    Authors: Alexander Richard, Michael Zollhoefer, Yandong Wen, Fernando de la Torre, Yaser Sheikh

    Abstract: This paper presents a generic method for generating full facial 3D animation from speech. Existing approaches to audio-driven facial animation exhibit uncanny or static upper face animation, fail to produce accurate and plausible co-articulation or rely on person-specific models that limit their scalability. To improve upon existing models, we propose a generic audio-driven facial animation approa…

    Submitted 20 May, 2022; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: updated link to github repository and supplemental video

  31. arXiv:2103.02597  [pdf, other]

    cs.CV cs.GR

    Neural 3D Video Synthesis from Multi-view Video

    Authors: Tianye Li, Mira Slavcheva, Michael Zollhoefer, Simon Green, Christoph Lassner, Changil Kim, Tanner Schmidt, Steven Lovegrove, Michael Goesele, Richard Newcombe, Zhaoyang Lv

    Abstract: We propose a novel approach for 3D video synthesis that is able to represent multi-view video recordings of a dynamic real-world scene in a compact, yet expressive representation that enables high-quality view synthesis and motion interpolation. Our approach takes the high quality and compactness of static neural radiance fields in a new direction: to a model-free, dynamic setting. At the core of…

    Submitted 2 May, 2022; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: Accepted as an oral presentation for CVPR 2022. Project website: https://neural-3d-video.github.io/

  32. arXiv:2103.01954  [pdf, other]

    cs.GR cs.CV

    Mixture of Volumetric Primitives for Efficient Neural Rendering

    Authors: Stephen Lombardi, Tomas Simon, Gabriel Schwartz, Michael Zollhoefer, Yaser Sheikh, Jason Saragih

    Abstract: Real-time rendering and animation of humans is a core function in games, movies, and telepresence applications. Existing methods have a number of drawbacks we aim to address with our work. Triangle meshes have difficulty modeling thin structures like hair, volumetric representations like Neural Volumes are too low-resolution given a reasonable memory budget, and high-resolution implicit representa…

    Submitted 6 May, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

    Comments: 13 pages; SIGGRAPH 2021

  33. arXiv:2102.06199  [pdf, other]

    cs.CV cs.GR

    A-NeRF: Articulated Neural Radiance Fields for Learning Human Shape, Appearance, and Pose

    Authors: Shih-Yang Su, Frank Yu, Michael Zollhoefer, Helge Rhodin

    Abstract: While deep learning reshaped the classical motion capture pipeline with feed-forward networks, generative models are required to recover fine alignment via iterative refinement. Unfortunately, the existing models are usually hand-crafted or learned in controlled conditions, only applicable to limited domains. We propose a method to learn a generative neural body model from unlabelled monocular vid…

    Submitted 28 October, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: NeurIPS 2021. Project website: https://lemonatsu.github.io/anerf/

  34. arXiv:2101.02697  [pdf, other]

    cs.CV

    PVA: Pixel-aligned Volumetric Avatars

    Authors: Amit Raj, Michael Zollhoefer, Tomas Simon, Jason Saragih, Shunsuke Saito, James Hays, Stephen Lombardi

    Abstract: Acquisition and rendering of photo-realistic human heads is a highly challenging research problem of particular importance for virtual telepresence. Currently, the highest quality is achieved by volumetric approaches trained in a person specific manner on multi-view data. These models better represent fine structure, such as hair, compared to simpler mesh-based models. Volumetric models typically…

    Submitted 7 January, 2021; originally announced January 2021.

    Comments: Project page located at https://volumetric-avatars.github.io/

  35. arXiv:2012.12247  [pdf, other]

    cs.CV cs.GR

    Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video

    Authors: Edgar Tretschk, Ayush Tewari, Vladislav Golyanik, Michael Zollhöfer, Christoph Lassner, Christian Theobalt

    Abstract: We present Non-Rigid Neural Radiance Fields (NR-NeRF), a reconstruction and novel view synthesis approach for general non-rigid dynamic scenes. Our approach takes RGB images of a dynamic scene as input (e.g., from a monocular video recording), and creates a high-quality space-time geometry and appearance representation. We show that a single handheld consumer-grade camera is sufficient to synthesi…

    Submitted 23 August, 2021; v1 submitted 22 December, 2020; originally announced December 2020.

    Comments: Project page (incl. supplemental videos and code): https://vcai.mpi-inf.mpg.de/projects/nonrigid_nerf/ or https://gvv.mpi-inf.mpg.de/projects/nonrigid_nerf/

  36. arXiv:2012.09955  [pdf, other]

    cs.CV cs.GR

    Learning Compositional Radiance Fields of Dynamic Human Heads

    Authors: Ziyan Wang, Timur Bagautdinov, Stephen Lombardi, Tomas Simon, Jason Saragih, Jessica Hodgins, Michael Zollhöfer

    Abstract: Photorealistic rendering of dynamic humans is an important ability for telepresence systems, virtual shopping, synthetic data generation, and more. Recently, neural rendering methods, which combine techniques from computer graphics and machine learning, have created high-fidelity models of humans and objects. Some of these methods do not produce results with high-enough fidelity for driveable huma…

    Submitted 17 December, 2020; originally announced December 2020.

  37. arXiv:2012.03065  [pdf, other]

    cs.CV cs.GR

    Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction

    Authors: Guy Gafni, Justus Thies, Michael Zollhöfer, Matthias Nießner

    Abstract: We present dynamic neural radiance fields for modeling the appearance and dynamics of a human face. Digitally modeling and reconstructing a talking human is a key building-block for a variety of applications. Especially, for telepresence applications in AR or VR, a faithful reproduction of the appearance including novel viewpoints or head-poses is required. In contrast to state-of-the-art approach…

    Submitted 5 December, 2020; originally announced December 2020.

    Comments: Video: https://youtu.be/m7oROLdQnjk | Project page: https://gafniguy.github.io/4D-Facial-Avatars/

  38. arXiv:2012.01451  [pdf, other]

    cs.CV cs.GR cs.LG

    Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction

    Authors: Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Justus Thies, Angela Dai, Matthias Nießner

    Abstract: We introduce Neural Deformation Graphs for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects. Specifically, we implicitly model a deformation graph via a deep neural network. This neural deformation graph does not rely on any object-specific structure and, thus, can be applied to general non-rigid deformation tracking. Our method globally optimizes this neural gra…

    Submitted 2 December, 2020; originally announced December 2020.

    Comments: Video: https://youtu.be/vyq36eFkdWo

  39. arXiv:2009.09485  [pdf, other]

    cs.CV cs.GR

    PIE: Portrait Image Embedding for Semantic Control

    Authors: Ayush Tewari, Mohamed Elgharib, Mallikarjun B R., Florian Bernard, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt

    Abstract: Editing of portrait images is a very popular and important research topic with a large variety of applications. For ease of use, control should be provided via a semantically meaningful parameterization that is akin to computer animation controls. The vast majority of existing techniques do not provide such intuitive and fine-grained control, or only enable coarse editing of a single isolated cont…

    Submitted 20 September, 2020; originally announced September 2020.

    Comments: To appear in SIGGRAPH Asia 2020. Project webpage: https://gvv.mpi-inf.mpg.de/projects/PIE/

  40. arXiv:2008.01639  [pdf, other]

    cs.CV cs.GR

    PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations

    Authors: Edgar Tretschk, Ayush Tewari, Vladislav Golyanik, Michael Zollhöfer, Carsten Stoll, Christian Theobalt

    Abstract: Implicit surface representations, such as signed-distance functions, combined with deep learning have led to impressive models which can represent detailed shapes of objects with arbitrary topology. Since a continuous function is learned, the reconstructions can also be extracted at any arbitrary resolution. However, large datasets such as ShapeNet are required to train such models. In this paper,…

    Submitted 5 February, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: 25 pages, including supplementary material. Code: https://github.com/edgar-tr/patchnets Project page: https://gvv.mpi-inf.mpg.de/projects/PatchNets/

  41. arXiv:2007.14808  [pdf, other]

    cs.CV

    Face2Face: Real-time Face Capture and Reenactment of RGB Videos

    Authors: Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, Matthias Nießner

    Abstract: We present Face2Face, a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., Youtube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. To this end, we fi…

    Submitted 29 July, 2020; originally announced July 2020.

    Comments: https://justusthies.github.io/posts/acm-research-highlight/

    Journal ref: CVPR2016

  42. arXiv:2006.13240  [pdf, other]

    cs.CV cs.GR cs.LG eess.IV

    Neural Non-Rigid Tracking

    Authors: Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Angela Dai, Justus Thies, Matthias Nießner

    Abstract: We introduce a novel, end-to-end learnable, differentiable non-rigid tracker that enables state-of-the-art non-rigid reconstruction by a learned robust optimization. Given two input RGB-D frames of a non-rigidly moving object, we employ a convolutional neural network to predict dense correspondences and their confidences. These correspondences are used as constraints in an as-rigid-as-possible (AR…

    Submitted 12 January, 2021; v1 submitted 23 June, 2020; originally announced June 2020.

    Comments: Video: https://youtu.be/nqYaxM6Rj8I, Code: https://github.com/DeformableFriends/NeuralTracking

  43. arXiv:2004.07484  [pdf, other]

    cs.GR

    Pulsar: Efficient Sphere-based Neural Rendering

    Authors: Christoph Lassner, Michael Zollhöfer

    Abstract: We propose Pulsar, an efficient sphere-based differentiable renderer that is orders of magnitude faster than competing techniques, modular, and easy-to-use due to its tight integration with PyTorch. Differentiable rendering is the foundation for modern neural rendering approaches, since it enables end-to-end training of 3D scene representations from image observations. However, gradient-based opti…

    Submitted 22 December, 2020; v1 submitted 16 April, 2020; originally announced April 2020.

  44. arXiv:2004.03805  [pdf, other]

    cs.CV cs.GR

    State of the Art on Neural Rendering

    Authors: Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit Pandey, Sean Fanello, Gordon Wetzstein, Jun-Yan Zhu, Christian Theobalt, Maneesh Agrawala, Eli Shechtman, Dan B Goldman, Michael Zollhöfer

    Abstract: Efficient rendering of photo-realistic virtual worlds is a long standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer…

    Submitted 8 April, 2020; originally announced April 2020.

    Comments: Eurographics 2020 survey paper

  45. arXiv:2004.00121  [pdf, other]

    cs.CV cs.GR

    StyleRig: Rigging StyleGAN for 3D Control over Portrait Images

    Authors: Ayush Tewari, Mohamed Elgharib, Gaurav Bharaj, Florian Bernard, Hans-Peter Seidel, Patrick Pérez, Michael Zollhöfer, Christian Theobalt

    Abstract: StyleGAN generates photorealistic portrait images of faces with eyes, teeth, hair and context (neck, shoulders, background), but lacks a rig-like control over semantic face parameters that are interpretable in 3D, such as face pose, expressions, and scene illumination. Three-dimensional morphable face models (3DMMs) on the other hand offer control over the semantic parameters, but lack photorealis…

    Submitted 13 June, 2020; v1 submitted 31 March, 2020; originally announced April 2020.

    Comments: CVPR 2020 (Oral). Project page: https://gvv.mpi-inf.mpg.de/projects/StyleRig/

  46. arXiv:2003.08325  [pdf, other]

    cs.CV

    DeepCap: Monocular Human Performance Capture Using Weak Supervision

    Authors: Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

    Abstract: Human performance capture is a highly important computer vision problem with many applications in movie production and virtual/augmented reality. Many previous performance capture approaches either required expensive multi-view setups or did not recover dense space-time coherent geometry with frame-to-frame correspondences. We propose a novel deep learning approach for monocular dense human perfor…

    Submitted 18 March, 2020; originally announced March 2020.

  47. arXiv:2001.04947  [pdf, other]

    cs.GR cs.CV

    Neural Human Video Rendering by Learning Dynamic Textures and Rendering-to-Video Translation

    Authors: Lingjie Liu, Weipeng Xu, Marc Habermann, Michael Zollhoefer, Florian Bernard, Hyeongwoo Kim, Wenping Wang, Christian Theobalt

    Abstract: Synthesizing realistic videos of humans using neural networks has been a popular alternative to the conventional graphics-based rendering pipeline due to its high efficiency. Existing works typically formulate this as an image-to-image translation problem in 2D screen space, which leads to artifacts such as over-smoothing, missing body parts, and temporal instability of fine-scale detail, such as…

    Submitted 5 July, 2021; v1 submitted 14 January, 2020; originally announced January 2020.

  48. arXiv:1912.04302  [pdf, other]

    cs.CV cs.AI cs.GR

    DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data

    Authors: Aljaž Božič, Michael Zollhöfer, Christian Theobalt, Matthias Nießner

    Abstract: Applying data-driven approaches to non-rigid 3D reconstruction has been difficult, which we believe can be attributed to the lack of a large-scale training corpus. Unfortunately, this method fails for important cases such as highly non-rigid deformations. We first address this problem of lack of data by introducing a novel semi-supervised strategy to obtain dense inter-frame correspondences from a…

    Submitted 24 March, 2020; v1 submitted 9 December, 2019; originally announced December 2019.

    Comments: Video: https://youtu.be/OrHLacCDZVQ

  49. arXiv:1909.02518  [pdf, other]

    cs.CV cs.GR cs.LG

    Neural Style-Preserving Visual Dubbing

    Authors: Hyeongwoo Kim, Mohamed Elgharib, Michael Zollhöfer, Hans-Peter Seidel, Thabo Beeler, Christian Richardt, Christian Theobalt

    Abstract: Dubbing is a technique for translating video content from one language to another. However, state-of-the-art visual dubbing techniques directly copy facial expressions from source to target actors without considering identity-specific idiosyncrasies such as a unique type of smile. We present a style-preserving visual dubbing approach from single video inputs, which maintains the signature style of…

    Submitted 6 September, 2019; v1 submitted 5 September, 2019; originally announced September 2019.

    Comments: SIGGRAPH Asia 2019

  50. arXiv:1909.01815  [pdf, other]

    cs.CV cs.GR cs.LG

    3D Morphable Face Models -- Past, Present and Future

    Authors: Bernhard Egger, William A. P. Smith, Ayush Tewari, Stefanie Wuhrer, Michael Zollhoefer, Thabo Beeler, Florian Bernard, Timo Bolkart, Adam Kortylewski, Sami Romdhani, Christian Theobalt, Volker Blanz, Thomas Vetter

    Abstract: In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation, and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing direc…

    Submitted 16 April, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: ACM Transactions on Graphics (TOG)