Showing 1–50 of 56 results for author: Sunkavalli, K

  1. arXiv:2405.14847  [pdf, other]

    cs.CV

    Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

    Authors: Liwen Wu, Sai Bi, Zexiang Xu, Fujun Luan, Kai Zhang, Iliyan Georgiev, Kalyan Sunkavalli, Ravi Ramamoorthi

    Abstract: Novel-view synthesis of specular objects like shiny metals or glossy paints remains a significant challenge. Not only the glossy appearance but also global illumination effects, including reflections of other objects in the environment, are critical components to faithfully reproduce a scene. In this paper, we present Neural Directional Encoding (NDE), a view-dependent appearance encoding of neura…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  2. arXiv:2404.19702  [pdf, other]

    cs.CV

    GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting

    Authors: Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, Zexiang Xu

    Abstract: We propose GS-LRM, a scalable large reconstruction model that can predict high-quality 3D Gaussian primitives from 2-4 posed sparse images in 0.23 seconds on single A100 GPU. Our model features a very simple transformer-based architecture; we patchify input posed images, pass the concatenated multi-view image tokens through a sequence of transformer blocks, and decode final per-pixel Gaussian para…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: Project webpage: https://sai-bi.github.io/project/gs-lrm/

  3. arXiv:2404.12385  [pdf, other]

    cs.CV cs.GR

    MeshLRM: Large Reconstruction Model for High-Quality Mesh

    Authors: Xinyue Wei, Kai Zhang, Sai Bi, Hao Tan, Fujun Luan, Valentin Deschaintre, Kalyan Sunkavalli, Hao Su, Zexiang Xu

    Abstract: We propose MeshLRM, a novel LRM-based approach that can reconstruct a high-quality mesh from merely four input images in less than one second. Different from previous large reconstruction models (LRMs) that focus on NeRF-based reconstruction, MeshLRM incorporates differentiable mesh extraction and rendering within the LRM framework. This allows for end-to-end mesh reconstruction by fine-tuning a p…

    Submitted 18 April, 2024; originally announced April 2024.

  4. arXiv:2403.10615  [pdf, other]

    cs.CV cs.GR cs.LG

    LightIt: Illumination Modeling and Control for Diffusion Models

    Authors: Peter Kocsis, Julien Philip, Kalyan Sunkavalli, Matthias Nießner, Yannick Hold-Geoffroy

    Abstract: We introduce LightIt, a method for explicit illumination control for image generation. Recent generative methods lack lighting control, which is crucial to numerous artistic aspects of image generation such as setting the overall mood or cinematic appearance. To overcome these limitations, we propose to condition the generation on shading and normal maps. We model the lighting with single bounce s…

    Submitted 25 March, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: Project page: https://peter-kocsis.github.io/LightIt/ Video: https://youtu.be/cCfSBD5aPLI

    ACM Class: I.4.8; I.2.10

  5. arXiv:2311.12024  [pdf, other]

    cs.CV

    PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

    Authors: Peng Wang, Hao Tan, Sai Bi, Yinghao Xu, Fujun Luan, Kalyan Sunkavalli, Wenping Wang, Zexiang Xu, Kai Zhang

    Abstract: We propose a Pose-Free Large Reconstruction Model (PF-LRM) for reconstructing a 3D object from a few unposed images even with little visual overlap, while simultaneously estimating the relative camera poses in ~1.3 seconds on a single A100 GPU. PF-LRM is a highly scalable method utilizing the self-attention blocks to exchange information between 3D object tokens and 2D image tokens; we predict a c…

    Submitted 23 November, 2023; v1 submitted 20 November, 2023; originally announced November 2023.

    Comments: Project website: https://totoro97.github.io/pf-lrm ; add more experiments

  6. arXiv:2311.09217  [pdf, other]

    cs.CV

    DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

    Authors: Yinghao Xu, Hao Tan, Fujun Luan, Sai Bi, Peng Wang, Jiahao Li, Zifan Shi, Kalyan Sunkavalli, Gordon Wetzstein, Zexiang Xu, Kai Zhang

    Abstract: We propose DMV3D, a novel 3D generation approach that uses a transformer-based 3D large reconstruction model to denoise multi-view diffusion. Our reconstruction model incorporates a triplane NeRF representation and can denoise noisy multi-view images via NeRF reconstruction and rendering, achieving single-stage 3D generation in ~30s on single A100 GPU. We train DMV3D on larg…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Project Page: https://justimyhxu.github.io/projects/dmv3d/

  7. arXiv:2311.06214  [pdf, other]

    cs.CV

    Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model

    Authors: Jiahao Li, Hao Tan, Kai Zhang, Zexiang Xu, Fujun Luan, Yinghao Xu, Yicong Hong, Kalyan Sunkavalli, Greg Shakhnarovich, Sai Bi

    Abstract: Text-to-3D with diffusion models has achieved remarkable progress in recent years. However, existing methods either rely on score distillation-based optimization which suffer from slow inference, low diversity and Janus problems, or are feed-forward methods that generate low-quality results due to the scarcity of 3D training data. In this paper, we propose Instant3D, a novel method that generates…

    Submitted 23 November, 2023; v1 submitted 10 November, 2023; originally announced November 2023.

    Comments: Project webpage: https://jiahao.ai/instant3d/

  8. arXiv:2311.04400  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    LRM: Large Reconstruction Model for Single Image to 3D

    Authors: Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, Hao Tan

    Abstract: We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds. In contrast to many previous methods that are trained on small-scale datasets such as ShapeNet in a category-specific fashion, LRM adopts a highly scalable transformer-based architecture with 500 million learnable parameters to directly predict a neural rad…

    Submitted 9 March, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: ICLR 2024

  9. arXiv:2309.11009  [pdf, other]

    cs.CV

    Controllable Dynamic Appearance for Neural 3D Portraits

    Authors: ShahRukh Athar, Zhixin Shu, Zexiang Xu, Fujun Luan, Sai Bi, Kalyan Sunkavalli, Dimitris Samaras

    Abstract: Recent advances in Neural Radiance Fields (NeRFs) have made it possible to reconstruct and reanimate dynamic portrait scenes with control over head-pose, facial expressions and viewing direction. However, training such models assumes photometric consistency over the deformed region e.g. the face must be evenly lit as it deforms with changing head-pose and facial expression. Such photometric consis…

    Submitted 21 September, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

  10. arXiv:2305.17134  [pdf, other]

    cs.CV

    NeuManifold: Neural Watertight Manifold Reconstruction with Efficient and High-Quality Rendering Support

    Authors: Xinyue Wei, Fanbo Xiang, Sai Bi, Anpei Chen, Kalyan Sunkavalli, Zexiang Xu, Hao Su

    Abstract: We present a method for generating high-quality watertight manifold meshes from multi-view input images. Existing volumetric rendering methods are robust in optimization but tend to generate noisy meshes with poor topology. Differentiable rasterization-based methods can generate high-quality meshes but are sensitive to initialization. Our method combines the benefits of both worlds; we take the ge…

    Submitted 6 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Project page: https://sarahweiii.github.io/neumanifold/

  11. arXiv:2305.12296  [pdf, other]

    cs.CV cs.AI cs.GR

    PhotoMat: A Material Generator Learned from Single Flash Photos

    Authors: Xilong Zhou, Miloš Hašan, Valentin Deschaintre, Paul Guerrero, Yannick Hold-Geoffroy, Kalyan Sunkavalli, Nima Khademi Kalantari

    Abstract: Authoring high-quality digital materials is key to realism in 3D rendering. Previous generative models for materials have been trained exclusively on synthetic data; such data is limited in availability and has a visual gap to real materials. We circumvent this limitation by proposing PhotoMat: the first material generator trained exclusively on real photos of material samples captured using a cel…

    Submitted 23 May, 2023; v1 submitted 20 May, 2023; originally announced May 2023.

    Journal ref: Siggraph 2023

  12. arXiv:2212.10699  [pdf, other]

    cs.CV cs.GR

    PaletteNeRF: Palette-based Appearance Editing of Neural Radiance Fields

    Authors: Zhengfei Kuang, Fujun Luan, Sai Bi, Zhixin Shu, Gordon Wetzstein, Kalyan Sunkavalli

    Abstract: Recent advances in neural radiance fields have enabled the high-fidelity 3D reconstruction of complex scenes for novel view synthesis. However, it remains underexplored how the appearance of such representations can be efficiently edited while maintaining photorealism. In this work, we present PaletteNeRF, a novel method for photorealistic appearance editing of neural radiance fields (NeRF) base…

    Submitted 24 January, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  13. arXiv:2208.12300  [pdf, other]

    cs.CV

    A Deep Perceptual Measure for Lens and Camera Calibration

    Authors: Yannick Hold-Geoffroy, Dominique Piché-Meunier, Kalyan Sunkavalli, Jean-Charles Bazin, François Rameau, Jean-François Lalonde

    Abstract: Image editing and compositing have become ubiquitous in entertainment, from digital art to AR and VR experiences. To produce beautiful composites, the camera needs to be geometrically calibrated, which can be tedious and requires a physical calibration target. In place of the traditional multi-image calibration process, we propose to infer the camera calibration parameters such as pitch, roll, fie…

    Submitted 26 July, 2023; v1 submitted 25 August, 2022; originally announced August 2022.

    Comments: 12 pages, 12 figures, project page (including live demo) available at https://lvsn.github.io/deepcalib. arXiv admin note: text overlap with arXiv:1712.01259

  14. MatFormer: A Generative Model for Procedural Materials

    Authors: Paul Guerrero, Miloš Hašan, Kalyan Sunkavalli, Radomír Měch, Tamy Boubekeur, Niloy J. Mitra

    Abstract: Procedural material graphs are a compact, parametric, and resolution-independent representation that is a popular choice for material authoring. However, designing procedural materials requires significant expertise and publicly accessible libraries contain only a few thousand such graphs. We present MatFormer, a generative model that can produce a diverse set of high-quality procedural material…

    Submitted 15 August, 2022; v1 submitted 3 July, 2022; originally announced July 2022.

    Journal ref: ACM Transactions on Graphics, Volume 41, Issue 4 (Proceedings of Siggraph 2022)

  15. arXiv:2207.00757  [pdf, other]

    cs.CV

    PhotoScene: Photorealistic Material and Lighting Transfer for Indoor Scenes

    Authors: Yu-Ying Yeh, Zhengqin Li, Yannick Hold-Geoffroy, Rui Zhu, Zexiang Xu, Miloš Hašan, Kalyan Sunkavalli, Manmohan Chandraker

    Abstract: Most indoor 3D scene reconstruction methods focus on recovering 3D geometry and scene layout. In this work, we go beyond this to propose PhotoScene, a framework that takes input image(s) of a scene along with approximately aligned CAD geometry (either reconstructed automatically or manually specified) and builds a photorealistic digital twin with high-quality materials and similar lighting. We mod…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: Accepted to CVPR 2022; Code is available at https://github.com/ViLab-UCSD/photoscene

  16. arXiv:2206.06481  [pdf, other]

    cs.CV

    RigNeRF: Fully Controllable Neural 3D Portraits

    Authors: ShahRukh Athar, Zexiang Xu, Kalyan Sunkavalli, Eli Shechtman, Zhixin Shu

    Abstract: Volumetric neural rendering methods, such as neural radiance fields (NeRFs), have enabled photo-realistic novel view synthesis. However, in their standard form, NeRFs do not support the editing of objects, such as a human head, within a scene. In this work, we propose RigNeRF, a system that goes beyond just novel view synthesis and enables full control of head pose and facial expressions learned f…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: The project page can be found here: http://shahrukhathar.github.io/2022/06/06/RigNeRF.html

  17. arXiv:2206.05649  [pdf, other]

    cs.GR cs.CV

    TileGen: Tileable, Controllable Material Generation and Capture

    Authors: Xilong Zhou, Miloš Hašan, Valentin Deschaintre, Paul Guerrero, Kalyan Sunkavalli, Nima Kalantari

    Abstract: Recent methods (e.g. MaterialGAN) have used unconditional GANs to generate per-pixel material maps, or as a prior to reconstruct materials from input photographs. These models can generate varied random material appearance, but do not have any mechanism to constrain the generated material to a specific category or to control the coarse structure of the generated material, such as the exact brick l…

    Submitted 19 June, 2022; v1 submitted 11 June, 2022; originally announced June 2022.

    Comments: 18 pages, 19 figures

  18. arXiv:2206.05344  [pdf, other]

    cs.GR cs.CV

    Differentiable Rendering of Neural SDFs through Reparameterization

    Authors: Sai Praveen Bangaru, Michaël Gharbi, Tzu-Mao Li, Fujun Luan, Kalyan Sunkavalli, Miloš Hašan, Sai Bi, Zexiang Xu, Gilbert Bernstein, Frédo Durand

    Abstract: We present a method to automatically compute correct gradients with respect to geometric scene parameters in neural SDF renderers. Recent physically-based differentiable rendering techniques for meshes have used edge-sampling to handle discontinuities, particularly at object silhouettes, but SDFs do not have a simple parametric form amenable to sampling. Instead, our approach builds on area-sampli…

    Submitted 10 June, 2022; originally announced June 2022.

  19. arXiv:2205.09343  [pdf, other]

    cs.CV

    Physically-Based Editing of Indoor Scene Lighting from a Single Image

    Authors: Zhengqin Li, Jia Shi, Sai Bi, Rui Zhu, Kalyan Sunkavalli, Miloš Hašan, Zexiang Xu, Ravi Ramamoorthi, Manmohan Chandraker

    Abstract: We present a method to edit complex indoor lighting from a single image with its predicted depth and light source segmentation masks. This is an extremely challenging problem that requires modeling complex light transport, and disentangling HDR lighting from material and geometry with only a partial LDR observation of the scene. We tackle this problem using two novel components: 1) a holistic scen…

    Submitted 23 July, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

  20. arXiv:2203.11283  [pdf, other]

    cs.CV cs.GR

    NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction

    Authors: Xiaoshuai Zhang, Sai Bi, Kalyan Sunkavalli, Hao Su, Zexiang Xu

    Abstract: While NeRF has shown great success for neural reconstruction and rendering, its limited MLP capacity and long per-scene optimization times make it challenging to model large-scale indoor scenes. In contrast, classical 3D reconstruction methods can handle large-scale scenes but do not produce realistic renderings. We propose NeRFusion, a method that combines the advantages of NeRF and TSDF-based fu…

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  21. arXiv:2203.08216  [pdf, other]

    cs.CV

    Interactive Portrait Harmonization

    Authors: Jeya Maria Jose Valanarasu, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Jose Echevarria, Yinglan Ma, Zijun Wei, Kalyan Sunkavalli, Vishal M. Patel

    Abstract: Current image harmonization methods consider the entire background as the guidance for harmonization. However, this may limit the capability for user to choose any specific object/person in the background to guide the harmonization. To enable flexible interaction between user and harmonization, we introduce interactive harmonization, a new setting where the harmonization is performed with respect…

    Submitted 15 March, 2022; originally announced March 2022.

  22. arXiv:2201.08845  [pdf, other]

    cs.CV

    Point-NeRF: Point-based Neural Radiance Fields

    Authors: Qiangeng Xu, Zexiang Xu, Julien Philip, Sai Bi, Zhixin Shu, Kalyan Sunkavalli, Ulrich Neumann

    Abstract: Volumetric neural rendering methods like NeRF generate high-quality view synthesis results but are optimized per-scene leading to prohibitive reconstruction time. On the other hand, deep multi-view stereo methods can quickly reconstruct scene geometry via direct network inference. Point-NeRF combines the advantages of these two approaches by using neural 3D point clouds, with associated neural fea…

    Submitted 15 March, 2023; v1 submitted 21 January, 2022; originally announced January 2022.

    Comments: Accepted to CVPR 2022 (Oral)

    Journal ref: In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5438-5448) (2022)

  23. arXiv:2108.06805  [pdf, other]

    cs.CV

    SSH: A Self-Supervised Framework for Image Harmonization

    Authors: Yifan Jiang, He Zhang, Jianming Zhang, Yilin Wang, Zhe Lin, Kalyan Sunkavalli, Simon Chen, Sohrab Amirghodsi, Sarah Kong, Zhangyang Wang

    Abstract: Image harmonization aims to improve the quality of image compositing by matching the "appearance" (e.g., color tone, brightness and contrast) between foreground and background images. However, collecting large-scale annotated datasets for this task requires complex professional retouching. Instead, we propose a novel Self-Supervised Harmonization framework (SSH) that can be trained using just "free…

    Submitted 17 August, 2021; v1 submitted 15 August, 2021; originally announced August 2021.

    Comments: Accepted by ICCV'2021

  24. arXiv:2103.00762  [pdf, other]

    cs.CV

    NeuTex: Neural Texture Mapping for Volumetric Neural Rendering

    Authors: Fanbo Xiang, Zexiang Xu, Miloš Hašan, Yannick Hold-Geoffroy, Kalyan Sunkavalli, Hao Su

    Abstract: Recent work has demonstrated that volumetric scene representations combined with differentiable volume rendering can enable photo-realistic rendering for challenging scenes that mesh reconstruction fails on. However, these methods entangle geometry and appearance in a "black-box" volume that cannot be edited. Instead, we present an approach that explicitly disentangles geometry--represented as a c…

    Submitted 1 March, 2021; originally announced March 2021.

  25. arXiv:2012.05116  [pdf, other]

    cs.CV

    Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments

    Authors: Zhihao Xia, Michaël Gharbi, Federico Perazzi, Kalyan Sunkavalli, Ayan Chakrabarti

    Abstract: We introduce a neural network-based method to denoise pairs of images taken in quick succession, with and without a flash, in low-light environments. Our goal is to produce a high-quality rendering of the scene that preserves the color and mood from the ambient illumination of the noisy no-flash image, while recovering surface texture and detail revealed by the flash. Our network outputs a gain ma…

    Submitted 14 April, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

    Comments: CVPR 2021. Project page at https://www.cse.wustl.edu/~zhihao.xia/deepfnf/

  26. MaterialGAN: Reflectance Capture using a Generative SVBRDF Model

    Authors: Yu Guo, Cameron Smith, Miloš Hašan, Kalyan Sunkavalli, Shuang Zhao

    Abstract: We address the problem of reconstructing spatially-varying BRDFs from a small set of image measurements. This is a fundamentally under-constrained problem, and previous work has relied on using various regularization priors or on capturing many images to produce plausible results. In this work, we present MaterialGAN, a deep generative convolutional network based on StyleGAN2, trained to synthesiz…

    Submitted 30 September, 2020; originally announced October 2020.

    Comments: 13 pages, 16 figures. Siggraph Asia 2020

  27. arXiv:2008.03824  [pdf, other]

    cs.CV cs.GR

    Neural Reflectance Fields for Appearance Acquisition

    Authors: Sai Bi, Zexiang Xu, Pratul Srinivasan, Ben Mildenhall, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi

    Abstract: We present Neural Reflectance Fields, a novel deep scene representation that encodes volume density, normal and reflectance properties at any 3D point in a scene using a fully-connected neural network. We combine this representation with a physically-based differentiable ray marching framework that can render images from a neural reflectance field under any viewpoint and light. We demonstrate that…

    Submitted 16 August, 2020; v1 submitted 9 August, 2020; originally announced August 2020.

  28. arXiv:2008.01815  [pdf, other]

    cs.CV cs.GR

    Deep Multi Depth Panoramas for View Synthesis

    Authors: Kai-En Lin, Zexiang Xu, Ben Mildenhall, Pratul P. Srinivasan, Yannick Hold-Geoffroy, Stephen DiVerdi, Qi Sun, Kalyan Sunkavalli, Ravi Ramamoorthi

    Abstract: We propose a learning-based approach for novel view synthesis for multi-camera 360° panorama capture rigs. Previous work constructs RGBD panoramas from such data, allowing for view synthesis with small amounts of translation, but cannot handle the disocclusions and view-dependent effects that are caused by large translations. To address this issue, we present a novel scene representation…

    Submitted 4 August, 2020; originally announced August 2020.

    Comments: Published at the European Conference on Computer Vision, 2020

  29. arXiv:2007.12868  [pdf, other]

    cs.CV

    OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets

    Authors: Zhengqin Li, Ting-Wei Yu, Shen Sang, Sarah Wang, Meng Song, Yuhan Liu, Yu-Ying Yeh, Rui Zhu, Nitesh Gundavarapu, Jia Shi, Sai Bi, Zexiang Xu, Hong-Xing Yu, Kalyan Sunkavalli, Miloš Hašan, Ravi Ramamoorthi, Manmohan Chandraker

    Abstract: We propose a novel framework for creating large-scale photorealistic datasets of indoor scenes, with ground truth geometry, material, lighting and semantics. Our goal is to make the dataset creation process widely accessible, transforming scans into photorealistic datasets with high-quality ground truth for appearance, layout, semantic labels, high quality spatially-varying BRDF and complex lighti…

    Submitted 27 September, 2021; v1 submitted 25 July, 2020; originally announced July 2020.

  30. arXiv:2007.09892  [pdf, other]

    cs.CV cs.GR

    Deep Reflectance Volumes: Relightable Reconstructions from Multi-View Photometric Images

    Authors: Sai Bi, Zexiang Xu, Kalyan Sunkavalli, Miloš Hašan, Yannick Hold-Geoffroy, David Kriegman, Ravi Ramamoorthi

    Abstract: We present a deep learning approach to reconstruct scene appearance from unstructured images captured under collocated point lighting. At the heart of Deep Reflectance Volumes is a novel volumetric scene representation consisting of opacity, surface normal and reflectance voxel grids. We present a novel physically-based differentiable volume ray marching framework to render these scene volumes und…

    Submitted 20 July, 2020; originally announced July 2020.

    Comments: Accepted to ECCV 2020

  31. arXiv:2007.09529  [pdf, other]

    cs.CV

    Single View Metrology in the Wild

    Authors: Rui Zhu, Xingyi Yang, Yannick Hold-Geoffroy, Federico Perazzi, Jonathan Eisenmann, Kalyan Sunkavalli, Manmohan Chandraker

    Abstract: Most 3D reconstruction methods may only recover scene properties up to a global scale ambiguity. We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground as well as camera parameters of orientation and field of view, using just a monocular image acquired in unconstrained condition. Our…

    Submitted 23 February, 2021; v1 submitted 18 July, 2020; originally announced July 2020.

    Comments: ECCV 2020, camera-ready version

  32. arXiv:2004.03805  [pdf, other]

    cs.CV cs.GR

    State of the Art on Neural Rendering

    Authors: Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit Pandey, Sean Fanello, Gordon Wetzstein, Jun-Yan Zhu, Christian Theobalt, Maneesh Agrawala, Eli Shechtman, Dan B Goldman, Michael Zollhöfer

    Abstract: Efficient rendering of photo-realistic virtual worlds is a long standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer…

    Submitted 8 April, 2020; originally announced April 2020.

    Comments: Eurographics 2020 survey paper

  33. arXiv:2003.12649  [pdf, other]

    cs.CV

    Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement

    Authors: Sai Bi, Kalyan Sunkavalli, Federico Perazzi, Eli Shechtman, Vladimir Kim, Ravi Ramamoorthi

    Abstract: We present a method to improve the visual realism of low-quality, synthetic images, e.g. OpenGL renderings. Training an unpaired synthetic-to-real translation network in image space is severely under-constrained and produces visible artifacts. Instead, we propose a semi-supervised approach that operates on the disentangled shading and albedo layers of the image. Our two-stage pipeline first learns…

    Submitted 27 March, 2020; originally announced March 2020.

    Comments: Accepted to ICCV 2019

  34. arXiv:2003.12642  [pdf, other]

    cs.CV cs.GR

    Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images

    Authors: Sai Bi, Zexiang Xu, Kalyan Sunkavalli, David Kriegman, Ravi Ramamoorthi

    Abstract: We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object from a sparse set of only six images captured by wide-baseline cameras under collocated point lighting. We first estimate per-view depth maps using a deep multi-view stereo network; these depth maps are used to coarsely align the different views. We propose…

    Submitted 4 July, 2020; v1 submitted 27 March, 2020; originally announced March 2020.

    Comments: Accepted to CVPR 2020

  35. arXiv:1912.12297  [pdf, other]

    cs.GR eess.IV

    Automatic Scene Inference for 3D Object Compositing

    Authors: Kevin Karsch, Kalyan Sunkavalli, Sunil Hadap, Nathan Carr, Hailin Jin, Rafael Fonte, Michael Sittig

    Abstract: We present a user-friendly image editing system that supports a drag-and-drop object insertion (where the user merely drags objects into the image, and the system automatically places them in 3D and relights them appropriately), post-process illumination editing, and depth-of-field manipulation. Underlying our system is a fully automatic technique for recovering a comprehensive 3D scene model (geo…

    Submitted 24 December, 2019; originally announced December 2019.

  36. arXiv:1912.04421  [pdf, other]

    cs.CV

    Basis Prediction Networks for Effective Burst Denoising with Large Kernels

    Authors: Zhihao Xia, Federico Perazzi, Michaël Gharbi, Kalyan Sunkavalli, Ayan Chakrabarti

    Abstract: Bursts of images exhibit significant self-similarity across both time and space. This motivates a representation of the kernels as linear combinations of a small set of basis elements. To this end, we introduce a novel basis prediction network that, given an input burst, predicts a set of global basis kernels -- shared within the image -- and the corresponding mixing coefficients -- which are spec…

    Submitted 2 December, 2020; v1 submitted 9 December, 2019; originally announced December 2019.

    Comments: CVPR 2020. Project website at https://www.cse.wustl.edu/~zhihao.xia/bpn/

  37. arXiv:1910.08812  [pdf, other]

    cs.CV

    Deep Parametric Indoor Lighting Estimation

    Authors: Marc-André Gardner, Yannick Hold-Geoffroy, Kalyan Sunkavalli, Christian Gagné, Jean-François Lalonde

    Abstract: We present a method to estimate lighting from a single image of an indoor scene. Previous work has used an environment map representation that does not account for the localized nature of indoor lighting. Instead, we represent lighting as a set of discrete 3D lights with geometric and photometric parameters. We train a deep neural network to regress these parameters from a single image, on a datas…

    Submitted 19 October, 2019; originally announced October 2019.

  38. arXiv:1906.04909  [pdf, other]

    cs.CV

    All-Weather Deep Outdoor Lighting Estimation

    Authors: Jinsong Zhang, Kalyan Sunkavalli, Yannick Hold-Geoffroy, Sunil Hadap, Jonathan Eisenmann, Jean-François Lalonde

    Abstract: We present a neural network that predicts HDR outdoor illumination from a single LDR image. At the heart of our work is a method to accurately learn HDR lighting from LDR panoramas under any weather condition. We achieve this by training another CNN (on a combination of synthetic and real images) to take as input an LDR panorama, and regress the parameters of the Lalonde-Matthews outdoor illuminat…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: 8 pages, CVPR 19. Project page: http://lvsn.github.io/allweather

  39. arXiv:1906.03799  [pdf, other]

    cs.CV

    Fast Spatially-Varying Indoor Lighting Estimation

    Authors: Mathieu Garon, Kalyan Sunkavalli, Sunil Hadap, Nathan Carr, Jean-François Lalonde

    Abstract: We propose a real-time method to estimate spatially-varying indoor lighting from a single RGB image. Given an image and a 2D location in that image, our CNN estimates a 5th order spherical harmonic representation of the lighting at the given location in less than 20ms on a laptop mobile graphics card. While existing approaches estimate a single, global lighting representation or require depth as in…

    Submitted 10 June, 2019; originally announced June 2019.

    Comments: CVPR19

    Journal ref: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 6908-6917

  40. arXiv:1905.02722  [pdf, other]

    cs.CV

    Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF from a Single Image

    Authors: Zhengqin Li, Mohammad Shafiei, Ravi Ramamoorthi, Kalyan Sunkavalli, Manmohan Chandraker

    Abstract: We propose a deep inverse rendering framework for indoor scenes. From a single RGB image of an arbitrary indoor scene, we create a complete scene reconstruction, estimating shape, spatially-varying lighting, and spatially-varying, non-Lambertian surface reflectance. To train this network, we augment the SUNCG indoor scene dataset with real-world materials and render them with a fast, high-quality,…

    Submitted 7 May, 2019; originally announced May 2019.

  41. arXiv:1811.12481  [pdf, other

    cs.CV

    Learning to Separate Multiple Illuminants in a Single Image

    Authors: Zhuo Hui, Ayan Chakrabarti, Kalyan Sunkavalli, Aswin C. Sankaranarayanan

    Abstract: We present a method to separate a single image captured under two illuminants, with different spectra, into the two images corresponding to the appearance of the scene under each individual illuminant. We do this by training a deep neural network to predict the per-pixel reflectance chromaticity of the scene, which we use in conjunction with a previous flash/no-flash image-based separation algorit…

    Submitted 22 April, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

  42. arXiv:1808.04545  [pdf, other

    cs.LG cs.AI cs.CV cs.GR stat.ML

    MT-VAE: Learning Motion Transformations to Generate Multimodal Human Dynamics

    Authors: Xinchen Yan, Akash Rastogi, Ruben Villegas, Kalyan Sunkavalli, Eli Shechtman, Sunil Hadap, Ersin Yumer, Honglak Lee

    Abstract: Long-term human motion can be represented as a series of motion modes---motion sequences that capture short-term temporal dynamics---with transitions between them. We leverage this structure and present a novel Motion Transformation Variational Auto-Encoders (MT-VAE) for learning motion sequence generation. Our model jointly learns a feature embedding for motion modes (that the motion sequence can…

    Submitted 14 August, 2018; originally announced August 2018.

    Comments: Published at ECCV 2018

  43. Self-supervised Multi-view Person Association and Its Applications

    Authors: Minh Vo, Ersin Yumer, Kalyan Sunkavalli, Sunil Hadap, Yaser Sheikh, Srinivasa Narasimhan

    Abstract: Reliable markerless motion tracking of people participating in a complex group activity from multiple moving cameras is challenging due to frequent occlusions, strong viewpoint and appearance variations, and asynchronous video streams. To solve this problem, reliable association of the same person across distant viewpoints and temporal instances is essential. We present a self-supervised framework…

    Submitted 18 April, 2020; v1 submitted 22 May, 2018; originally announced May 2018.

    Comments: Accepted to IEEE TPAMI

  44. arXiv:1804.05790  [pdf, other

    cs.CV

    Materials for Masses: SVBRDF Acquisition with a Single Mobile Phone Image

    Authors: Zhengqin Li, Kalyan Sunkavalli, Manmohan Chandraker

    Abstract: We propose a material acquisition approach to recover the spatially-varying BRDF and normal map of a near-planar surface from a single image captured by a handheld mobile phone camera. Our method images the surface under arbitrary environment lighting with the flash turned on, thereby avoiding shadows while simultaneously capturing high-frequency specular highlights. We train a CNN to regress an S…

    Submitted 16 April, 2018; originally announced April 2018.

    Comments: submitted to European Conference on Computer Vision

  45. arXiv:1712.01259  [pdf, other

    cs.CV

    A Perceptual Measure for Deep Single Image Camera Calibration

    Authors: Yannick Hold-Geoffroy, Kalyan Sunkavalli, Jonathan Eisenmann, Matt Fisher, Emiliano Gambaretto, Sunil Hadap, Jean-François Lalonde

    Abstract: Most current single image camera calibration methods rely on specific image features or user input, and cannot be applied to natural images captured in uncontrolled settings. We propose directly inferring camera calibration parameters from a single image using a deep convolutional neural network. This network is trained using automatically generated samples from a large-scale panorama dataset, and…

    Submitted 22 April, 2018; v1 submitted 1 December, 2017; originally announced December 2017.

    Comments: Published at CVPR'18

  46. arXiv:1710.06507  [pdf, other

    cs.CV

    Scene Parsing with Global Context Embedding

    Authors: Wei-Chih Hung, Yi-Hsuan Tsai, Xiaohui Shen, Zhe Lin, Kalyan Sunkavalli, Xin Lu, Ming-Hsuan Yang

    Abstract: We present a scene parsing method that utilizes global context information based on both the parametric and non-parametric models. Compared to previous methods that only exploit the local relationship between objects, we train a context network based on scene similarities to generate feature representations for global contexts. In addition, these learned features are utilized to generate global a…

    Submitted 20 October, 2017; v1 submitted 17 October, 2017; originally announced October 2017.

    Comments: Accepted in ICCV'17. Code available at https://github.com/hfslyc/GCPNet

  47. arXiv:1709.10214  [pdf, other

    cs.GR

    Photometric Stabilization for Fast-forward Videos

    Authors: Xuaner Cecilia Zhang, Joon-Young Lee, Kalyan Sunkavalli, Zhaowen Wang

    Abstract: Videos captured by consumer cameras often exhibit temporal variations in color and tone that are caused by camera auto-adjustments like white-balance and exposure. When such videos are sub-sampled to play fast-forward, as in the increasingly popular forms of timelapse and hyperlapse videos, these temporal variations are exacerbated and appear as visually disturbing high frequency flickering. Previ…

    Submitted 28 September, 2017; originally announced September 2017.

    Comments: 9 pages, 11 figures, PG 2017

    ACM Class: I.3.3; I.4.3

  48. arXiv:1704.05564  [pdf, other

    cs.CV

    Illuminant Spectra-based Source Separation Using Flash Photography

    Authors: Zhuo Hui, Kalyan Sunkavalli, Sunil Hadap, Aswin C. Sankaranarayanan

    Abstract: Real-world lighting often consists of multiple illuminants with different spectra. Separating and manipulating these illuminants in post-process is a challenging problem that requires either significant manual input or calibrated scene geometry and lighting. In this work, we leverage a flash/no-flash image pair to analyze and edit scene illuminants based on their spectral differences. We derive a…

    Submitted 26 November, 2017; v1 submitted 18 April, 2017; originally announced April 2017.

  49. arXiv:1704.04131  [pdf, other

    cs.CV

    Neural Face Editing with Intrinsic Image Disentangling

    Authors: Zhixin Shu, Ersin Yumer, Sunil Hadap, Kalyan Sunkavalli, Eli Shechtman, Dimitris Samaras

    Abstract: Traditional face editing methods often require a number of sophisticated and task specific algorithms to be applied one after the other --- a process that is tedious, fragile, and computationally intensive. In this paper, we propose an end-to-end generative adversarial network that infers a face-specific disentangled representation of intrinsic face properties, including shape (i.e. normals), albe…

    Submitted 13 April, 2017; originally announced April 2017.

    Comments: CVPR 2017 oral

  50. arXiv:1704.00090  [pdf, other

    cs.CV cs.GR stat.ML

    Learning to Predict Indoor Illumination from a Single Image

    Authors: Marc-André Gardner, Kalyan Sunkavalli, Ersin Yumer, Xiaohui Shen, Emiliano Gambaretto, Christian Gagné, Jean-François Lalonde

    Abstract: We propose an automatic method to infer high dynamic range illumination from a single, limited field-of-view, low dynamic range photograph of an indoor scene. In contrast to previous work that relies on specialized image capture, user input, and/or simple scene models, we train an end-to-end deep neural network that directly regresses a limited field-of-view photo to HDR illumination, without stro…

    Submitted 21 November, 2017; v1 submitted 31 March, 2017; originally announced April 2017.