Showing 1–15 of 15 results for author: Goldman, D B

  1. arXiv:2405.11098  [pdf, other]

    cs.GR

    Generative AI for 2D Character Animation

    Authors: Jaime Guajardo, Ozgun Bursalioglu, Dan B Goldman

    Abstract: In this pilot project, we teamed up with artists to develop new workflows for 2D animation while producing a short educational cartoon. We identified several workflows to streamline the animation process, bringing the artists' vision to the screen more effectively.

    Submitted 21 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    Comments: Project page: https://genai-2d-character-animation.github.io/

  2. arXiv:2312.03869  [pdf, other]

    cs.CV

    Inpaint3D: 3D Scene Content Generation using 2D Inpainting Diffusion

    Authors: Kira Prabhu, Jane Wu, Lynn Tsai, Peter Hedman, Dan B Goldman, Ben Poole, Michael Broxton

    Abstract: This paper presents a novel approach to inpainting 3D regions of a scene, given masked multi-view images, by distilling a 2D diffusion model into a learned 3D scene representation (e.g. a NeRF). Unlike 3D generative methods that explicitly condition the diffusion model on camera pose or multi-view information, our diffusion model is conditioned only on a single masked 2D image. Nevertheless, we sh…

    Submitted 6 December, 2023; originally announced December 2023.

  3. arXiv:2312.02150  [pdf, other]

    cs.CV

    Readout Guidance: Learning Control from Diffusion Features

    Authors: Grace Luo, Trevor Darrell, Oliver Wang, Dan B Goldman, Aleksander Holynski

    Abstract: We present Readout Guidance, a method for controlling text-to-image diffusion models with learned signals. Readout Guidance uses readout heads, lightweight networks trained to extract signals from the features of a pre-trained, frozen diffusion model at every timestep. These readouts can encode single-image properties, such as pose, depth, and edges; or higher-order properties that relate multiple…

    Submitted 2 April, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: CVPR 2024

  4. arXiv:2106.13228  [pdf, other]

    cs.CV cs.GR

    HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields

    Authors: Keunhong Park, Utkarsh Sinha, Peter Hedman, Jonathan T. Barron, Sofien Bouaziz, Dan B Goldman, Ricardo Martin-Brualla, Steven M. Seitz

    Abstract: Neural Radiance Fields (NeRF) are able to reconstruct scenes with unprecedented fidelity, and various recent works have extended NeRF to handle dynamic scenes. A common approach to reconstruct such non-rigid scenes is through the use of a learned deformation field mapping from coordinates in each input image into a canonical template coordinate space. However, these deformation-based approaches st…

    Submitted 10 September, 2021; v1 submitted 24 June, 2021; originally announced June 2021.

    Comments: SIGGRAPH Asia 2021, Project page: https://hypernerf.github.io/

  5. arXiv:2104.04532  [pdf, other]

    cs.CV

    Neural RGB-D Surface Reconstruction

    Authors: Dejan Azinović, Ricardo Martin-Brualla, Dan B Goldman, Matthias Nießner, Justus Thies

    Abstract: Obtaining high-quality 3D reconstructions of room-scale scenes is of paramount importance for upcoming applications in AR or VR. These range from mixed reality applications for teleconferencing, virtual measuring, virtual room planning, to robotic applications. While current volume-based view synthesis methods that use neural radiance fields (NeRFs) show promising results in reproducing the appeara…

    Submitted 14 March, 2022; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: CVPR'22; Project page: https://dazinovic.github.io/neural-rgbd-surface-reconstruction/ Video: https://youtu.be/iWuSowPsC3g

  6. arXiv:2011.12948  [pdf, other]

    cs.CV cs.GR

    Nerfies: Deformable Neural Radiance Fields

    Authors: Keunhong Park, Utkarsh Sinha, Jonathan T. Barron, Sofien Bouaziz, Dan B Goldman, Steven M. Seitz, Ricardo Martin-Brualla

    Abstract: We present the first method capable of photorealistically reconstructing deformable scenes using photos/videos captured casually from mobile phones. Our approach augments neural radiance fields (NeRF) by optimizing an additional continuous volumetric deformation field that warps each observed point into a canonical 5D NeRF. We observe that these NeRF-like deformation fields are prone to local mini…

    Submitted 9 September, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

    Comments: ICCV 2021, Project page with videos: https://nerfies.github.io/

  7. arXiv:2008.04852  [pdf, other]

    cs.CV cs.GR cs.LG

    GeLaTO: Generative Latent Textured Objects

    Authors: Ricardo Martin-Brualla, Rohit Pandey, Sofien Bouaziz, Matthew Brown, Dan B Goldman

    Abstract: Accurate modeling of 3D objects exhibiting transparency, reflections and thin structures is an extremely challenging problem. Inspired by billboards and geometric proxies used in computer graphics, this paper proposes Generative Latent Textured Objects (GeLaTO), a compact representation that combines a set of coarse shape proxies defining low frequency geometry with learned neural textures, to enc…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: ECCV 2020 Spotlight. Project website: https://gelato-paper.github.io

    Journal ref: European Conference on Computer Vision 2020

  8. arXiv:2004.03805  [pdf, other]

    cs.CV cs.GR

    State of the Art on Neural Rendering

    Authors: Ayush Tewari, Ohad Fried, Justus Thies, Vincent Sitzmann, Stephen Lombardi, Kalyan Sunkavalli, Ricardo Martin-Brualla, Tomas Simon, Jason Saragih, Matthias Nießner, Rohit Pandey, Sean Fanello, Gordon Wetzstein, Jun-Yan Zhu, Christian Theobalt, Maneesh Agrawala, Eli Shechtman, Dan B Goldman, Michael Zollhöfer

    Abstract: Efficient rendering of photo-realistic virtual worlds is a long standing effort of computer graphics. Modern graphics techniques have succeeded in synthesizing photo-realistic images from hand-crafted scene representations. However, the automatic generation of shape, materials, lighting, and other aspects of scenes remains a challenging problem that, if solved, would make photo-realistic computer…

    Submitted 8 April, 2020; originally announced April 2020.

    Comments: Eurographics 2020 survey paper

  9. arXiv:1906.01524  [pdf, other]

    cs.CV cs.GR cs.LG

    Text-based Editing of Talking-head Video

    Authors: Ohad Fried, Ayush Tewari, Michael Zollhöfer, Adam Finkelstein, Eli Shechtman, Dan B Goldman, Kyle Genova, Zeyu Jin, Christian Theobalt, Maneesh Agrawala

    Abstract: Editing talking-head video to change the speech content or to remove filler words is challenging. We propose a novel method to edit talking-head video based on its transcript to produce a realistic output video in which the dialogue of the speaker has been modified, while maintaining a seamless audio-visual flow (i.e. no jump cuts). Our method automatically annotates an input talking-head video wi…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: A version with higher resolution images can be downloaded from the authors' website

  10. arXiv:1904.04290  [pdf, other]

    cs.CV cs.GR

    Neural Rerendering in the Wild

    Authors: Moustafa Meshry, Dan B Goldman, Sameh Khamis, Hugues Hoppe, Rohit Pandey, Noah Snavely, Ricardo Martin-Brualla

    Abstract: We explore total scene capture -- recording, modeling, and rerendering a scene under varying appearance such as season and time of day. Starting from internet photos of a tourist landmark, we apply traditional 3D reconstruction to register the photos and approximate the scene as a point cloud. For each photo, we render the scene points into a deep framebuffer, and train a neural network to learn t…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: To be presented at CVPR 2019 (oral). Supplementary video available at http://youtu.be/E1crWQn_kmY

  11. arXiv:1811.05029  [pdf, other]

    cs.CV

    LookinGood: Enhancing Performance Capture with Real-time Neural Re-Rendering

    Authors: Ricardo Martin-Brualla, Rohit Pandey, Shuoran Yang, Pavel Pidlypenskyi, Jonathan Taylor, Julien Valentin, Sameh Khamis, Philip Davidson, Anastasia Tkach, Peter Lincoln, Adarsh Kowdle, Christoph Rhemann, Dan B Goldman, Cem Keskin, Steve Seitz, Shahram Izadi, Sean Fanello

    Abstract: Motivated by augmented and virtual reality applications such as telepresence, there has been a recent focus in real-time performance capture of humans under motion. However, given the real-time constraint, these systems often suffer from artifacts in geometry and texture such as holes and noise in the final rendering, poor lighting, and low-resolution textures. We take the novel approach to augmen…

    Submitted 12 November, 2018; originally announced November 2018.

    Comments: The supplementary video is available at: http://youtu.be/Md3tdAKoLGU To be presented at SIGGRAPH Asia 2018

  12. arXiv:1611.09961  [pdf, other]

    cs.CV

    Semantic Facial Expression Editing using Autoencoded Flow

    Authors: Raymond Yeh, Ziwei Liu, Dan B Goldman, Aseem Agarwala

    Abstract: High-level manipulation of facial expressions in images --- such as changing a smile to a neutral expression --- is challenging because facial expression changes are highly non-linear, and vary depending on the appearance of the face. We present a fully automatic approach to editing faces that combines the advantages of flow-based face manipulation with the more recent generative capabilities of V…

    Submitted 29 November, 2016; originally announced November 2016.

  13. arXiv:1610.01691  [pdf, other]

    cs.GR cs.RO

    Towards a Drone Cinematographer: Guiding Quadrotor Cameras using Visual Composition Principles

    Authors: Niels Joubert, Jane L. E, Dan B Goldman, Floraine Berthouzoz, Mike Roberts, James A. Landay, Pat Hanrahan

    Abstract: We present a system to capture video footage of human subjects in the real world. Our system leverages a quadrotor camera to automatically capture well-composed video of two subjects. Subjects are tracked in a large-scale outdoor environment using RTK GPS and IMU sensors. Then, given the tracked state of our subjects, our system automatically computes static shots based on well-established visual…

    Submitted 5 October, 2016; originally announced October 2016.

  14. arXiv:1502.01954  [pdf, other]

    cs.GR cs.CG

    Interactive 3D Face Stylization Using Sculptural Abstraction

    Authors: Jan Jachnik, Dan B Goldman, Linjie Luo, Andrew J. Davison

    Abstract: Sculptors often deviate from geometric accuracy in order to enhance the appearance of their sculpture. These subtle stylizations may emphasize anatomy, draw the viewer's focus to characteristic features of the subject, or symbolize textures that might not be accurately reproduced in a particular sculptural medium, while still retaining fidelity to the unique proportions of an individual. In this w…

    Submitted 6 February, 2015; originally announced February 2015.

    Comments: 9 pages, 16 figures. For associated video and to download some of the results, see the project webpage at http://wp.doc.ic.ac.uk/robotvision/project/face-stylization/

  15. arXiv:1204.3367  [pdf, other]

    cs.SI cs.HC

    Crowdsourcing Gaze Data Collection

    Authors: Dmitry Rudoy, Dan B. Goldman, Eli Shechtman, Lihi Zelnik-Manor

    Abstract: Knowing where people look is a useful tool in many various image and video applications. However, traditional gaze tracking hardware is expensive and requires local study participants, so acquiring gaze location data from a large number of participants is very problematic. In this work we propose a crowdsourced method for acquisition of gaze direction data from a virtually unlimited number of part…

    Submitted 16 April, 2012; originally announced April 2012.

    Comments: Presented at Collective Intelligence conference, 2012 (arXiv:1204.2991)

    Report number: CollectiveIntelligence/2012/106