Showing 1–20 of 20 results for author: Bhattad, A

  1. arXiv:2405.21074  [pdf, other]

    cs.CV

    Latent Intrinsics Emerge from Training to Relight

    Authors: Xiao Zhang, William Gao, Seemandhar Jain, Michael Maire, David A. Forsyth, Anand Bhattad

    Abstract: Image relighting is the task of showing what a scene from a source image would look like if illuminated differently. Inverse graphics schemes recover an explicit representation of geometry and a set of chosen intrinsics, then relight with some form of renderer. However, error control for inverse graphics is difficult, and inverse graphics methods can represent only the effects of the chosen intrins…

    Submitted 31 May, 2024; originally announced May 2024.

  2. arXiv:2403.14617  [pdf, other]

    cs.CV cs.AI cs.LG

    Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion

    Authors: Xiang Fan, Anand Bhattad, Ranjay Krishna

    Abstract: We introduce Videoshop, a training-free video editing algorithm for localized semantic edits. Videoshop allows users to use any editing software, including Photoshop and generative inpainting, to modify the first frame; it automatically propagates those changes, with semantic, spatial, and temporally consistent motion, to the remaining frames. Unlike existing methods that enable edits only through…

    Submitted 22 March, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: Project page at https://videoshop-editing.github.io/

  3. arXiv:2311.17138  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Shadows Don't Lie and Lines Can't Bend! Generative Models don't know Projective Geometry...for now

    Authors: Ayush Sarkar, Hanlin Mai, Amitabh Mahapatra, Svetlana Lazebnik, D. A. Forsyth, Anand Bhattad

    Abstract: Generative models can produce impressively realistic images. This paper demonstrates that generated images have geometric features different from those of real images. We build a set of collections of generated images, prequalified to fool simple, signal-based classifiers into believing they are real. We then show that prequalified generated images can be identified reliably by classifiers that on…

    Submitted 30 May, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Project Page: https://projective-geometry.github.io | First three authors contributed equally

  4. arXiv:2311.17137  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Intrinsic LoRA: A Generalist Approach for Discovering Knowledge in Generative Models

    Authors: Xiaodan Du, Nicholas Kolkin, Greg Shakhnarovich, Anand Bhattad

    Abstract: Generative models excel at creating images that closely mimic real scenes, suggesting they inherently encode scene representations. We introduce Intrinsic LoRA (I-LoRA), a general approach that uses Low-Rank Adaptation (LoRA) to discover scene intrinsics such as normals, depth, albedo, and shading from a wide array of generative models. I-LoRA is lightweight, adding minimally to the model's parame…

    Submitted 23 June, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: https://intrinsic-lora.github.io/

  5. arXiv:2309.16646  [pdf, other]

    cs.CV

    Improving Equivariance in State-of-the-Art Supervised Depth and Normal Predictors

    Authors: Yuanyi Zhong, Anand Bhattad, Yu-Xiong Wang, David Forsyth

    Abstract: Dense depth and surface normal predictors should possess the equivariant property to cropping-and-resizing -- cropping the input image should result in cropping the same output image. However, we find that state-of-the-art depth and normal predictors, despite having strong performances, surprisingly do not respect equivariance. The problem exists even when crop-and-resize data augmentation is empl…

    Submitted 17 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: ICCV 2023

  6. arXiv:2307.11073  [pdf, other]

    cs.CV cs.AI cs.GR

    OBJECT 3DIT: Language-guided 3D-aware Image Editing

    Authors: Oscar Michel, Anand Bhattad, Eli VanderBilt, Ranjay Krishna, Aniruddha Kembhavi, Tanmay Gupta

    Abstract: Existing image editing tools, while powerful, typically disregard the underlying 3D geometry from which the image is projected. As a result, edits made using these tools may become detached from the geometry and lighting conditions that are at the foundation of the image formation process. In this work, we formulate the new task of language-guided 3D-aware editing, where objects in an image should…

    Submitted 20 July, 2023; originally announced July 2023.

  7. arXiv:2307.03847  [pdf, other]

    cs.CV

    Blocks2World: Controlling Realistic Scenes with Editable Primitives

    Authors: Vaibhav Vavilala, Seemandhar Jain, Rahul Vasanth, Anand Bhattad, David Forsyth

    Abstract: We present Blocks2World, a novel method for 3D scene rendering and editing that leverages a two-step process: convex decomposition of images and conditioned synthesis. Our technique begins by extracting 3D parallelepipeds from various objects in a given scene using convex decomposition, thus obtaining a primitive representation of the scene. These primitives are then utilized to generate paired da…

    Submitted 13 July, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: 16 pages, 15 figures

  8. arXiv:2306.15128  [pdf, other]

    cs.CV cs.AI cs.LG

    MIMIC: Masked Image Modeling with Image Correspondences

    Authors: Kalyani Marathe, Mahtab Bigverdi, Nishat Khan, Tuhin Kundu, Patrick Howe, Sharan Ranjit S, Anand Bhattad, Aniruddha Kembhavi, Linda G. Shapiro, Ranjay Krishna

    Abstract: Dense pixel-specific representation learning at scale has been bottlenecked due to the unavailability of large-scale multi-view datasets. Current methods for building effective pretraining datasets heavily rely on annotated 3D meshes, point clouds, and camera parameters from simulated environments, preventing them from building datasets from real-world data sources where such metadata is lacking.…

    Submitted 15 May, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

  9. arXiv:2306.09349  [pdf, other]

    cs.CV

    UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video

    Authors: Zhi-Hao Lin, Bohan Liu, Yi-Ting Chen, David Forsyth, Jia-Bin Huang, Anand Bhattad, Shenlong Wang

    Abstract: We show how to build a model that allows realistic, free-viewpoint renderings of a scene under novel lighting conditions from video. Our method -- UrbanIR: Urban Scene Inverse Rendering -- computes an inverse graphics representation from the video. UrbanIR jointly infers shape, albedo, visibility, and sun and sky illumination from a single video of unbounded outdoor scenes with unknown lighting. U…

    Submitted 15 June, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: https://urbaninverserendering.github.io/

  10. arXiv:2306.00987  [pdf, other]

    cs.CV cs.GR cs.LG

    StyleGAN knows Normal, Depth, Albedo, and More

    Authors: Anand Bhattad, Daniel McKee, Derek Hoiem, D. A. Forsyth

    Abstract: Intrinsic images, in the original sense, are image-like maps of scene properties like depth, normal, albedo or shading. This paper demonstrates that StyleGAN can easily be induced to produce intrinsic images. The procedure is straightforward. We show that, if StyleGAN produces $G({w})$ from latents ${w}$, then for each type of intrinsic image, there is a fixed offset ${d}_c$ so that…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: Beyond Image Generation: StyleGAN knows Normals, Depth, Albedo, Shading, Segmentation and perhaps more!

  11. arXiv:2304.14403  [pdf, other]

    cs.CV cs.GR cs.LG

    Make It So: Steering StyleGAN for Any Image Inversion and Editing

    Authors: Anand Bhattad, Viraj Shah, Derek Hoiem, D. A. Forsyth

    Abstract: StyleGAN's disentangled style representation enables powerful image editing by manipulating the latent variables, but accurately mapping real-world images to their latent variables (GAN inversion) remains a challenge. Existing GAN inversion methods struggle to maintain editing directions and produce realistic results. To address these limitations, we propose Make It So, a novel GAN inversion met…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: project: https://anandbhattad.github.io/makeitso/

  12. arXiv:2205.10351  [pdf, other]

    cs.CV

    StyLitGAN: Prompting StyleGAN to Produce New Illumination Conditions

    Authors: Anand Bhattad, D. A. Forsyth

    Abstract: We propose a novel method, StyLitGAN, for relighting and resurfacing generated images in the absence of labeled data. Our approach generates images with realistic lighting effects, including cast shadows, soft shadows, inter-reflections, and glossy effects, without the need for paired or CGI data. StyLitGAN uses an intrinsic image method to decompose an image, followed by a search of the latent…

    Submitted 1 May, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: https://anandbhattad.github.io/stylitgan/

  13. arXiv:2112.04497  [pdf, other]

    cs.CV

    SIRfyN: Single Image Relighting from your Neighbors

    Authors: D. A. Forsyth, Anand Bhattad, Pranav Asthana, Yuanyi Zhong, Yuxiong Wang

    Abstract: We show how to relight a scene, depicted in a single image, such that (a) the overall shading has changed and (b) the resulting image looks like a natural image of that scene. Applications for such a procedure include generating training data and building authoring environments. Naive methods for doing this fail. One reason is that shading and albedo are quite strongly related; for example, sharp…

    Submitted 8 December, 2021; originally announced December 2021.

  14. arXiv:2111.10427  [pdf, other]

    cs.CV

    DIVeR: Real-time and Accurate Neural Radiance Fields with Deterministic Integration for Volume Rendering

    Authors: Liwen Wu, Jae Yong Lee, Anand Bhattad, Yuxiong Wang, David Forsyth

    Abstract: DIVeR builds on the key ideas of NeRF and its variants -- density models and volume rendering -- to learn 3D object models that can be rendered realistically from small numbers of images. In contrast to all previous NeRF methods, DIVeR uses deterministic rather than stochastic estimates of the volume rendering integral. DIVeR's representation is a voxel based field of features. To compute the volu…

    Submitted 18 May, 2022; v1 submitted 19 November, 2021; originally announced November 2021.

  15. arXiv:2106.06533  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    View Generalization for Single Image Textured 3D Models

    Authors: Anand Bhattad, Aysegul Dundar, Guilin Liu, Andrew Tao, Bryan Catanzaro

    Abstract: Humans can easily infer the underlying 3D geometry and texture of an object only from a single 2D image. Current computer vision methods can do this, too, but suffer from view generalization problems - the models inferred tend to make poor predictions of appearance in novel views. As for generalization problems in machine learning, the difficulty is balancing single-view accuracy (cf. training err…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: CVPR 2021. Project website: https://nv-adlr.github.io/view-generalization

  16. arXiv:2010.05907  [pdf, other]

    cs.CV cs.GR cs.LG

    Cut-and-Paste Object Insertion by Enabling Deep Image Prior for Reshading

    Authors: Anand Bhattad, David A. Forsyth

    Abstract: We show how to insert an object from one image to another and get realistic results in the hard case, where the shading of the inserted object clashes with the shading of the scene. Rendering objects using an illumination model of the scene doesn't work, because doing so requires a geometric and material model of the object, which is hard to recover from a single image. In this paper, we introduce…

    Submitted 13 September, 2022; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: 3DV 2022

  17. arXiv:1910.09447  [pdf, other]

    cs.CV

    Improving Style Transfer with Calibrated Metrics

    Authors: Mao-Chuang Yeh, Shuai Tang, Anand Bhattad, Chuhang Zou, David Forsyth

    Abstract: Style transfer methods produce a transferred image which is a rendering of a content image in the manner of a style image. We seek to understand how to improve style transfer. To do so requires quantitative evaluation procedures, but the current evaluation is qualitative, mostly involving user studies. We describe a novel quantitative evaluation procedure. Our procedure relies on two statistics:…

    Submitted 13 February, 2020; v1 submitted 21 October, 2019; originally announced October 2019.

    Comments: updated conference camera ready version. arXiv admin note: text overlap with arXiv:1804.00118

  18. arXiv:1904.06347  [pdf, other]

    cs.CV

    Unrestricted Adversarial Examples via Semantic Manipulation

    Authors: Anand Bhattad, Min Jin Chong, Kaizhao Liang, Bo Li, D. A. Forsyth

    Abstract: Machine learning models, especially deep neural networks (DNNs), have been shown to be vulnerable against adversarial examples which are carefully crafted samples with a small magnitude of the perturbation. Such adversarial perturbations are usually restricted by bounding their $\mathcal{L}_p$ norm such that they are imperceptible, and thus many current defenses can exploit this property to reduce…

    Submitted 20 March, 2020; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: Accepted to ICLR 2020. First two authors contributed equally. Code: https://github.com/aisecure/Big-but-Invisible-Adversarial-Attack and Openreview: https://openreview.net/forum?id=Sye_OgHFwH

  19. arXiv:1804.00118  [pdf, other]

    cs.CV

    Quantitative Evaluation of Style Transfer

    Authors: Mao-Chuang Yeh, Shuai Tang, Anand Bhattad, D. A. Forsyth

    Abstract: Style transfer methods produce a transferred image which is a rendering of a content image in the manner of a style image. There is a rich literature of variant methods. However, evaluation procedures are qualitative, mostly involving user studies. We describe a novel quantitative evaluation procedure. One plots effectiveness (a measure of the extent to which the style was transferred) against coh…

    Submitted 31 March, 2018; originally announced April 2018.

    Comments: 30 pages, including supplementary

  20. arXiv:1802.05798  [pdf, other]

    cs.CV

    Detecting Anomalous Faces with 'No Peeking' Autoencoders

    Authors: Anand Bhattad, Jason Rock, David Forsyth

    Abstract: Detecting anomalous faces has important applications. For example, a system might tell when a train driver is incapacitated by a medical event, and assist in adopting a safe recovery strategy. These applications are demanding, because they require accurate detection of rare anomalies that may be seen only at runtime. Such a setting causes supervised methods to perform poorly. We describe a method…

    Submitted 15 February, 2018; originally announced February 2018.