Showing 1–50 of 63 results for author: Forsyth, D

  1. arXiv:2405.21074  [pdf, other]

    cs.CV

    Latent Intrinsics Emerge from Training to Relight

    Authors: Xiao Zhang, William Gao, Seemandhar Jain, Michael Maire, David A. Forsyth, Anand Bhattad

    Abstract: Image relighting is the task of showing what a scene from a source image would look like if illuminated differently. Inverse graphics schemes recover an explicit representation of geometry and a set of chosen intrinsics, then relight with some form of renderer. However error control for inverse graphics is difficult, and inverse graphics methods can represent only the effects of the chosen intrins… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

  2. arXiv:2405.19569  [pdf, other]

    cs.CV

    Improved Convex Decomposition with Ensembling and Boolean Primitives

    Authors: Vaibhav Vavilala, Florian Kluger, Seemandhar Jain, Bodo Rosenhahn, David Forsyth

    Abstract: Describing a scene in terms of primitives -- geometrically simple shapes that offer a parsimonious but accurate abstraction of structure -- is an established vision problem. This is a good model of a difficult fitting problem: different scenes require different numbers of primitives and primitives interact strongly, but any proposed solution can be evaluated at inference time. The state of the art… ▽ More

    Submitted 9 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: 18 pages, 9 figures, 7 tables

  3. arXiv:2404.00491  [pdf, other]

    cs.CV

    Denoising Monte Carlo Renders With Diffusion Models

    Authors: Vaibhav Vavilala, Rahul Vasanth, David Forsyth

    Abstract: Physically-based renderings contain Monte-Carlo noise, with variance that increases as the number of rays per pixel decreases. This noise, while zero-mean for good modern renderers, can have heavy tails (most notably, for scenes containing specular or refractive objects). Learned methods for restoring low fidelity renders are highly developed, because suppressing render noise means one can save co… ▽ More

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 14 pages, 12 figures

  4. arXiv:2403.13951  [pdf, other]

    cs.CV cs.AI

    ACDG-VTON: Accurate and Contained Diffusion Generation for Virtual Try-On

    Authors: Jeffrey Zhang, Kedan Li, Shao-Yu Chang, David Forsyth

    Abstract: Virtual Try-on (VTON) involves generating images of a person wearing selected garments. Diffusion-based methods, in particular, can create high-quality images, but they struggle to maintain the identities of the input garments. We identified this problem stems from the specifics in the training formulation for diffusion. To address this, we propose a unique training scheme that limits the scope in… ▽ More

    Submitted 20 March, 2024; originally announced March 2024.

  5. arXiv:2402.04249  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

    Authors: Mantas Mazeika, Long Phan, Xuwang Yin, Andy Zou, Zifan Wang, Norman Mu, Elham Sakhaee, Nathaniel Li, Steven Basart, Bo Li, David Forsyth, Dan Hendrycks

    Abstract: Automated red teaming holds substantial promise for uncovering and mitigating the risks associated with the malicious use of large language models (LLMs), yet the field lacks a standardized evaluation framework to rigorously assess new methods. To address this issue, we introduce HarmBench, a standardized evaluation framework for automated red teaming. We identify several desirable properties prev… ▽ More

    Submitted 26 February, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Website: https://www.harmbench.org

  6. arXiv:2401.02097  [pdf, other]

    cs.CV

    Preserving Image Properties Through Initializations in Diffusion Models

    Authors: Jeffrey Zhang, Shao-Yu Chang, Kedan Li, David Forsyth

    Abstract: Retail photography imposes specific requirements on images. For instance, images may need uniform background colors, consistent model poses, centered products, and consistent lighting. Minor deviations from these standards impact a site's aesthetic appeal, making the images unsuitable for use. We show that Stable Diffusion methods, as currently applied, do not respect these requirements. The usual… ▽ More

    Submitted 4 January, 2024; originally announced January 2024.

  7. arXiv:2311.17138  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    Shadows Don't Lie and Lines Can't Bend! Generative Models don't know Projective Geometry...for now

    Authors: Ayush Sarkar, Hanlin Mai, Amitabh Mahapatra, Svetlana Lazebnik, D. A. Forsyth, Anand Bhattad

    Abstract: Generative models can produce impressively realistic images. This paper demonstrates that generated images have geometric features different from those of real images. We build a set of collections of generated images, prequalified to fool simple, signal-based classifiers into believing they are real. We then show that prequalified generated images can be identified reliably by classifiers that on… ▽ More

    Submitted 30 May, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: Project Page: https://projective-geometry.github.io | First three authors contributed equally

  8. arXiv:2309.16646  [pdf, other]

    cs.CV

    Improving Equivariance in State-of-the-Art Supervised Depth and Normal Predictors

    Authors: Yuanyi Zhong, Anand Bhattad, Yu-Xiong Wang, David Forsyth

    Abstract: Dense depth and surface normal predictors should possess the equivariant property to cropping-and-resizing -- cropping the input image should result in cropping the same output image. However, we find that state-of-the-art depth and normal predictors, despite having strong performances, surprisingly do not respect equivariance. The problem exists even when crop-and-resize data augmentation is empl… ▽ More

    Submitted 17 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: ICCV 2023

  9. arXiv:2307.04246  [pdf, other]

    cs.CV

    Convex Decomposition of Indoor Scenes

    Authors: Vaibhav Vavilala, David Forsyth

    Abstract: We describe a method to parse a complex, cluttered indoor scene into primitives which offer a parsimonious abstraction of scene structure. Our primitives are simple convexes. Our method uses a learned regression procedure to parse a scene into a fixed number of convexes from RGBD input, and can optionally accept segmentations to improve the decomposition. The result is then polished with a descent… ▽ More

    Submitted 15 August, 2023; v1 submitted 9 July, 2023; originally announced July 2023.

    Comments: 18 pages, 12 figures

  10. arXiv:2307.03847  [pdf, other]

    cs.CV

    Blocks2World: Controlling Realistic Scenes with Editable Primitives

    Authors: Vaibhav Vavilala, Seemandhar Jain, Rahul Vasanth, Anand Bhattad, David Forsyth

    Abstract: We present Blocks2World, a novel method for 3D scene rendering and editing that leverages a two-step process: convex decomposition of images and conditioned synthesis. Our technique begins by extracting 3D parallelepipeds from various objects in a given scene using convex decomposition, thus obtaining a primitive representation of the scene. These primitives are then utilized to generate paired da… ▽ More

    Submitted 13 July, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: 16 pages, 15 figures

  11. arXiv:2307.02698  [pdf, other]

    cs.CV

    Applying a Color Palette with Local Control using Diffusion Models

    Authors: Vaibhav Vavilala, David Forsyth

    Abstract: We demonstrate two novel editing procedures in the context of fantasy art. Palette transfer applies a specified reference palette to a given image. For fantasy art, the desired change in palette can be very large, leading to huge changes in the ``look'' of the art. We show that a pipeline of vector quantization; matching; and ``dequantization'' (using a diffusion model) produces successful extreme… ▽ More

    Submitted 2 September, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: 14 pages, 14 figures

  12. arXiv:2307.02106  [pdf, other]

    cs.CR cs.DB cs.LG

    SoK: Privacy-Preserving Data Synthesis

    Authors: Yuzheng Hu, Fan Wu, Qinbin Li, Yunhui Long, Gonzalo Munilla Garrido, Chang Ge, Bolin Ding, David Forsyth, Bo Li, Dawn Song

    Abstract: As the prevalence of data analysis grows, safeguarding data privacy has become a paramount concern. Consequently, there has been an upsurge in the development of mechanisms aimed at privacy-preserving data analyses. However, these approaches are task-specific; designing algorithms for new tasks is a cumbersome process. As an alternative, one can create synthetic data that is (ideally) devoid of pr… ▽ More

    Submitted 5 August, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: Accepted at IEEE S&P (Oakland) 2024

  13. arXiv:2306.09349  [pdf, other]

    cs.CV

    UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video

    Authors: Zhi-Hao Lin, Bohan Liu, Yi-Ting Chen, David Forsyth, Jia-Bin Huang, Anand Bhattad, Shenlong Wang

    Abstract: We show how to build a model that allows realistic, free-viewpoint renderings of a scene under novel lighting conditions from video. Our method -- UrbanIR: Urban Scene Inverse Rendering -- computes an inverse graphics representation from the video. UrbanIR jointly infers shape, albedo, visibility, and sun and sky illumination from a single video of unbounded outdoor scenes with unknown lighting. U… ▽ More

    Submitted 15 June, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: https://urbaninverserendering.github.io/

  14. arXiv:2306.08807  [pdf, other]

    cs.RO

    Sim-on-Wheels: Physical World in the Loop Simulation for Self-Driving

    Authors: Yuan Shen, Bhargav Chandaka, Zhi-hao Lin, Albert Zhai, Hang Cui, David Forsyth, Shenlong Wang

    Abstract: We present Sim-on-Wheels, a safe, realistic, and vehicle-in-loop framework to test autonomous vehicles' performance in the real world under safety-critical scenarios. Sim-on-wheels runs on a self-driving vehicle operating in the physical world. It creates virtual traffic participants with risky behaviors and seamlessly inserts the virtual events into images perceived from the physical world in rea… ▽ More

    Submitted 14 June, 2023; originally announced June 2023.

  15. arXiv:2306.00987  [pdf, other]

    cs.CV cs.GR cs.LG

    StyleGAN knows Normal, Depth, Albedo, and More

    Authors: Anand Bhattad, Daniel McKee, Derek Hoiem, D. A. Forsyth

    Abstract: Intrinsic images, in the original sense, are image-like maps of scene properties like depth, normal, albedo or shading. This paper demonstrates that StyleGAN can easily be induced to produce intrinsic images. The procedure is straightforward. We show that, if StyleGAN produces $G({w})$ from latents ${w}$, then for each type of intrinsic image, there is a fixed offset ${d}_c$ so that… ▽ More

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: Beyond Image Generation: StyleGAN knows Normals, Depth, Albedo, Shading, Segmentation and perhaps more!

  16. arXiv:2304.14403  [pdf, other]

    cs.CV cs.GR cs.LG

    Make It So: Steering StyleGAN for Any Image Inversion and Editing

    Authors: Anand Bhattad, Viraj Shah, Derek Hoiem, D. A. Forsyth

    Abstract: StyleGAN's disentangled style representation enables powerful image editing by manipulating the latent variables, but accurately mapping real-world images to their latent variables (GAN inversion) remains a challenge. Existing GAN inversion methods struggle to maintain editing directions and produce realistic results. To address these limitations, we propose Make It So, a novel GAN inversion met… ▽ More

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: project: https://anandbhattad.github.io/makeitso/

  17. arXiv:2211.16989  [pdf, other]

    cs.GR cs.CV

    Wearing the Same Outfit in Different Ways -- A Controllable Virtual Try-on Method

    Authors: Kedan Li, Jeffrey Zhang, Shao-Yu Chang, David Forsyth

    Abstract: An outfit visualization method generates an image of a person wearing real garments from images of those garments. Current methods can produce images that look realistic and preserve garment identity, captured in details such as collar, cuffs, texture, hem, and sleeve length. However, no current method can both control how the garment is worn -- including tuck or untuck, opened or closed, high or… ▽ More

    Submitted 28 November, 2022; originally announced November 2022.

  18. arXiv:2211.13226  [pdf, other]

    cs.CV cs.GR

    ClimateNeRF: Extreme Weather Synthesis in Neural Radiance Field

    Authors: Yuan Li, Zhi-Hao Lin, David Forsyth, Jia-Bin Huang, Shenlong Wang

    Abstract: Physical simulations produce excellent predictions of weather effects. Neural radiance fields produce SOTA scene models. We describe a novel NeRF-editing procedure that can fuse physical simulations with NeRF models of scenes, producing realistic movies of physical phenomena in those scenes. Our application -- Climate NeRF -- allows people to visualize what climate change outcomes will do to them.… ▽ More

    Submitted 8 June, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

    Comments: project page: https://climatenerf.github.io/

  19. arXiv:2210.10039  [pdf, other]

    cs.CV cs.CY cs.LG

    How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios

    Authors: Mantas Mazeika, Eric Tang, Andy Zou, Steven Basart, Jun Shern Chan, Dawn Song, David Forsyth, Jacob Steinhardt, Dan Hendrycks

    Abstract: In recent years, deep neural networks have demonstrated increasingly strong abilities to recognize objects and activities in videos. However, as video understanding becomes widely used in real-world applications, a key consideration is developing human-centric systems that understand not only the content of the video but also how it would affect the wellbeing and emotional state of viewers. To fac… ▽ More

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022; datasets available at https://github.com/hendrycks/emodiversity/

  20. arXiv:2206.14157  [pdf, other]

    cs.LG cs.CR

    How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection

    Authors: Mantas Mazeika, Bo Li, David Forsyth

    Abstract: Model stealing attacks present a dilemma for public machine learning APIs. To protect financial investments, companies may be forced to withhold important information about their models that could facilitate theft, including uncertainty estimates and prediction explanations. This compromise is harmful not only to users but also to external transparency. Model stealing defenses seek to resolve this… ▽ More

    Submitted 28 June, 2022; originally announced June 2022.

    Comments: ICML 2022

  21. arXiv:2206.01334  [pdf, other]

    cs.CV

    Long Scale Error Control in Low Light Image and Video Enhancement Using Equivariance

    Authors: Sara Aghajanzadeh, David Forsyth

    Abstract: Image frames obtained in darkness are special. Just multiplying by a constant doesn't restore the image. Shot noise, quantization effects and camera non-linearities mean that colors and relative light levels are estimated poorly. Current methods learn a mapping using real dark-bright image pairs. These are very hard to capture. A recent paper has shown that simulated data pairs produce real improv… ▽ More

    Submitted 2 June, 2022; originally announced June 2022.

  22. arXiv:2205.10351  [pdf, other]

    cs.CV

    StyLitGAN: Prompting StyleGAN to Produce New Illumination Conditions

    Authors: Anand Bhattad, D. A. Forsyth

    Abstract: We propose a novel method, StyLitGAN, for relighting and resurfacing generated images in the absence of labeled data. Our approach generates images with realistic lighting effects, including cast shadows, soft shadows, inter-reflections, and glossy effects, without the need for paired or CGI data. StyLitGAN uses an intrinsic image method to decompose an image, followed by a search of the latent… ▽ More

    Submitted 1 May, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: https://anandbhattad.github.io/stylitgan/

  23. arXiv:2205.08615  [pdf, other]

    cs.CV

    Towards Robust Low Light Image Enhancement

    Authors: Sara Aghajanzadeh, David Forsyth

    Abstract: In this paper, we study the problem of making brighter images from dark images found in the wild. The images are dark because they are taken in dim environments. They suffer from color shifts caused by quantization and from sensor noise. We don't know the true camera response function for such images and they are not RAW. We use a supervised learning method, relying on a straightforward simulation… ▽ More

    Submitted 17 May, 2022; originally announced May 2022.

  24. arXiv:2112.11641  [pdf, other]

    cs.CV

    JoJoGAN: One Shot Face Stylization

    Authors: Min Jin Chong, David Forsyth

    Abstract: A style mapper applies some fixed style to its input images (so, for example, taking faces to cartoons). This paper describes a simple procedure -- JoJoGAN -- to learn a style mapper from a single example of the style. JoJoGAN uses a GAN inversion procedure and StyleGAN's style-mixing property to produce a substantial paired dataset from a single example style. The paired dataset is then used to f… ▽ More

    Submitted 6 March, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

    Comments: code at https://github.com/mchong6/JoJoGAN

  25. arXiv:2112.04497  [pdf, other]

    cs.CV

    SIRfyN: Single Image Relighting from your Neighbors

    Authors: D. A. Forsyth, Anand Bhattad, Pranav Asthana, Yuanyi Zhong, Yuxiong Wang

    Abstract: We show how to relight a scene, depicted in a single image, such that (a) the overall shading has changed and (b) the resulting image looks like a natural image of that scene. Applications for such a procedure include generating training data and building authoring environments. Naive methods for doing this fail. One reason is that shading and albedo are quite strongly related; for example, sharp… ▽ More

    Submitted 8 December, 2021; originally announced December 2021.

  26. arXiv:2111.10427  [pdf, other]

    cs.CV

    DIVeR: Real-time and Accurate Neural Radiance Fields with Deterministic Integration for Volume Rendering

    Authors: Liwen Wu, Jae Yong Lee, Anand Bhattad, Yuxiong Wang, David Forsyth

    Abstract: DIVeR builds on the key ideas of NeRF and its variants -- density models and volume rendering -- to learn 3D object models that can be rendered realistically from small numbers of images. In contrast to all previous NeRF methods, DIVeR uses deterministic rather than stochastic estimates of the volume rendering integral. DIVeR's representation is a voxel based field of features. To compute the volu… ▽ More

    Submitted 18 May, 2022; v1 submitted 19 November, 2021; originally announced November 2021.

  27. arXiv:2111.01619  [pdf, other]

    cs.CV cs.LG

    StyleGAN of All Trades: Image Manipulation with Only Pretrained StyleGAN

    Authors: Min Jin Chong, Hsin-Ying Lee, David Forsyth

    Abstract: Recently, StyleGAN has enabled various image manipulation and editing tasks thanks to the high-quality generation and the disentangled latent space. However, additional architectures or task-specific training paradigms are usually required for different tasks. In this work, we take a deeper look at the spatial properties of StyleGAN. We show that with a pretrained StyleGAN along with some operatio… ▽ More

    Submitted 2 November, 2021; originally announced November 2021.

  28. arXiv:2110.02529  [pdf, other]

    cs.CV cs.AI cs.LG

    On the Importance of Firth Bias Reduction in Few-Shot Classification

    Authors: Saba Ghaffari, Ehsan Saleh, David Forsyth, Yu-xiong Wang

    Abstract: Learning accurate classifiers for novel categories from very few examples, known as few-shot image classification, is a challenging task in statistical machine learning and computer vision. The performance in few-shot classification suffers from the bias in the estimation of classifier parameters; however, an effective underlying bias reduction technique that could alleviate this issue in training… ▽ More

    Submitted 14 April, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

  29. arXiv:2108.13459  [pdf, other]

    cs.CV cs.GR

    LSD-StructureNet: Modeling Levels of Structural Detail in 3D Part Hierarchies

    Authors: Dominic Roberts, Ara Danielyan, Hang Chu, Mani Golparvar-Fard, David Forsyth

    Abstract: Generative models for 3D shapes represented by hierarchies of parts can generate realistic and diverse sets of outputs. However, existing models suffer from the key practical limitation of modelling shapes holistically and thus cannot perform conditional sampling, i.e. they are not able to generate variants on individual parts of generated shapes without modifying the rest of the shape. This is li… ▽ More

    Submitted 7 September, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

    Comments: accepted by ICCV 2021

  30. arXiv:2108.08922  [pdf, other]

    cs.CV

    Controlled GAN-Based Creature Synthesis via a Challenging Game Art Dataset -- Addressing the Noise-Latent Trade-Off

    Authors: Vaibhav Vavilala, David Forsyth

    Abstract: The state-of-the-art StyleGAN2 network supports powerful methods to create and edit art, including generating random images, finding images "like" some query, and modifying content or style. Further, recent advancements enable training with small datasets. We apply these methods to synthesize card art, by training on a novel Yu-Gi-Oh dataset. While noise inputs to StyleGAN2 are essential for good… ▽ More

    Submitted 20 October, 2021; v1 submitted 19 August, 2021; originally announced August 2021.

    Comments: 10 pages, 10 figures

  31. arXiv:2107.06256  [pdf, other]

    cs.CV cs.LG

    Retrieve in Style: Unsupervised Facial Feature Transfer and Retrieval

    Authors: Min Jin Chong, Wen-Sheng Chu, Abhishek Kumar, David Forsyth

    Abstract: We present Retrieve in Style (RIS), an unsupervised framework for facial feature transfer and retrieval on real images. Recent work shows capabilities of transferring local facial features by capitalizing on the disentanglement property of the StyleGAN latent space. RIS improves existing art on the following: 1) Introducing more effective feature disentanglement to allow for challenging transfers… ▽ More

    Submitted 24 August, 2021; v1 submitted 13 July, 2021; originally announced July 2021.

    Comments: Code is here https://github.com/mchong6/RetrieveInStyle

  32. arXiv:2106.06561  [pdf, other]

    cs.CV cs.GR cs.LG

    GANs N' Roses: Stable, Controllable, Diverse Image to Image Translation (works for videos too!)

    Authors: Min Jin Chong, David Forsyth

    Abstract: We show how to learn a map that takes a content code, derived from a face image, and a randomly chosen style code to an anime image. We derive an adversarial loss from our simple and effective definitions of style and content. This adversarial loss guarantees the map is diverse -- a very wide range of anime can be produced from a single content code. Under plausible assumptions, the map is not jus… ▽ More

    Submitted 11 June, 2021; originally announced June 2021.

    Comments: code is here https://github.com/mchong6/GANsNRoses

  33. arXiv:2011.10512  [pdf, other]

    cs.CV

    Intrinsic Image Decomposition using Paradigms

    Authors: D. A. Forsyth, Jason J. Rock

    Abstract: Intrinsic image decomposition is the classical task of mapping image to albedo. The WHDR dataset allows methods to be evaluated by comparing predictions to human judgements ("lighter", "same as", "darker"). The best modern intrinsic image methods learn a map from image to albedo using rendered models and human judgements. This is convenient for practical methods, but cannot explain how a visual ag… ▽ More

    Submitted 20 November, 2020; originally announced November 2020.

  34. arXiv:2011.10142  [pdf, other]

    cs.CV

    Cooperating RPN's Improve Few-Shot Object Detection

    Authors: Weilin Zhang, Yu-Xiong Wang, David A. Forsyth

    Abstract: Learning to detect an object in an image from very few training examples - few-shot object detection - is challenging, because the classifier that sees proposal boxes has very little training data. A particularly challenging training regime occurs when there are one or two training examples. In this case, if the region proposal network (RPN) misses even one high intersection-over-union (IOU) train… ▽ More

    Submitted 19 November, 2020; originally announced November 2020.

  35. arXiv:2010.05907  [pdf, other]

    cs.CV cs.GR cs.LG

    Cut-and-Paste Object Insertion by Enabling Deep Image Prior for Reshading

    Authors: Anand Bhattad, David A. Forsyth

    Abstract: We show how to insert an object from one image to another and get realistic results in the hard case, where the shading of the inserted object clashes with the shading of the scene. Rendering objects using an illumination model of the scene doesn't work, because doing so requires a geometric and material model of the object, which is hard to recover from a single image. In this paper, we introduce… ▽ More

    Submitted 13 September, 2022; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: 3DV 2022

  36. arXiv:2003.10817  [pdf, other]

    cs.CV

    Toward Accurate and Realistic Virtual Try-on Through Shape Matching and Multiple Warps

    Authors: Kedan Li, Min Jin Chong, Jingen Liu, David Forsyth

    Abstract: A virtual try-on method takes a product image and an image of a model and produces an image of the model wearing the product. Most methods essentially compute warps from the product image to the model image and combine using image generation methods. However, obtaining a realistic image is challenging because the kinematics of garments is complex and because outline, texture, and shading cues in t… ▽ More

    Submitted 26 March, 2020; v1 submitted 21 March, 2020; originally announced March 2020.

  37. arXiv:1912.11568  [pdf, other]

    cs.GR eess.IV

    Blind Recovery of Spatially Varying Reflectance from a Single Image

    Authors: Kevin Karsch, David Forsyth

    Abstract: We propose a new technique for estimating spatially varying parametric materials from a single image of an object with unknown shape in unknown illumination. Our method uses a low-order parametric reflectance model, and incorporates strong assumptions about lighting and shape. We develop new priors about how materials mix over space, and jointly infer all of these properties from a single image. T… ▽ More

    Submitted 24 December, 2019; originally announced December 2019.

  38. arXiv:1912.11567  [pdf, other]

    cs.GR

    ConstructAide: Analyzing and Visualizing Construction Sites through Photographs and Building Models

    Authors: Kevin Karsch, Mani Golparvar-Fard, David Forsyth

    Abstract: We describe a set of tools for analyzing, visualizing, and assessing architectural/construction progress with unordered photo collections and 3D building models. With our interface, a user guides the registration of the model in one of the images, and our system automatically computes the alignment for the rest of the photos using a novel Structure-from-Motion (SfM) technique; images with nearby v… ▽ More

    Submitted 24 December, 2019; originally announced December 2019.

  39. arXiv:1912.11565  [pdf, other]

    cs.GR

    Rendering Synthetic Objects into Legacy Photographs

    Authors: Kevin Karsch, Varsha Hedau, David Forsyth, Derek Hoiem

    Abstract: We propose a method to realistically insert synthetic objects into existing photographs without requiring access to the scene or any additional scene measurements. With a single image and a small amount of annotation, our method creates a physical model of the scene that is suitable for realistically rendering synthetic objects with diffuse, specular, and even glowing materials while accounting fo… ▽ More

    Submitted 24 December, 2019; originally announced December 2019.

  40. arXiv:1912.00578  [pdf, other]

    cs.CV

    Exposing and Correcting the Gender Bias in Image Captioning Datasets and Models

    Authors: Shruti Bhargava, David Forsyth

    Abstract: The task of image captioning implicitly involves gender identification. However, due to the gender bias in data, gender identification by an image captioning model suffers. Also, the gender-activity bias, owing to the word-by-word prediction, influences other words in the caption prediction, resulting in the well-known problem of label bias. In this work, we investigate gender bias in the COCO cap… ▽ More

    Submitted 1 December, 2019; originally announced December 2019.

    Comments: 44 pages

  41. arXiv:1911.07023  [pdf, other]

    cs.CV cs.LG

    Effectively Unbiased FID and Inception Score and where to find them

    Authors: Min Jin Chong, David Forsyth

    Abstract: This paper shows that two commonly used evaluation metrics for generative models, the Fréchet Inception Distance (FID) and the Inception Score (IS), are biased -- the expected value of the score computed for a finite sample set is not the true value of the score. Worse, the paper shows that the bias term depends on the particular model being evaluated, so model A may get a better score than model… ▽ More

    Submitted 15 June, 2020; v1 submitted 16 November, 2019; originally announced November 2019.

    Comments: CVPR 2020

  42. arXiv:1910.09447  [pdf, other]

    cs.CV

    Improving Style Transfer with Calibrated Metrics

    Authors: Mao-Chuang Yeh, Shuai Tang, Anand Bhattad, Chuhang Zou, David Forsyth

    Abstract: Style transfer methods produce a transferred image which is a rendering of a content image in the manner of a style image. We seek to understand how to improve style transfer. To do so requires quantitative evaluation procedures, but the current evaluation is qualitative, mostly involving user studies. We describe a novel quantitative evaluation procedure. Our procedure relies on two statistics:… ▽ More

    Submitted 13 February, 2020; v1 submitted 21 October, 2019; originally announced October 2019.

    Comments: updated conference camera ready version. arXiv admin note: text overlap with arXiv:1804.00118

  43. arXiv:1909.00915  [pdf, other]

    cs.CV

    Counterfactual Depth from a Single RGB Image

    Authors: Theerasit Issaranon, Chuhang Zou, David Forsyth

    Abstract: We describe a method that predicts, from a single RGB image, a depth map that describes the scene when a masked object is removed - we call this "counterfactual depth" that models hidden scene geometry together with the observations. Our method works for the same reason that scene completion works: the spatial structure of objects is simple. But we offer a much higher resolution representation of… ▽ More

    Submitted 2 September, 2019; originally announced September 2019.

  44. arXiv:1906.07273  [pdf, other]

    cs.CV

    Coherent and Controllable Outfit Generation

    Authors: Kedan Li, Chen Liu, David Forsyth

    Abstract: When thinking about dressing oneself, people often have a theme in mind whether they're going to a tropical getaway or wish to appear attractive at a cocktail party. A useful outfit generation system should come up with clothing items that are compatible while matching a theme specified by the user. Existing methods use item-wise compatibility between products but lack an effective way to enforce… ▽ More

    Submitted 16 November, 2019; v1 submitted 17 June, 2019; originally announced June 2019.

  45. arXiv:1905.10797  [pdf, other]

    cs.CV

    Why do These Match? Explaining the Behavior of Image Similarity Models

    Authors: Bryan A. Plummer, Mariya I. Vasileva, Vitali Petsiuk, Kate Saenko, David Forsyth

    Abstract: Explaining a deep learning model can help users understand its behavior and allow researchers to discern its shortcomings. Recent work has primarily focused on explaining models for tasks like image classification or visual question answering. In this paper, we introduce Salient Attributes for Network Explanation (SANE) to explain image similarity models, where a model's output is a score measurin…

    Submitted 24 August, 2020; v1 submitted 26 May, 2019; originally announced May 2019.

    Comments: Accepted at ECCV 2020

  46. arXiv:1904.06347  [pdf, other]

    cs.CV

    Unrestricted Adversarial Examples via Semantic Manipulation

    Authors: Anand Bhattad, Min Jin Chong, Kaizhao Liang, Bo Li, D. A. Forsyth

    Abstract: Machine learning models, especially deep neural networks (DNNs), have been shown to be vulnerable against adversarial examples which are carefully crafted samples with a small magnitude of the perturbation. Such adversarial perturbations are usually restricted by bounding their $\mathcal{L}_p$ norm such that they are imperceptible, and thus many current defenses can exploit this property to reduce…

    Submitted 20 March, 2020; v1 submitted 12 April, 2019; originally announced April 2019.

    Comments: Accepted to ICLR 2020. First two authors contributed equally. Code: https://github.com/aisecure/Big-but-Invisible-Adversarial-Attack and Openreview: https://openreview.net/forum?id=Sye_OgHFwH

  47. arXiv:1904.05877  [pdf, other]

    cs.LG cs.CV stat.ML

    Max-Sliced Wasserstein Distance and its use for GANs

    Authors: Ishan Deshpande, Yuan-Ting Hu, Ruoyu Sun, Ayis Pyrros, Nasir Siddiqui, Sanmi Koyejo, Zhizhen Zhao, David Forsyth, Alexander Schwing

    Abstract: Generative adversarial nets (GANs) and variational auto-encoders have significantly improved our distribution modeling capabilities, showing promise for dataset augmentation, image-to-image translation and feature learning. However, to model high-dimensional distributions, sequential training and stacked architectures are common, increasing the number of tunable hyper-parameters as well as the tra…

    Submitted 11 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPR 2019

  48. arXiv:1809.02129  [pdf, other]

    cs.CV cs.LG

    Structural Consistency and Controllability for Diverse Colorization

    Authors: Safa Messaoud, David Forsyth, Alexander G. Schwing

    Abstract: Colorizing a given gray-level image is an important task in the media and advertising industry. Due to the ambiguity inherent to colorization (many shades are often plausible), recent approaches started to explicitly model diversity. However, one of the most obvious artifacts, structural inconsistency, is rarely considered by existing methods which predict chrominance independently for every pixel…

    Submitted 6 September, 2018; originally announced September 2018.

    Comments: Accepted to ECCV 2018

  49. arXiv:1805.12589  [pdf, other]

    cs.CV

    Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech

    Authors: Aditya Deshpande, Jyoti Aneja, Liwei Wang, Alexander Schwing, D. A. Forsyth

    Abstract: Image captioning is an ambiguous problem, with many suitable captions for an image. To address ambiguity, beam search is the de facto method for sampling multiple captions. However, beam search is computationally expensive and known to produce generic captions. To address this concern, some variational auto-encoder (VAE) and generative adversarial net (GAN) based methods have been proposed. Though…

    Submitted 10 April, 2019; v1 submitted 31 May, 2018; originally announced May 2018.

    Comments: 12 pages with references and appendix. To appear CVPR'19

  50. An Approximate Shading Model with Detail Decomposition for Object Relighting

    Authors: Zicheng Liao, Kevin Karsch, Hongyi Zhang, David Forsyth

    Abstract: We present an object relighting system that allows an artist to select an object from an image and insert it into a target scene. Through simple interactions, the system can adjust illumination on the inserted object so that it appears naturally in the scene. To support image-based relighting, we build object model from the image, and propose a \emph{perceptually-inspired} approximate shading mode…

    Submitted 20 April, 2018; originally announced April 2018.