Search | arXiv e-print repository

arXiv:2405.19569

Improved Convex Decomposition with Ensembling and Boolean Primitives

Authors: Vaibhav Vavilala, Florian Kluger, Seemandhar Jain, Bodo Rosenhahn, David Forsyth

Abstract: Describing a scene in terms of primitives -- geometrically simple shapes that offer a parsimonious but accurate abstraction of structure -- is an established vision problem. This is a good model of a difficult fitting problem: different scenes require different numbers of primitives and primitives interact strongly, but any proposed solution can be evaluated at inference time. The state of the art method involves a learned regression procedure to predict a start point consisting of a fixed number of primitives, followed by a descent method to refine the geometry and remove redundant primitives. Methods are evaluated by accuracy in depth and normal prediction and in scene segmentation. This paper shows that very significant improvements in accuracy can be obtained by (a) incorporating a small number of negative primitives and (b) ensembling over a number of different regression procedures. Ensembling is by refining each predicted start point, then choosing the best by fitting loss. Extensive experiments on a standard dataset confirm that negative primitives are useful in a large fraction of images, and that our refine-then-choose strategy outperforms choose-then-refine, confirming that the fitting problem is very difficult.

Submitted 9 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

Comments: 18 pages, 9 figures, 7 tables

arXiv:2404.00491

Denoising Monte Carlo Renders With Diffusion Models

Authors: Vaibhav Vavilala, Rahul Vasanth, David Forsyth

Abstract: Physically-based renderings contain Monte-Carlo noise, with variance that increases as the number of rays per pixel decreases. This noise, while zero-mean for good modern renderers, can have heavy tails (most notably, for scenes containing specular or refractive objects). Learned methods for restoring low fidelity renders are highly developed, because suppressing render noise means one can save compute and use fast renders with few rays per pixel. We demonstrate that a diffusion model can denoise low fidelity renders successfully. Furthermore, our method can be conditioned on a variety of natural render information, and this conditioning helps performance. Quantitative experiments show that our method is competitive with SOTA across a range of sampling rates, but current metrics slightly favor competitor methods. Qualitative examination of the reconstructions suggests that the metrics themselves may not be reliable. The image prior applied by a diffusion method strongly favors reconstructions that are "like" real images -- so have straight shadow boundaries, curved specularities, no "fireflies" and the like -- and metrics do not account for this. We show numerous examples where methods preferred by current metrics produce qualitatively weaker reconstructions than ours.

Submitted 30 March, 2024; originally announced April 2024.

Comments: 14 pages, 12 figures

arXiv:2307.04246

Convex Decomposition of Indoor Scenes

Authors: Vaibhav Vavilala, David Forsyth

Abstract: We describe a method to parse a complex, cluttered indoor scene into primitives which offer a parsimonious abstraction of scene structure. Our primitives are simple convexes. Our method uses a learned regression procedure to parse a scene into a fixed number of convexes from RGBD input, and can optionally accept segmentations to improve the decomposition. The result is then polished with a descent method which adjusts the convexes to produce a very good fit, and greedily removes superfluous primitives. Because the entire scene is parsed, we can evaluate using traditional depth, normal, and segmentation error metrics. Our evaluation procedure demonstrates that the error from our primitive representation is comparable to that of predicting depth from a single image.

Submitted 15 August, 2023; v1 submitted 9 July, 2023; originally announced July 2023.

Comments: 18 pages, 12 figures

arXiv:2307.03847

Blocks2World: Controlling Realistic Scenes with Editable Primitives

Authors: Vaibhav Vavilala, Seemandhar Jain, Rahul Vasanth, Anand Bhattad, David Forsyth

Abstract: We present Blocks2World, a novel method for 3D scene rendering and editing that leverages a two-step process: convex decomposition of images and conditioned synthesis. Our technique begins by extracting 3D parallelepipeds from various objects in a given scene using convex decomposition, thus obtaining a primitive representation of the scene. These primitives are then utilized to generate paired data through simple ray-traced depth maps. The next stage involves training a conditioned model that learns to generate images from the 2D-rendered convex primitives. This step establishes a direct mapping between the 3D model and its 2D representation, effectively learning the transition from a 3D model to an image. Once the model is fully trained, it offers remarkable control over the synthesis of novel and edited scenes. This is achieved by manipulating the primitives at test time, including translating or adding them, thereby enabling a highly customizable scene rendering process. Our method provides a fresh perspective on 3D scene rendering and editing, offering control and flexibility. It opens up new avenues for research and applications in the field, including authoring and data augmentation.

Submitted 13 July, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

Comments: 16 pages, 15 figures

arXiv:2307.02698

Applying a Color Palette with Local Control using Diffusion Models

Authors: Vaibhav Vavilala, David Forsyth

Abstract: We demonstrate two novel editing procedures in the context of fantasy art. Palette transfer applies a specified reference palette to a given image. For fantasy art, the desired change in palette can be very large, leading to huge changes in the "look" of the art. We show that a pipeline of vector quantization; matching; and "dequantization" (using a diffusion model) produces successful extreme palette transfers. A novel training loss measures the match between color distribution in control and generated images even when a ground truth target is not available. This measurably improves performance. Segment control allows an artist to move one or more image segments, and to optionally specify the desired color of the result. The combination of these two types of edit yields valuable workflows. We demonstrate our methods on the challenging Yu-Gi-Oh card art dataset.

Submitted 2 September, 2023; v1 submitted 5 July, 2023; originally announced July 2023.

Comments: 14 pages, 14 figures

arXiv:2108.08922

Controlled GAN-Based Creature Synthesis via a Challenging Game Art Dataset -- Addressing the Noise-Latent Trade-Off

Authors: Vaibhav Vavilala, David Forsyth

Abstract: The state-of-the-art StyleGAN2 network supports powerful methods to create and edit art, including generating random images, finding images "like" some query, and modifying content or style. Further, recent advancements enable training with small datasets. We apply these methods to synthesize card art, by training on a novel Yu-Gi-Oh dataset. While noise inputs to StyleGAN2 are essential for good synthesis, we find that coarse-scale noise interferes with latent variables on this dataset because both control long-scale image effects. We observe over-aggressive variation in art with changes in noise and weak content control via latent variable edits. Here, we demonstrate that training a modified StyleGAN2, where coarse-scale noise is suppressed, removes these unwanted effects. We obtain a superior FID; changes in noise result in local exploration of style; and identity control is markedly improved. These results and analysis lead towards a GAN-assisted art synthesis tool for digital artists of all skill levels, which can be used in film, games, or any creative industry for artistic ideation.

Submitted 20 October, 2021; v1 submitted 19 August, 2021; originally announced August 2021.

Comments: 10 pages, 10 figures

arXiv:1702.03235

Global Patterns of Synchronization in Human Communications

Authors: Alfredo J. Morales, Vaibhav Vavilala, Rosa M. Benito, Yaneer Bar-Yam

Abstract: Social media are transforming global communication and coordination. The data derived from social media can reveal patterns of human behavior at all levels and scales of society. Using geolocated Twitter data, we have quantified collective behaviors across multiple scales, ranging from the commutes of individuals, to the daily pulse of 50 major urban areas and global patterns of human coordination. Human activity and mobility patterns manifest the synchrony required for contingency of actions between individuals. Urban areas show regular cycles of contraction and expansion that resemble heartbeats linked primarily to social rather than natural cycles. Business hours and circadian rhythms influence daily cycles of work, recreation, and sleep. Different urban areas have characteristic signatures of daily collective activities. The differences are consistent with a new emergent global synchrony that couples behavior in distant regions across the world: a globally synchronized peak that includes exchange of ideas and information across Europe, Africa, Asia and Australasia. We propose a dynamical model to explain the emergence of global synchrony in the context of increasing global communication and reproduce the observed behavior. The collective patterns we observe show how social interactions lead to interdependence of behavior manifest in the synchronization of communication. The creation and maintenance of temporally sensitive social relationships result in the emergence of complexity of the larger scale behavior of the social system.

Submitted 10 February, 2017; originally announced February 2017.

Comments: 20 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:1602.06219

arXiv:1605.09737

3D Printed Stencils for Texturing Flat Surfaces

Authors: Vaibhav Vavilala

Abstract: We address the problem of texturing flat surfaces by spray-painting through 3D printed stencils. We propose a system that (1) decomposes an image into alpha-blended layers; (2) computes a stippling given a transparency channel; (3) generates a 3D printed stencil given a stippling and (4) simulates the effects of spray-painting through the stencil.

Submitted 31 May, 2016; originally announced May 2016.

Comments: 4 pages, 7 figures

arXiv:1602.06219

Global Patterns of Human Synchronization

Authors: Alfredo J. Morales, Vaibhav Vavilala, Rosa M. Benito, Yaneer Bar-Yam

Abstract: Social media are transforming global communication and coordination and provide unprecedented opportunities for studying socio-technical domains. Here we study global dynamical patterns of communication on Twitter across many scales. Underlying the observed patterns is both the diurnal rotation of the earth, day and night, and the synchrony required for contingency of actions between individuals. We find that urban areas show a cyclic contraction and expansion that resembles heartbeats linked to social rather than natural cycles. Different urban areas have characteristic signatures of daily collective activities. We show that the differences detected are consistent with a new emergent global synchrony that couples behavior in distant regions across the world. Although local synchrony is the major force that shapes the collective behavior in cities, a larger-scale synchronization is beginning to occur.

Submitted 19 February, 2016; originally announced February 2016.

Comments: 10 pages, 4 figures

Showing 1–9 of 9 results for author: Vavilala, V