Showing 1–15 of 15 results for author: Sud, A

  1. arXiv:2406.08603

    cs.CV cs.AI cs.LG

    FakeInversion: Learning to Detect Images from Unseen Text-to-Image Models by Inverting Stable Diffusion

    Authors: George Cazenavette, Avneesh Sud, Thomas Leung, Ben Usman

    Abstract: Due to the high potential for abuse of GenAI systems, the task of detecting synthetic images has recently become of great interest to the research community. Unfortunately, existing image-space detectors quickly become obsolete as new high-fidelity text-to-image models are developed at blinding speed. In this work, we propose a new synthetic image detector that uses features obtained by inverting…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Project page: https://fake-inversion.github.io

    Journal ref: CVPR 2024

  2. arXiv:2212.10957

    cs.CV

    TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization

    Authors: Fabrizio Guillaro, Davide Cozzolino, Avneesh Sud, Nicholas Dufour, Luisa Verdoliva

    Abstract: In this paper we present TruFor, a forensic framework that can be applied to a large variety of image manipulation methods, from classic cheapfakes to more recent manipulations based on deep learning. We rely on the extraction of both high-level and low-level traces through a transformer-based fusion architecture that combines the RGB image and a learned noise-sensitive fingerprint. The latter lea…

    Submitted 25 May, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

  3. arXiv:2207.13061

    cs.CV cs.AI cs.CL

    NewsStories: Illustrating articles with visual summaries

    Authors: Reuben Tan, Bryan A. Plummer, Kate Saenko, JP Lewis, Avneesh Sud, Thomas Leung

    Abstract: Recent self-supervised approaches have used large-scale image-text datasets to learn powerful representations that transfer to many tasks without finetuning. These methods often assume that there is one-to-one correspondence between its images and their (short) captions. However, many tasks require reasoning about multiple images and long text narratives, such as describing news articles with visu…

    Submitted 14 August, 2022; v1 submitted 26 July, 2022; originally announced July 2022.

    Comments: Accepted at ECCV 2022

  4. arXiv:2110.11325

    cs.CV

    Learning 3D Semantic Segmentation with only 2D Image Supervision

    Authors: Kyle Genova, Xiaoqi Yin, Abhijit Kundu, Caroline Pantofaru, Forrester Cole, Avneesh Sud, Brian Brewington, Brian Shucker, Thomas Funkhouser

    Abstract: With the recent growth of urban mapping and autonomous driving efforts, there has been an explosion of raw 3D data collected from terrestrial platforms with lidar scanners and color cameras. However, due to high labeling costs, ground-truth 3D semantic segmentation annotations are limited in both quantity and geographic diversity, while also being difficult to transfer across sensors. In contrast,…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: Accepted to 3DV 2021 (Oral)

  5. arXiv:2108.04886

    cs.GR cs.CV

    Differentiable Surface Rendering via Non-Differentiable Sampling

    Authors: Forrester Cole, Kyle Genova, Avneesh Sud, Daniel Vlasic, Zhoutong Zhang

    Abstract: We present a method for differentiable rendering of 3D surfaces that supports both explicit and implicit representations, provides derivatives at occlusion boundaries, and is fast and simple to implement. The method first samples the surface using non-differentiable rasterization, then applies differentiable, depth-aware point splatting to produce the final image. Our approach requires no differen…

    Submitted 10 August, 2021; originally announced August 2021.

    Comments: Accepted to ICCV 2021

  6. arXiv:2108.04869

    cs.CV cs.LG

    MetaPose: Fast 3D Pose from Multiple Views without 3D Supervision

    Authors: Ben Usman, Andrea Tagliasacchi, Kate Saenko, Avneesh Sud

    Abstract: In the era of deep learning, human pose estimation from multiple cameras with unknown calibration has received little attention to date. We show how to train a neural model to perform this task with high precision and minimal latency overhead. The proposed model takes into account joint location uncertainty due to occlusion from multiple views, and requires only 2D keypoint data for training. Our…

    Submitted 25 November, 2021; v1 submitted 10 August, 2021; originally announced August 2021.

  7. arXiv:2106.09251

    cs.CV

    Optical Mouse: 3D Mouse Pose From Single-View Video

    Authors: Bo Hu, Bryan Seybold, Shan Yang, David Ross, Avneesh Sud, Graham Ruby, Yi Liu

    Abstract: We present a method to infer the 3D pose of mice, including the limbs and feet, from monocular videos. Many human clinical conditions and their corresponding animal models result in abnormal motion, and accurately measuring 3D motion at scale offers insights into health. The 3D poses improve classification of health-related attributes over 2D representations. The inferred poses are accurate enough…

    Submitted 17 June, 2021; originally announced June 2021.

  8. arXiv:2012.10518

    cs.CV

    Human 3D keypoints via spatial uncertainty modeling

    Authors: Francis Williams, Or Litany, Avneesh Sud, Kevin Swersky, Andrea Tagliasacchi

    Abstract: We introduce a technique for 3D human keypoint estimation that directly models the notion of spatial uncertainty of a keypoint. Our technique employs a principled approach to modelling spatial uncertainty inspired from techniques in robust statistics. Furthermore, our pipeline requires no 3D ground truth labels, relying instead on (possibly noisy) 2D image-level keypoints. Our method achieves near…

    Submitted 18 December, 2020; originally announced December 2020.

  9. arXiv:2011.04755

    cs.CV

    Learning to Infer Semantic Parameters for 3D Shape Editing

    Authors: Fangyin Wei, Elena Sizikova, Avneesh Sud, Szymon Rusinkiewicz, Thomas Funkhouser

    Abstract: Many applications in 3D shape design and augmentation require the ability to make specific edits to an object's semantic parameters (e.g., the pose of a person's arm or the length of an airplane's wing) while preserving as much existing details as possible. We propose to learn a deep network that infers the semantic parameters of an input shape and then allows the user to manipulate those paramete…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: 22 pages and 19 figures including supplementary material; to be published in the proceedings of 3DV 2020

  10. arXiv:2003.12170

    cs.LG stat.ML

    Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment

    Authors: Ben Usman, Avneesh Sud, Nick Dufour, Kate Saenko

    Abstract: Distribution alignment has many applications in deep learning, including domain adaptation and unsupervised image-to-image translation. Most prior work on unsupervised distribution alignment relies either on minimizing simple non-parametric statistical distances such as maximum mean discrepancy or on adversarial alignment. However, the former fails to capture the structure of complex real-world di…

    Submitted 26 October, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

  11. arXiv:2003.08981

    cs.CV cs.CG cs.LG

    Local Implicit Grid Representations for 3D Scenes

    Authors: Chiyu Max Jiang, Avneesh Sud, Ameesh Makadia, Jingwei Huang, Matthias Nießner, Thomas Funkhouser

    Abstract: Shape priors learned from data are commonly used to reconstruct 3D objects from partial or noisy data. Yet no such shape priors are available for indoor scenes, since typical 3D autoencoders cannot handle their scale, complexity, or diversity. In this paper, we introduce Local Implicit Grid Representations, a new 3D shape representation designed for scalability and generality. The motivating idea…

    Submitted 19 March, 2020; originally announced March 2020.

    Comments: CVPR 2020. Supplementary Video: https://youtu.be/XCyl1-vxfII

  12. arXiv:1912.06126

    cs.CV cs.GR

    Local Deep Implicit Functions for 3D Shape

    Authors: Kyle Genova, Forrester Cole, Avneesh Sud, Aaron Sarna, Thomas Funkhouser

    Abstract: The goal of this project is to learn a 3D shape representation that enables accurate surface reconstruction, compact storage, efficient computation, consistency for similar shapes, generalization across diverse shape categories, and inference from depth camera observations. Towards this end, we introduce Local Deep Implicit Functions (LDIF), a 3D shape representation that decomposes space into a s…

    Submitted 11 June, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

    Comments: Camera ready version for CVPR 2020 Oral. Prior to review, this paper was referred to as DSIF, "Deep Structured Implicit Functions." 11 pages, 9 figures. Project video at https://youtu.be/3RAITzNWVJs

  13. arXiv:1906.03281

    cs.LG stat.ML

    Latent feature disentanglement for 3D meshes

    Authors: Jake Levinson, Avneesh Sud, Ameesh Makadia

    Abstract: Generative modeling of 3D shapes has become an important problem due to its relevance to many applications across Computer Vision, Graphics, and VR. In this paper we build upon recently introduced 3D mesh-convolutional Variational AutoEncoders which have shown great promise for learning rich representations of deformable 3D shapes. We introduce a supervised generative 3D mesh model that disentangl…

    Submitted 7 June, 2019; originally announced June 2019.

  14. arXiv:1812.02716

    cs.CV

    Cross-Domain 3D Equivariant Image Embeddings

    Authors: Carlos Esteves, Avneesh Sud, Zhengyi Luo, Kostas Daniilidis, Ameesh Makadia

    Abstract: Spherical convolutional networks have been introduced recently as tools to learn powerful feature representations of 3D shapes. Spherical CNNs are equivariant to 3D rotations making them ideally suited to applications where 3D data may be observed in arbitrary orientations. In this paper we learn 2D image embeddings with a similar equivariant structure: embedding the image of a 3D object should co…

    Submitted 14 May, 2019; v1 submitted 6 December, 2018; originally announced December 2018.

    Comments: Accepted to the International Conference on Machine Learning, ICML 2019

  15. arXiv:1707.07204

    cs.CV

    Eyemotion: Classifying facial expressions in VR using eye-tracking cameras

    Authors: Steven Hickson, Nick Dufour, Avneesh Sud, Vivek Kwatra, Irfan Essa

    Abstract: One of the main challenges of social interaction in virtual reality settings is that head-mounted displays occlude a large portion of the face, blocking facial expressions and thereby restricting social engagement cues among users. Hence, auxiliary means of sensing and conveying these expressions are needed. We present an algorithm to automatically infer expressions by analyzing only a partially o…

    Submitted 28 July, 2017; v1 submitted 22 July, 2017; originally announced July 2017.

    Comments: Uploaded Supplementary PDF. Fixed author affiliation. Corrected typo in personalization accuracy