Showing 1–50 of 71 results for author: Guibas, L J

  1. arXiv:2405.19678  [pdf, other]

    cs.CV cs.AI

    View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields

    Authors: Haodi He, Colton Stearns, Adam W. Harley, Leonidas J. Guibas

    Abstract: Large-scale vision foundation models such as Segment Anything (SAM) demonstrate impressive performance in zero-shot image segmentation at multiple levels of granularity. However, these zero-shot predictions are rarely 3D-consistent. As the camera viewpoint changes in a scene, so do the segmentation predictions, as well as the characterizations of "coarse" or "fine" granularity. In this work, we…

    Submitted 30 May, 2024; originally announced May 2024.

  2. arXiv:2401.10822  [pdf, other]

    cs.CV

    ActAnywhere: Subject-Aware Video Background Generation

    Authors: Boxiao Pan, Zhan Xu, Chun-Hao Paul Huang, Krishna Kumar Singh, Yang Zhou, Leonidas J. Guibas, Jimei Yang

    Abstract: Generating video background that tailors to foreground subject motion is an important problem for the movie industry and visual effects community. This task involves synthesizing background that aligns with the motion and appearance of the foreground subject, while also complies with the artist's creative intention. We introduce ActAnywhere, a generative model that automates this process which tra…

    Submitted 19 January, 2024; originally announced January 2024.

  3. arXiv:2401.00850  [pdf, other]

    cs.CV cs.AI

    Refining Pre-Trained Motion Models

    Authors: Xinglong Sun, Adam W. Harley, Leonidas J. Guibas

    Abstract: Given the difficulty of manually annotating motion in video, the current best motion estimation methods are trained with synthetic data, and therefore struggle somewhat due to a train/test gap. Self-supervised methods hold the promise of training directly on real video, but typically perform worse. These include methods trained with warp error (i.e., color constancy) combined with smoothness terms…

    Submitted 16 February, 2024; v1 submitted 1 January, 2024; originally announced January 2024.

    Comments: Accepted at ICRA 2024

  4. arXiv:2312.15610  [pdf, other]

    cs.CV

    Towards Learning Geometric Eigen-Lengths Crucial for Fitting Tasks

    Authors: Yijia Weng, Kaichun Mo, Ruoxi Shi, Yanchao Yang, Leonidas J. Guibas

    Abstract: Some extremely low-dimensional yet crucial geometric eigen-lengths often determine the success of some geometric tasks. For example, the height of an object is important to measure to check if it can fit between the shelves of a cabinet, while the width of a couch is crucial when trying to move it through a doorway. Humans have materialized such crucial geometric eigen-lengths in common sense sinc…

    Submitted 24 December, 2023; originally announced December 2023.

    Comments: ICML 2023. Project page: https://yijiaweng.github.io/geo-eigen-length

    Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:36958-36977, 2023

  5. OptCtrlPoints: Finding the Optimal Control Points for Biharmonic 3D Shape Deformation

    Authors: Kunho Kim, Mikaela Angelina Uy, Despoina Paschalidou, Alec Jacobson, Leonidas J. Guibas, Minhyuk Sung

    Abstract: We propose OptCtrlPoints, a data-driven framework designed to identify the optimal sparse set of control points for reproducing target shapes using biharmonic 3D shape deformation. Control-point-based 3D deformation methods are widely utilized for interactive shape editing, and their usability is enhanced when the control points are sparse yet strategically distributed across the shape. With this…

    Submitted 13 October, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: Pacific Graphics 2023 (Full Paper). Project page: https://soulmates2.github.io/publications/OptCtrlPoints/

  6. arXiv:2307.15055  [pdf, other]

    cs.CV

    PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking

    Authors: Yang Zheng, Adam W. Harley, Bokui Shen, Gordon Wetzstein, Leonidas J. Guibas

    Abstract: We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework, for the training and evaluation of long-term fine-grained tracking algorithms. Our goal is to advance the state-of-the-art by placing emphasis on long videos with naturalistic motion. Toward the goal of naturalism, we animate deformable characters using real-world motion capture data, we build 3D scenes to m…

    Submitted 27 July, 2023; originally announced July 2023.

  7. arXiv:2305.01921  [pdf, other]

    cs.CV

    DiffFacto: Controllable Part-Based 3D Point Cloud Generation with Cross Diffusion

    Authors: Kiyohiro Nakayama, Mikaela Angelina Uy, Jiahui Huang, Shi-Min Hu, Ke Li, Leonidas J Guibas

    Abstract: While the community of 3D point cloud generation has witnessed a big growth in recent years, there still lacks an effective way to enable intuitive user control in the generation process, hence limiting the general utility of such methods. Since an intuitive way of decomposing a shape is through its parts, we propose to tackle the task of controllable part-based point cloud generation. We introduc…

    Submitted 20 August, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

  8. arXiv:2304.14473  [pdf, other]

    cs.CV cs.AI cs.LG

    Learning a Diffusion Prior for NeRFs

    Authors: Guandao Yang, Abhijit Kundu, Leonidas J. Guibas, Jonathan T. Barron, Ben Poole

    Abstract: Neural Radiance Fields (NeRFs) have emerged as a powerful neural 3D representation for objects and scenes derived from 2D data. Generating NeRFs, however, remains difficult in many scenarios. For instance, training a NeRF with only a small number of views as supervision remains challenging since it is an under-constrained problem. In such settings, it calls for some inductive prior to filter out b…

    Submitted 27 April, 2023; originally announced April 2023.

  9. arXiv:2303.13634  [pdf, other]

    cs.LG cs.CE

    Physics-informed PointNet: On how many irregular geometries can it solve an inverse problem simultaneously? Application to linear elasticity

    Authors: Ali Kashefi, Leonidas J. Guibas, Tapan Mukerji

    Abstract: Regular physics-informed neural networks (PINNs) predict the solution of partial differential equations using sparse labeled data but only over a single domain. On the other hand, fully supervised learning models are first trained usually over a few thousand domains with known solutions (i.e., labeled data) and then predict the solution over a few hundred unseen domains. Physics-informed PointNet…

    Submitted 18 September, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

  10. arXiv:2303.12050  [pdf, other]

    cs.CV

    CurveCloudNet: Processing Point Clouds with 1D Structure

    Authors: Colton Stearns, Davis Rempe, Jiateng Liu, Alex Fu, Sebastien Mascha, Jeong Joon Park, Despoina Paschalidou, Leonidas J. Guibas

    Abstract: Modern depth sensors such as LiDAR operate by sweeping laser-beams across the scene, resulting in a point cloud with notable 1D curve-like structures. In this work, we introduce a new point cloud processing scheme and backbone, called CurveCloudNet, which takes advantage of the curve-like structure inherent to these sensors. While existing backbones discard the rich 1D traversal patterns and rely…

    Submitted 1 February, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

  11. arXiv:2302.10237  [pdf, other]

    cs.GR cs.CV

    SceneHGN: Hierarchical Graph Networks for 3D Indoor Scene Generation with Fine-Grained Geometry

    Authors: Lin Gao, Jia-Mu Sun, Kaichun Mo, Yu-Kun Lai, Leonidas J. Guibas, Jie Yang

    Abstract: 3D indoor scenes are widely used in computer graphics, with applications ranging from interior design to gaming to virtual and augmented reality. They also contain rich information, including room layout, as well as furniture type, geometry, and placement. High-quality 3D indoor scenes are highly demanded while it requires expertise and is time-consuming to design high-quality 3D indoor scenes man…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: 21 pages, 21 figures, Project: http://geometrylearning.com/scenehgn/

  12. arXiv:2211.15903  [pdf, ps, other]

    cs.CG cs.LG

    Equivalence Between SE(3) Equivariant Networks via Steerable Kernels and Group Convolution

    Authors: Adrien Poulenard, Maks Ovsjanikov, Leonidas J. Guibas

    Abstract: A wide range of techniques have been proposed in recent years for designing neural networks for 3D data that are equivariant under rotation and translation of the input. Most approaches for equivariance under the Euclidean group $\mathrm{SE}(3)$ of rotations and translations fall within one of the two major categories. The first category consists of methods that use $\mathrm{SE}(3)$-convolution wh…

    Submitted 28 November, 2022; originally announced November 2022.

  13. arXiv:2210.01781  [pdf, other]

    cs.CV cs.RO

    COPILOT: Human-Environment Collision Prediction and Localization from Egocentric Videos

    Authors: Boxiao Pan, Bokui Shen, Davis Rempe, Despoina Paschalidou, Kaichun Mo, Yanchao Yang, Leonidas J. Guibas

    Abstract: The ability to forecast human-environment collisions from egocentric observations is vital to enable collision avoidance in applications such as VR, AR, and wearable assistive robotics. In this work, we introduce the challenging problem of predicting collisions in diverse environments from multi-view egocentric videos captured from body-mounted cameras. Solving this problem requires a generalizabl…

    Submitted 26 March, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

  14. arXiv:2207.06333  [pdf, other]

    cs.CV

    6D Camera Relocalization in Visually Ambiguous Extreme Environments

    Authors: Yang Zheng, Tolga Birdal, Fei Xia, Yanchao Yang, Yueqi Duan, Leonidas J. Guibas

    Abstract: We propose a novel method to reliably estimate the pose of a camera given a sequence of images acquired in extreme environments such as deep seas or extraterrestrial terrains. Data acquired under these challenging conditions are corrupted by textureless surfaces, image degradation, and presence of repetitive and highly ambiguous structures. When naively deployed, the state-of-the-art methods can f…

    Submitted 13 July, 2022; originally announced July 2022.

  15. arXiv:2207.05856  [pdf, other]

    cs.CV

    SpOT: Spatiotemporal Modeling for 3D Object Tracking

    Authors: Colton Stearns, Davis Rempe, Jie Li, Rares Ambrus, Sergey Zakharov, Vitor Guizilini, Yanchao Yang, Leonidas J Guibas

    Abstract: 3D multi-object tracking aims to uniquely and consistently identify all mobile entities through time. Despite the rich spatiotemporal information available in this setting, current 3D tracking methods primarily rely on abstracted information and limited history, e.g. single-frame object bounding boxes. In this work, we develop a holistic representation of traffic scenes that leverages both spatial…

    Submitted 12 July, 2022; originally announced July 2022.

  16. arXiv:2206.06922  [pdf, other]

    cs.CV cs.AI cs.LG

    Object Scene Representation Transformer

    Authors: Mehdi S. M. Sajjadi, Daniel Duckworth, Aravindh Mahendran, Sjoerd van Steenkiste, Filip Pavetić, Mario Lučić, Leonidas J. Guibas, Klaus Greff, Thomas Kipf

    Abstract: A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition. Facilitating the learning of such a representation in neural networks holds promise for substantially improving labeled data efficiency. As a key step in this direction, we make progress on the problem of learning 3D-consistent decompositions of complex scen…

    Submitted 12 October, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: Accepted at NeurIPS '22. Project page: https://osrt-paper.github.io/

  17. arXiv:2205.02834  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG cs.RO

    Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction

    Authors: Yining Hong, Kaichun Mo, Li Yi, Leonidas J. Guibas, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan

    Abstract: This paper studies the problem of fixing malfunctional 3D objects. While previous works focus on building passive perception models to learn the functionality from static 3D objects, we argue that functionality is reckoned with respect to the physical interactions between the object and the user. Given a malfunctional object, humans can perform mental simulations to reason about its functionality…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: CVPR 2022. Project page: http://fixing-malfunctional.csail.mit.edu

  18. arXiv:2204.09443  [pdf, other]

    cs.CV

    GIMO: Gaze-Informed Human Motion Prediction in Context

    Authors: Yang Zheng, Yanchao Yang, Kaichun Mo, Jiaman Li, Tao Yu, Yebin Liu, C. Karen Liu, Leonidas J. Guibas

    Abstract: Predicting human motion is critical for assistive robots and AR/VR applications, where the interaction with humans needs to be safe and comfortable. Meanwhile, an accurate prediction depends on understanding both the scene context and human intentions. Even though many works study scene-aware human motion prediction, the latter is largely underexplored due to the lack of ego-centric views that dis…

    Submitted 19 July, 2022; v1 submitted 20 April, 2022; originally announced April 2022.

  19. arXiv:2203.06856  [pdf, other]

    cs.CV cs.AI cs.RO

    ACID: Action-Conditional Implicit Visual Dynamics for Deformable Object Manipulation

    Authors: Bokui Shen, Zhenyu Jiang, Christopher Choy, Leonidas J. Guibas, Silvio Savarese, Anima Anandkumar, Yuke Zhu

    Abstract: Manipulating volumetric deformable objects in the real world, like plush toys and pizza dough, bring substantial challenges due to infinite shape variations, non-rigid motions, and partial observability. We introduce ACID, an action-conditional visual dynamics model for volumetric deformable objects based on structured implicit neural representations. ACID integrates two new techniques: implicit r…

    Submitted 5 August, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: RSS 2022 Best Student Paper Award Finalist. Please check out more details at https://b0ku1.github.io/acid/

    Journal ref: Robotics: Science and Systems (RSS), 2022

  20. arXiv:2201.07788  [pdf, other]

    cs.CV cs.AI cs.LG

    ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes

    Authors: Rahul Sajnani, Adrien Poulenard, Jivitesh Jain, Radhika Dua, Leonidas J. Guibas, Srinath Sridhar

    Abstract: Progress in 3D object understanding has relied on manually canonicalized shape datasets that contain instances with consistent position and orientation (3D pose). This has made it hard to generalize these methods to in-the-wild shapes, e.g., from internet model collections or depth sensors. ConDor is a self-supervised method that learns to Canonicalize the 3D orientation and position for full and p…

    Submitted 14 April, 2022; v1 submitted 19 January, 2022; originally announced January 2022.

    Comments: Accepted to CVPR 2022, New Orleans, Louisiana. For project page and code, see https://ivl.cs.brown.edu/ConDor/

  21. arXiv:2112.05077  [pdf, other]

    cs.CV cs.LG cs.RO

    Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior

    Authors: Davis Rempe, Jonah Philion, Leonidas J. Guibas, Sanja Fidler, Or Litany

    Abstract: Evaluating and improving planning for autonomous vehicles requires scalable generation of long-tail traffic scenarios. To be useful, these scenarios must be realistic and challenging, but not impossible to drive through safely. In this work, we introduce STRIVE, a method to automatically generate challenging scenarios that cause a given planner to produce undesirable behavior, like collisions. To…

    Submitted 28 March, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

    Comments: CVPR 2022 camera-ready

  22. Multiway Non-rigid Point Cloud Registration via Learned Functional Map Synchronization

    Authors: Jiahui Huang, Tolga Birdal, Zan Gojcic, Leonidas J. Guibas, Shi-Min Hu

    Abstract: We present SyNoRiM, a novel way to jointly register multiple non-rigid shapes by synchronizing the maps relating learned functions defined on the point clouds. Even though the ability to process non-rigid shapes is critical in various applications ranging from computer animation to 3D digitization, the literature still lacks a robust and flexible framework to match and align a collection of real,…

    Submitted 1 April, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 2022

  23. arXiv:2107.07905  [pdf, other]

    cs.CV cs.AI

    Unsupervised Discovery of Object Radiance Fields

    Authors: Hong-Xing Yu, Leonidas J. Guibas, Jiajun Wu

    Abstract: We study the problem of inferring an object-centric scene representation from a single image, aiming to derive a representation that explains the image formation process, captures the scene's 3D nature, and is learned without supervision. Most existing methods on scene decomposition lack one or more of these characteristics, due to the fundamental challenge in integrating the complex 3D-to-2D imag…

    Submitted 16 March, 2022; v1 submitted 16 July, 2021; originally announced July 2021.

    Comments: ICLR 2022. Project page: https://kovenyu.com/uorf/

  24. arXiv:2105.04668  [pdf, other]

    cs.CV cs.LG

    HuMoR: 3D Human Motion Model for Robust Pose Estimation

    Authors: Davis Rempe, Tolga Birdal, Aaron Hertzmann, Jimei Yang, Srinath Sridhar, Leonidas J. Guibas

    Abstract: We introduce HuMoR: a 3D Human Motion Model for Robust Estimation of temporal pose and shape. Though substantial progress has been made in estimating 3D human motion and shape from dynamic observations, recovering plausible pose sequences in the presence of noise and occlusions remains a challenge. For this purpose, we propose an expressive generative model in the form of a conditional variational…

    Submitted 18 August, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

    Comments: ICCV 2021 camera ready

  25. arXiv:2104.03437  [pdf, other]

    cs.CV

    CAPTRA: CAtegory-level Pose Tracking for Rigid and Articulated Objects from Point Clouds

    Authors: Yijia Weng, He Wang, Qiang Zhou, Yuzhe Qin, Yueqi Duan, Qingnan Fan, Baoquan Chen, Hao Su, Leonidas J. Guibas

    Abstract: In this work, we tackle the problem of category-level online pose tracking of objects from point cloud sequences. For the first time, we propose a unified framework that can handle 9DoF pose tracking for novel rigid object instances as well as per-part pose tracking for articulated objects from known categories. Here the 9DoF pose, comprising 6D pose and 3D size, is equivalent to a 3D amodal bound…

    Submitted 21 October, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: ICCV 2021 (Oral). Project page: https://yijiaweng.github.io/CAPTRA/

  26. arXiv:2102.08945  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Weakly Supervised Learning of Rigid 3D Scene Flow

    Authors: Zan Gojcic, Or Litany, Andreas Wieser, Leonidas J. Guibas, Tolga Birdal

    Abstract: We propose a data-driven scene flow estimation algorithm exploiting the observation that many 3D scenes can be explained by a collection of agents moving as rigid bodies. At the core of our method lies a deep architecture able to reason at the object-level by considering 3D scene flow in conjunction with other 3D tasks. This object level abstraction, enables us to relax the requirement fo…

    Submitted 17 February, 2021; originally announced February 2021.

  27. arXiv:2012.04355  [pdf, other]

    cs.CV

    3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection

    Authors: He Wang, Yezhen Cong, Or Litany, Yue Gao, Leonidas J. Guibas

    Abstract: 3D object detection is an important yet demanding task that heavily relies on difficult to obtain 3D annotations. To reduce the required amount of supervision, we propose 3DIoUMatch, a novel semi-supervised method for 3D object detection applicable to both indoor and outdoor scenes. We leverage a teacher-student mutual learning framework to propagate information from the labeled to the unlabeled t…

    Submitted 6 July, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

    Comments: CVPR 2021

  28. arXiv:2010.09469  [pdf, other]

    cs.LG physics.flu-dyn

    A Point-Cloud Deep Learning Framework for Prediction of Fluid Flow Fields on Irregular Geometries

    Authors: Ali Kashefi, Davis Rempe, Leonidas J. Guibas

    Abstract: We present a novel deep learning framework for flow field predictions in irregular domains when the solution is a function of the geometry of either the domain or objects inside the domain. Grid vertices in a computational fluid dynamics (CFD) domain are viewed as point clouds and used as inputs to a neural network based on the PointNet architecture, which learns an end-to-end mapping between spat…

    Submitted 16 September, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

  29. arXiv:2010.05272  [pdf, other]

    cs.CV

    IF-Defense: 3D Adversarial Point Cloud Defense via Implicit Function based Restoration

    Authors: Ziyi Wu, Yueqi Duan, He Wang, Qingnan Fan, Leonidas J. Guibas

    Abstract: Point cloud is an important 3D data representation widely used in many essential applications. Leveraging deep neural networks, recent works have shown great success in processing 3D point clouds. However, those deep neural networks are vulnerable to various 3D adversarial attacks, which can be summarized as two primary types: point perturbation that affects local point distribution, and surface d…

    Submitted 18 March, 2021; v1 submitted 11 October, 2020; originally announced October 2020.

    Comments: 17 pages, 8 figures. Update several experimental results compared with v2, e.g. cross dataset evaluation and baseline results of adversarial training

  30. arXiv:2009.01456  [pdf, other]

    cs.GR

    DeformSyncNet: Deformation Transfer via Synchronized Shape Deformation Spaces

    Authors: Minhyuk Sung, Zhenyu Jiang, Panos Achlioptas, Niloy J. Mitra, Leonidas J. Guibas

    Abstract: Shape deformation is an important component in any geometry processing toolbox. The goal is to enable intuitive deformations of single or multiple shapes or to transfer example deformations to new shapes while preserving the plausibility of the deformed shape(s). Existing approaches assume access to point-level or part-level correspondence or establish them in a preprocessing phase, thus limiting…

    Submitted 3 September, 2020; originally announced September 2020.

    Comments: SIGGRAPH Asia 2020

  31. arXiv:2008.07760  [pdf, other]

    cs.CV

    Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images

    Authors: Jiahui Lei, Srinath Sridhar, Paul Guerrero, Minhyuk Sung, Niloy Mitra, Leonidas J. Guibas

    Abstract: We investigate the problem of learning to generate 3D parametric surface representations for novel object instances, as seen from one or more views. Previous work on learning shape reconstruction from multiple views uses discrete representations such as point clouds or voxels, while continuous surface generation approaches lack multi-view consistency. We address these issues by designing neural ne…

    Submitted 18 August, 2020; originally announced August 2020.

    Comments: ECCV 2020

  32. arXiv:2008.05440  [pdf, other]

    cs.GR cs.CV

    DSG-Net: Learning Disentangled Structure and Geometry for 3D Shape Generation

    Authors: Jie Yang, Kaichun Mo, Yu-Kun Lai, Leonidas J. Guibas, Lin Gao

    Abstract: 3D shape generation is a fundamental operation in computer graphics. While significant progress has been made, especially with recent deep generative models, it remains a challenge to synthesize high-quality shapes with rich geometric details and complex structure, in a controllable manner. To tackle this, we introduce DSG-Net, a deep neural network that learns a disentangled structured and geometr…

    Submitted 28 May, 2022; v1 submitted 12 August, 2020; originally announced August 2020.

    Comments: Accepted to ACM Transactions on Graphics 2022, 26 pages

  33. arXiv:2008.02792  [pdf, other]

    cs.CV cs.LG

    CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations

    Authors: Davis Rempe, Tolga Birdal, Yongheng Zhao, Zan Gojcic, Srinath Sridhar, Leonidas J. Guibas

    Abstract: We propose CaSPR, a method to learn object-centric Canonical Spatiotemporal Point Cloud Representations of dynamically moving or evolving objects. Our goal is to enable information aggregation over time and the interrogation of object state at any spatiotemporal neighborhood in the past, observed or not. Different from previous work, CaSPR learns representations that support spacetime continuity,…

    Submitted 11 November, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: NeurIPS 2020

  34. arXiv:2007.11678  [pdf, other]

    cs.CV

    Contact and Human Dynamics from Monocular Video

    Authors: Davis Rempe, Leonidas J. Guibas, Aaron Hertzmann, Bryan Russell, Ruben Villegas, Jimei Yang

    Abstract: Existing deep models predict 2D and 3D kinematic poses from video that are approximately accurate, but contain visible errors that violate physical constraints, such as feet penetrating the ground and bodies leaning at extreme angles. In this paper, we present a physics-based method for inferring 3D human motion from video sequences that takes initial 2D and 3D pose estimates as input. We first es…

    Submitted 24 July, 2020; v1 submitted 22 July, 2020; originally announced July 2020.

    Comments: ECCV 2020

  35. arXiv:2007.10985  [pdf, other]

    cs.CV

    PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding

    Authors: Saining Xie, Jiatao Gu, Demi Guo, Charles R. Qi, Leonidas J. Guibas, Or Litany

    Abstract: Arguably one of the top success stories of deep learning is transfer learning. The finding that pre-training a network on a rich source set (e.g., ImageNet) can help boost performance once fine-tuned on a usually much smaller target set, has been instrumental to many applications in language and vision. Yet, very little is known about its usefulness in 3D point cloud understanding. We see this as a…

    Submitted 20 November, 2020; v1 submitted 21 July, 2020; originally announced July 2020.

    Comments: ECCV 2020 (Spotlight); code available at https://github.com/facebookresearch/PointContrast

  36. arXiv:2007.10300  [pdf, other]

    cs.CV

    Object-Centric Multi-View Aggregation

    Authors: Shubham Tulsiani, Or Litany, Charles R. Qi, He Wang, Leonidas J. Guibas

    Abstract: We present an approach for aggregating a sparse set of views of an object in order to compute a semi-implicit 3D representation in the form of a volumetric feature grid. Key to our approach is an object-centric canonical 3D coordinate system into which views can be lifted, without explicit camera pose estimation, and then combined -- in a manner that can accommodate a variable number of views and…

    Submitted 21 July, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

  37. arXiv:2006.07029  [pdf, other]

    cs.CV cs.LG eess.IV

    Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks

    Authors: He Wang, Zetian Jiang, Li Yi, Kaichun Mo, Hao Su, Leonidas J. Guibas

    Abstract: In this paper, we examine the long-neglected yet important effects of point sampling patterns in point cloud GANs. Through extensive experiments, we show that sampling-insensitive discriminators (e.g., PointNet-Max) produce shape point clouds with point clustering artifacts while sampling-oversensitive discriminators (e.g., PointNet++, DGCNN) fail to guide valid shape generation. We propose the concep…

    Submitted 12 June, 2020; originally announced June 2020.

  38. arXiv:2003.08624  [pdf, other]

    cs.CV cs.CG cs.GR

    PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions

    Authors: Kaichun Mo, He Wang, Xinchen Yan, Leonidas J. Guibas

    Abstract: 3D generative shape modeling is a fundamental research area in computer vision and interactive computer graphics, with many real-world applications. This paper investigates the novel problem of generating 3D shape point cloud geometry from a symbolic part tree representation. In order to learn such a conditional shape generation procedure in an end-to-end fashion, we propose a conditional GAN "par…

    Submitted 15 July, 2020; v1 submitted 19 March, 2020; originally announced March 2020.

    Comments: ECCV 2020

  39. arXiv:2003.08593  [pdf, other]

    cs.CV

    Curriculum DeepSDF

    Authors: Yueqi Duan, Haidong Zhu, He Wang, Li Yi, Ram Nevatia, Leonidas J. Guibas

    Abstract: When learning to sketch, beginners start with simple and flexible shapes, and then gradually strive for more complex and accurate ones in the subsequent training sessions. In this paper, we design a "shape curriculum" for learning continuous Signed Distance Function (SDF) on shapes, namely Curriculum DeepSDF. Inspired by how humans learn, Curriculum DeepSDF organizes the learning task in ascending…

    Submitted 16 July, 2020; v1 submitted 19 March, 2020; originally announced March 2020.

    Comments: ECCV 2020

  40. arXiv:2003.08515  [pdf, other]

    cs.CV cs.RO

    SAPIEN: A SimulAted Part-based Interactive ENvironment

    Authors: Fanbo Xiang, Yuzhe Qin, Kaichun Mo, Yikuan Xia, Hao Zhu, Fangchen Liu, Minghua Liu, Hanxiao Jiang, Yifu Yuan, He Wang, Li Yi, Angel X. Chang, Leonidas J. Guibas, Hao Su

    Abstract: Building home assistant robots has long been a pursuit for vision and robotics researchers. To achieve this task, a simulated environment with physically realistic simulation, sufficient articulated objects, and transferability to the real robot is indispensable. Existing environments achieve these requirements for robotics simulation with different levels of simplification and focus. We take one…

    Submitted 18 March, 2020; originally announced March 2020.

  41. arXiv:2001.10692  [pdf, other]

    cs.CV

    ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes

    Authors: Charles R. Qi, Xinlei Chen, Or Litany, Leonidas J. Guibas

    Abstract: 3D object detection has seen quick progress thanks to advances in deep learning on point clouds. A few recent works have even shown state-of-the-art performance with just point clouds input (e.g. VoteNet). However, point cloud data have inherent limitations. They are sparse, lack color information and often suffer from sensor noise. Images, on the other hand, have high resolution and rich texture.…

    Submitted 29 January, 2020; originally announced January 2020.

  42. arXiv:2001.06291  [pdf, other]

    cs.CV cs.LG

    Predicting the Physical Dynamics of Unseen 3D Objects

    Authors: Davis Rempe, Srinath Sridhar, He Wang, Leonidas J. Guibas

    Abstract: Machines that can predict the effect of physical interactions on the dynamics of previously unseen object instances are important for creating better robots and interactive virtual worlds. In this work, we focus on predicting the dynamics of 3D objects on a plane that have just been subjected to an impulsive force. In particular, we predict the changes in state - 3D position, rotation, velocities,…

    Submitted 16 January, 2020; originally announced January 2020.

    Comments: In Proceedings of Winter Conference on Applications of Computer Vision (WACV) 2020. arXiv admin note: text overlap with arXiv:1901.00466

  43. arXiv:2001.05119  [pdf, other]

    cs.CV cs.LG

    Learning multiview 3D point cloud registration

    Authors: Zan Gojcic, Caifa Zhou, Jan D. Wegner, Leonidas J. Guibas, Tolga Birdal

    Abstract: We present a novel, end-to-end learnable, multiview 3D point cloud registration algorithm. Registration of multiple scans typically follows a two-stage pipeline: the initial pairwise alignment and the globally consistent refinement. The former is often ambiguous due to the low overlap of neighboring point clouds, symmetries and repetitive scene parts. Therefore, the latter global refinement aims a…

    Submitted 31 March, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: CVPR2020 - Camera Ready

  44. arXiv:1911.11098  [pdf, other]

    cs.CV cs.CG cs.GR

    StructEdit: Learning Structural Shape Variations

    Authors: Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas

    Abstract: Learning to encode differences in the geometry and (topological) structure of the shapes of ordinary objects is key to generating semantically plausible variations of a given shape, transferring edits from one shape to another, and many other applications in 3D content creation. The common approach of encoding shapes as points in a high-dimensional latent feature space suggests treating shape diff…

    Submitted 25 November, 2019; originally announced November 2019.

  45. arXiv:1908.09073  [pdf, other]

    cs.CV

    Situational Fusion of Visual Representation for Visual Navigation

    Authors: Bokui Shen, Danfei Xu, Yuke Zhu, Leonidas J. Guibas, Li Fei-Fei, Silvio Savarese

    Abstract: A complex visual navigation task puts an agent in different situations which call for a diverse range of visual perception abilities. For example, to "go to the nearest chair", the agent might need to identify a chair in a living room using semantics, follow along a hallway using vanishing point cues, and avoid obstacles using depth. Therefore, utilizing the appropriate visual perception abilities…

    Submitted 3 August, 2021; v1 submitted 23 August, 2019; originally announced August 2019.

  46. arXiv:1908.00575  [pdf, other]

    cs.GR cs.CG cs.CV

    StructureNet: Hierarchical Graph Networks for 3D Shape Generation

    Authors: Kaichun Mo, Paul Guerrero, Li Yi, Hao Su, Peter Wonka, Niloy Mitra, Leonidas J. Guibas

    Abstract: The ability to generate novel, diverse, and realistic 3D shapes along with associated part semantics and structure is central to many applications requiring high-quality 3D assets or large volumes of realistic training data. A key challenge towards this goal is how to accommodate diverse shape variations, including both continuous deformations of parts as well as structural or discrete alterations…

    Submitted 1 August, 2019; originally announced August 2019.

    Comments: Conditionally Accepted to SIGGRAPH Asia 2019

  47. arXiv:1907.01085  [pdf, other]

    cs.CV cs.LG eess.IV

    Multiview Aggregation for Learning Category-Specific Shape Reconstruction

    Authors: Srinath Sridhar, Davis Rempe, Julien Valentin, Sofien Bouaziz, Leonidas J. Guibas

    Abstract: We investigate the problem of learning category-specific 3D shape reconstruction from a variable number of RGB views of previously unobserved object instances. Most approaches for multiview shape reconstruction operate on sparse shape representations, or assume a fixed number of views. We present a method that can estimate dense 3D shape, and aggregate shape across multiple and varying number of i…

    Submitted 8 December, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

    Comments: In Proceedings of Advances in Neural Information Processing Systems (NeurIPS) 2019

  48. arXiv:1905.12200  [pdf, other]

    cs.LG math.AT stat.ML

    A Topology Layer for Machine Learning

    Authors: Rickard Brüel-Gabrielsson, Bradley J. Nelson, Anjan Dwaraknath, Primoz Skraba, Leonidas J. Guibas, Gunnar Carlsson

    Abstract: Topology applied to real world data using persistent homology has started to find applications within machine learning, including deep learning. We present a differentiable topology layer that computes persistent homology based on level set filtrations and edge-based filtrations. We present three novel applications: the topological layer can (i) regularize data reconstruction or the weights of mac…

    Submitted 24 April, 2020; v1 submitted 28 May, 2019; originally announced May 2019.

  49. arXiv:1905.02925  [pdf, other]

    cs.CL cs.CV

    ShapeGlot: Learning Language for Shape Differentiation

    Authors: Panos Achlioptas, Judy Fan, Robert X. D. Hawkins, Noah D. Goodman, Leonidas J. Guibas

    Abstract: In this work we explore how fine-grained differences between the shapes of common objects are expressed in language, grounded on images and 3D models of the objects. We first build a large scale, carefully controlled dataset of human utterances that each refers to a 2D rendering of a 3D CAD model so as to distinguish it from a set of shape-wise similar alternatives. Using this dataset, we develop…

    Submitted 8 May, 2019; originally announced May 2019.

  50. arXiv:1904.09664  [pdf, other]

    cs.CV

    Deep Hough Voting for 3D Object Detection in Point Clouds

    Authors: Charles R. Qi, Or Litany, Kaiming He, Leonidas J. Guibas

    Abstract: Current 3D object detection methods are heavily influenced by 2D detectors. In order to leverage architectures in 2D detectors, they often convert 3D point clouds to regular grids (i.e., to voxel grids or to bird's eye view images), or rely on detection in 2D images to propose 3D boxes. Few works have attempted to directly detect objects in point clouds. In this work, we return to first principles…

    Submitted 22 August, 2019; v1 submitted 21 April, 2019; originally announced April 2019.

    Comments: ICCV 2019