Showing 1–50 of 50 results for author: Birdal, T

  1. NeRF-Feat: 6D Object Pose Estimation using Feature Rendering

    Authors: Shishir Reddy Vutukur, Heike Brock, Benjamin Busam, Tolga Birdal, Andreas Hutter, Slobodan Ilic

    Abstract: Object pose estimation is a crucial component in robotic grasping and augmented reality. Learning-based approaches typically require training data from a highly accurate CAD model or labeled training data acquired using a complex setup. We address this by learning to estimate pose from weakly labeled data without a known CAD model. We propose to use a NeRF to learn object shape implicitly which is…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 3DV 2024

    Journal ref: 3DV 2024

  2. arXiv:2405.14094  [pdf, other]

    cs.LG cs.AI cs.CV math.AT stat.ML

    Attending to Topological Spaces: The Cellular Transformer

    Authors: Rubén Ballester, Pablo Hernández-García, Mathilde Papillon, Claudio Battiloro, Nina Miolane, Tolga Birdal, Carles Casacuberta, Sergio Escalera, Mustafa Hajij

    Abstract: Topological Deep Learning seeks to enhance the predictive performance of neural network models by harnessing topological structures in input data. Topological neural networks operate on spaces such as cell complexes and hypergraphs, that can be seen as generalizations of graphs. In this work, we introduce the Cellular Transformer (CT), a novel architecture that generalizes graph-based transformers…

    Submitted 26 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  3. arXiv:2403.03122  [pdf, other]

    cs.CV

    NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors

    Authors: Yannan He, Garvita Tiwari, Tolga Birdal, Jan Eric Lenssen, Gerard Pons-Moll

    Abstract: Faithfully modeling the space of articulations is a crucial task that allows recovery and generation of realistic poses, and remains a notorious challenge. To this end, we introduce Neural Riemannian Distance Fields (NRDFs), data-driven priors modeling the space of plausible articulations, represented as the zero-level-set of a neural field in a high-dimensional product-quaternion space. To train…

    Submitted 11 April, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024. Project page: https://virtualhumans.mpi-inf.mpg.de/nrdf

  4. arXiv:2403.00372  [pdf, other]

    cs.CV

    HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation

    Authors: Zhiying Leng, Tolga Birdal, Xiaohui Liang, Federico Tombari

    Abstract: 3D shape generation from text is a fundamental task in 3D representation learning. The text-shape pairs exhibit a hierarchical structure, where a general text like "chair" covers all 3D shapes of the chair, while more detailed prompts refer to more specific shapes. Furthermore, both text and 3D shapes are inherently hierarchical structures. However, existing Text2Shape methods, such as SDFusion,…

    Submitted 30 April, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Journal ref: IEEE/CVF conference on computer vision and pattern recognition 2024

  5. arXiv:2402.08871  [pdf, other]

    cs.LG stat.ML

    Position: Topological Deep Learning is the New Frontier for Relational Learning

    Authors: Theodore Papamarkou, Tolga Birdal, Michael Bronstein, Gunnar Carlsson, Justin Curry, Yue Gao, Mustafa Hajij, Roland Kwitt, Pietro Liò, Paolo Di Lorenzo, Vasileios Maroulas, Nina Miolane, Farzana Nasrin, Karthikeyan Natesan Ramamurthy, Bastian Rieck, Simone Scardapane, Michael T. Schaub, Petar Veličković, Bei Wang, Yusu Wang, Guo-Wei Wei, Ghada Zamzmi

    Abstract: Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning setting…

    Submitted 30 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  6. arXiv:2402.02441  [pdf, other]

    cs.LG cs.AI cs.MS stat.CO

    TopoX: A Suite of Python Packages for Machine Learning on Topological Domains

    Authors: Mustafa Hajij, Mathilde Papillon, Florian Frantzen, Jens Agerberg, Ibrahem AlJabea, Ruben Ballester, Claudio Battiloro, Guillermo Bernárdez, Tolga Birdal, Aiden Brent, Peter Chin, Sergio Escalera, Simone Fiorellino, Odin Hoff Gardaa, Gurusankar Gopalakrishnan, Devendra Govil, Josef Hoppe, Maneel Reddy Karri, Jude Khouja, Manuel Lecha, Neal Livesay, Jan Meißner, Soham Mukherjee, Alexander Nikitin, Theodore Papamarkou , et al. (18 additional authors not shown)

    Abstract: We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order…

    Submitted 17 February, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  7. arXiv:2401.04071  [pdf, other]

    cs.CV cs.LG math.DG math.OC stat.ML

    Fun with Flags: Robust Principal Directions via Flag Manifolds

    Authors: Nathan Mankovich, Gustau Camps-Valls, Tolga Birdal

    Abstract: Principal component analysis (PCA), along with its extensions to manifolds and outlier-contaminated data, has been indispensable in computer vision and machine learning. In this work, we present a unifying formalism for PCA and its variants, and introduce a framework based on the flags of linear subspaces, i.e., a hierarchy of nested linear subspaces of increasing dimension, which not only allows fo…

    Submitted 11 June, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

  8. arXiv:2312.09504  [pdf, other]

    cs.LG cs.SI math.AT math.CO stat.ML

    Combinatorial Complexes: Bridging the Gap Between Cell Complexes and Hypergraphs

    Authors: Mustafa Hajij, Ghada Zamzmi, Theodore Papamarkou, Aldo Guzmán-Sáenz, Tolga Birdal, Michael T. Schaub

    Abstract: Graph-based signal processing techniques have become essential for handling data in non-Euclidean spaces. However, there is a growing awareness that these graph models might need to be expanded into 'higher-order' domains to effectively represent the complex relations found in high-dimensional data. Such higher-order domains are typically modeled either as hypergraphs, or as simplicial, cubical or…

    Submitted 14 December, 2023; originally announced December 2023.

    Journal ref: 57th Asilomar Conference on Signals, Systems, and Computers, 2023

  9. arXiv:2310.20436  [pdf, other]

    cs.CV

    SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark

    Authors: Zhengdi Yu, Shaoli Huang, Yongkang Cheng, Tolga Birdal

    Abstract: We present SignAvatars, the first large-scale, multi-prompt 3D sign language (SL) motion dataset designed to bridge the communication gap for Deaf and hard-of-hearing individuals. While there has been an exponentially growing body of research on digital communication, the majority of existing communication technologies primarily cater to spoken or written languages, instead of SL, the ess…

    Submitted 2 July, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: ECCV2024 14 pages; Project page available at https://signavatars.github.io/

  10. arXiv:2310.17638  [pdf, other]

    cs.LG stat.ML

    Generative Fractional Diffusion Models

    Authors: Gabriel Nobis, Maximilian Springenberg, Marco Aversa, Michael Detzel, Rembert Daems, Roderick Murray-Smith, Shinichi Nakajima, Sebastian Lapuschkin, Stefano Ermon, Tolga Birdal, Manfred Opper, Christoph Knochenhauer, Luis Oala, Wojciech Samek

    Abstract: We introduce the first continuous-time score-based generative model that leverages fractional diffusion processes for its underlying dynamics. Although diffusion models have excelled at capturing data distributions, they still suffer from various limitations such as slow convergence, mode-collapse on imbalanced data, and lack of diversity. These issues are partially linked to the use of light-tail…

    Submitted 24 June, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

    ACM Class: I.2.4; F.4.1; G.3

  11. arXiv:2310.15128  [pdf, other]

    cs.CV cs.LG quant-ph

    Projected Stochastic Gradient Descent with Quantum Annealed Binary Gradients

    Authors: Maximilian Krahn, Michelle Sasdelli, Fengyi Yang, Vladislav Golyanik, Juho Kannala, Tat-Jun Chin, Tolga Birdal

    Abstract: We present QP-SBGD, a novel layer-wise stochastic optimiser tailored towards training neural networks with binary weights, known as binary neural networks (BNNs), on quantum hardware. BNNs reduce the computational requirements and energy consumption of deep learning models with minimal loss in accuracy. However, training them in practice remains an open challenge. Most known BNN-optimisers…

    Submitted 23 October, 2023; originally announced October 2023.

  12. arXiv:2310.12975  [pdf, other]

    cs.LG cs.AI cs.CV stat.AP stat.ML

    Variational Inference for SDEs Driven by Fractional Noise

    Authors: Rembert Daems, Manfred Opper, Guillaume Crevecoeur, Tolga Birdal

    Abstract: We present a novel variational framework for performing inference in (neural) stochastic differential equations (SDEs) driven by Markov-approximate fractional Brownian motion (fBM). SDEs offer a versatile tool for modeling real-world continuous-time dynamic systems with inherent noise and randomness. Combining SDEs with the powerful inference capabilities of variational methods enables the learni…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 24 pages, under review

  13. arXiv:2310.12153  [pdf, other]

    cs.LG cs.AI cs.CV

    Probabilistic Sampling of Balanced K-Means using Adiabatic Quantum Computing

    Authors: Jan-Nico Zaech, Martin Danelljan, Tolga Birdal, Luc Van Gool

    Abstract: Adiabatic quantum computing (AQC) is a promising approach for discrete and often NP-hard optimization problems. Current AQCs allow implementing problems of research interest, which has sparked the development of quantum representations for many computer vision tasks. Despite requiring multiple measurements from the noisy AQC, current approaches only utilize the best measurement, discarding informa…

    Submitted 1 May, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted at CVPR 2024

  14. ICML 2023 Topological Deep Learning Challenge : Design and Results

    Authors: Mathilde Papillon, Mustafa Hajij, Helen Jenne, Johan Mathe, Audun Myers, Theodore Papamarkou, Tolga Birdal, Tamal Dey, Tim Doster, Tegan Emerson, Gurusankar Gopalakrishnan, Devendra Govil, Aldo Guzmán-Sáenz, Henry Kvinge, Neal Livesay, Soham Mukherjee, Shreyas N. Samaga, Karthikeyan Natesan Ramamurthy, Maneel Reddy Karri, Paul Rosen, Sophia Sanborn, Robin Walters, Jens Agerberg, Sadrodin Barikbin, Claudio Battiloro , et al. (31 additional authors not shown)

    Abstract: This paper presents the computational challenge on topological deep learning that was hosted within the ICML 2023 Workshop on Topology and Geometry in Machine Learning. The competition asked participants to provide open-source implementations of topological neural networks from the literature by contributing to the Python packages TopoNetX (data processing) and TopoModelX (deep learning). The chal…

    Submitted 18 January, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

  15. arXiv:2304.06020  [pdf, other]

    cs.CV

    VidStyleODE: Disentangled Video Editing via StyleGAN and NeuralODEs

    Authors: Moayed Haji Ali, Andrew Bond, Tolga Birdal, Duygu Ceylan, Levent Karacan, Erkut Erdem, Aykut Erdem

    Abstract: We propose VidStyleODE, a spatiotemporally continuous disentangled video representation based upon StyleGAN and Neural-ODEs. Effective traversal of the latent space learned by Generative Adversarial Networks (GANs) has been the basis for recent breakthroughs in image editing. However, the applicability of such advancements to the video domain has been hi…

    Submitted 12 April, 2023; originally announced April 2023.

    Journal ref: ICCV 2023

  16. arXiv:2303.13501  [pdf, other]

    cs.CV cs.LG math.DG math.OC stat.ML

    Chordal Averaging on Flag Manifolds and Its Applications

    Authors: Nathan Mankovich, Tolga Birdal

    Abstract: This paper presents a new, provably-convergent algorithm for computing the flag-mean and flag-median of a set of points on a flag manifold under the chordal metric. The flag manifold is a mathematical space consisting of flags, which are sequences of nested subspaces of a vector space that increase in dimension. The flag manifold is a superset of a wide range of known matrix spaces, including Stie…

    Submitted 17 July, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: Appears at ICCV 2023

  17. arXiv:2211.02980  [pdf, other]

    cs.CV

    Disentangling Content and Motion for Text-Based Neural Video Manipulation

    Authors: Levent Karacan, Tolga Kerimoğlu, İsmail İnan, Tolga Birdal, Erkut Erdem, Aykut Erdem

    Abstract: Giving machines the ability to imagine possible new objects or scenes from linguistic descriptions and produce their realistic renderings is arguably one of the most challenging problems in computer vision. Recent advances in deep generative models have led to new approaches that give promising results towards this goal. In this paper, we introduce a new method called DiCoMoGAN for manipulating vi…

    Submitted 5 November, 2022; originally announced November 2022.

  18. arXiv:2207.06333  [pdf, other]

    cs.CV

    6D Camera Relocalization in Visually Ambiguous Extreme Environments

    Authors: Yang Zheng, Tolga Birdal, Fei Xia, Yanchao Yang, Yueqi Duan, Leonidas J. Guibas

    Abstract: We propose a novel method to reliably estimate the pose of a camera given a sequence of images acquired in extreme environments such as deep seas or extraterrestrial terrains. Data acquired under these challenging conditions are corrupted by textureless surfaces, image degradation, and presence of repetitive and highly ambiguous structures. When naively deployed, the state-of-the-art methods can f…

    Submitted 13 July, 2022; originally announced July 2022.

  19. arXiv:2206.00606  [pdf, other]

    cs.LG cs.CV cs.SI math.AT stat.ML

    Topological Deep Learning: Going Beyond Graph Data

    Authors: Mustafa Hajij, Ghada Zamzmi, Theodore Papamarkou, Nina Miolane, Aldo Guzmán-Sáenz, Karthikeyan Natesan Ramamurthy, Tolga Birdal, Tamal K. Dey, Soham Mukherjee, Shreyas N. Samaga, Neal Livesay, Robin Walters, Paul Rosen, Michael T. Schaub

    Abstract: Topological deep learning is a rapidly growing field that pertains to the development of deep learning models for data supported on topological domains such as simplicial complexes, cell complexes, and hypergraphs, which generalize many domains encountered in scientific computations. In this paper, we present a unifying deep learning framework built upon a richer data structure that includes widel…

    Submitted 19 May, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

  20. arXiv:2203.12633  [pdf, other]

    cs.CV cs.LG math.OC

    Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization

    Authors: Alp Yurtsever, Tolga Birdal, Vladislav Golyanik

    Abstract: We present a hybrid classical-quantum framework based on the Frank-Wolfe algorithm, Q-FW, for solving quadratic, linearly-constrained, binary optimization problems on quantum annealers (QA). The computational premise of quantum computers has cultivated the re-design of various existing vision problems into quantum-friendly forms. Experimental QA realizations can solve a particular non-convex probl…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: 26 pages with supplementary material

  21. arXiv:2112.09329  [pdf, other]

    cs.CV

    Point2Cyl: Reverse Engineering 3D Objects from Point Clouds to Extrusion Cylinders

    Authors: Mikaela Angelina Uy, Yen-yu Chang, Minhyuk Sung, Purvi Goel, Joseph Lambourne, Tolga Birdal, Leonidas Guibas

    Abstract: We propose Point2Cyl, a supervised network transforming a raw 3D point cloud to a set of extrusion cylinders. Reverse engineering from a raw geometry to a CAD model is an essential task to enable manipulation of the 3D data in shape editing software and thus expand their usages in many downstream applications. Particularly, the form of CAD models having a sequence of extrusion cylinders -- a 2D sk…

    Submitted 29 May, 2022; v1 submitted 17 December, 2021; originally announced December 2021.

    Comments: CVPR 2022

  22. arXiv:2111.14762  [pdf, other]

    cs.CV cs.GR

    Riemannian Functional Map Synchronization for Probabilistic Partial Correspondence in Shape Networks

    Authors: Faria Huq, Adrish Dey, Sahra Yusuf, Dena Bazazian, Tolga Birdal, Nina Miolane

    Abstract: We consider the problem of graph-matching on a network of 3D shapes with uncertainty quantification. We assume that the pairwise shape correspondences are efficiently represented as functional maps, that match real-valued functions defined over pairs of shapes. By modeling functional maps between nearly isometric shapes as elements of the Lie group $SO(n)$, we employ synchronization…

    Submitted 3 January, 2023; v1 submitted 29 November, 2021; originally announced November 2021.

    Comments: 16 pages

  23. arXiv:2111.13171  [pdf, other]

    cs.LG cs.AI cs.CV math.GN stat.ML

    Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

    Authors: Tolga Birdal, Aaron Lou, Leonidas Guibas, Umut Şimşekli

    Abstract: Disobeying the classical wisdom of statistical learning theory, modern deep neural networks generalize well even though they typically contain millions of parameters. Recently, it has been shown that the trajectories of iterative optimization algorithms can possess fractal structures, and their generalization error can be formally linked to the complexity of such fractals. This complexity is measu…

    Submitted 25 November, 2021; originally announced November 2021.

    Comments: Appears at NeurIPS 2021

  24. Multiway Non-rigid Point Cloud Registration via Learned Functional Map Synchronization

    Authors: Jiahui Huang, Tolga Birdal, Zan Gojcic, Leonidas J. Guibas, Shi-Min Hu

    Abstract: We present SyNoRiM, a novel way to jointly register multiple non-rigid shapes by synchronizing the maps relating learned functions defined on the point clouds. Even though the ability to process non-rigid shapes is critical in various applications ranging from computer animation to 3D digitization, the literature still lacks a robust and flexible framework to match and align a collection of real,…

    Submitted 1 April, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 2022

  25. arXiv:2110.11657  [pdf, other]

    cs.CV

    Projective Manifold Gradient Layer for Deep Rotation Regression

    Authors: Jiayi Chen, Yingda Yin, Tolga Birdal, Baoquan Chen, Leonidas Guibas, He Wang

    Abstract: Regressing rotations on the SO(3) manifold using deep neural networks is an important yet unsolved problem. The gap between the Euclidean network output space and the non-Euclidean SO(3) manifold imposes a severe challenge for neural network learning in both forward and backward passes. While several works have proposed different regression-friendly rotation representations, very few works have been d…

    Submitted 29 March, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

    Comments: CVPR2022

  26. arXiv:2105.04668  [pdf, other]

    cs.CV cs.LG

    HuMoR: 3D Human Motion Model for Robust Pose Estimation

    Authors: Davis Rempe, Tolga Birdal, Aaron Hertzmann, Jimei Yang, Srinath Sridhar, Leonidas J. Guibas

    Abstract: We introduce HuMoR: a 3D Human Motion Model for Robust Estimation of temporal pose and shape. Though substantial progress has been made in estimating 3D human motion and shape from dynamic observations, recovering plausible pose sequences in the presence of noise and occlusions remains a challenge. For this purpose, we propose an expressive generative model in the form of a conditional variational…

    Submitted 18 August, 2021; v1 submitted 10 May, 2021; originally announced May 2021.

    Comments: ICCV 2021 camera ready

  27. arXiv:2102.08945  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Weakly Supervised Learning of Rigid 3D Scene Flow

    Authors: Zan Gojcic, Or Litany, Andreas Wieser, Leonidas J. Guibas, Tolga Birdal

    Abstract: We propose a data-driven scene flow estimation algorithm exploiting the observation that many 3D scenes can be explained by a collection of agents moving as rigid bodies. At the core of our method lies a deep architecture able to reason at the object-level by considering 3D scene flow in conjunction with other 3D tasks. This object-level abstraction enables us to relax the requirement fo…

    Submitted 17 February, 2021; originally announced February 2021.

  28. arXiv:2101.07755  [pdf, other]

    quant-ph cs.CV cs.ET cs.LG cs.RO

    Quantum Permutation Synchronization

    Authors: Tolga Birdal, Vladislav Golyanik, Christian Theobalt, Leonidas Guibas

    Abstract: We present QuantumSync, the first quantum algorithm for solving a synchronization problem in the context of computer vision. In particular, we focus on permutation synchronization which involves solving a non-convex optimization problem in discrete variables. We start by formulating synchronization into a quadratic unconstrained binary optimization problem (QUBO). While such formulation respects t…

    Submitted 26 November, 2021; v1 submitted 19 January, 2021; originally announced January 2021.

    Comments: 19 pages, 15 figures, 4 tables; web pages: https://vcai.mpi-inf.mpg.de/projects/QUANTUMSYNC/, https://quantumcomputervision.github.io/

    Journal ref: Computer Vision and Pattern Recognition (CVPR) 2021

  29. arXiv:2101.06605  [pdf, other]

    cs.CV cs.LG

    MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

    Authors: Jiahui Huang, He Wang, Tolga Birdal, Minhyuk Sung, Federica Arrigoni, Shi-Min Hu, Leonidas Guibas

    Abstract: We present MultiBodySync, a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for multiple input 3D point clouds. The two non-trivial challenges posed by this multi-scan multibody setting that we investigate are: (i) guaranteeing correspondence and segmentation consistency across multiple input point clouds capturing different spatial arrangements of bodie…

    Submitted 28 March, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

    Comments: Contact: huang-jh18<at>mails<dot>tsinghua<dot>edu<dot>cn

  30. arXiv:2012.11002  [pdf, other]

    cs.CV

    Deep Bingham Networks: Dealing with Uncertainty and Ambiguity in Pose Estimation

    Authors: Haowen Deng, Mai Bui, Nassir Navab, Leonidas Guibas, Slobodan Ilic, Tolga Birdal

    Abstract: In this work, we introduce Deep Bingham Networks (DBN), a generic framework that can naturally handle pose-related uncertainties and ambiguities arising in almost all real life applications concerning 3D data. While existing works strive to find a single solution to the pose estimation problem, we make peace with the ambiguities causing high uncertainty around which solutions to identify as the be…

    Submitted 20 December, 2020; originally announced December 2020.

    Comments: arXiv admin note: text overlap with arXiv:2004.04807

  31. arXiv:2008.02792  [pdf, other]

    cs.CV cs.LG

    CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations

    Authors: Davis Rempe, Tolga Birdal, Yongheng Zhao, Zan Gojcic, Srinath Sridhar, Leonidas J. Guibas

    Abstract: We propose CaSPR, a method to learn object-centric Canonical Spatiotemporal Point Cloud Representations of dynamically moving or evolving objects. Our goal is to enable information aggregation over time and the interrogation of object state at any spatiotemporal neighborhood in the past, observed or not. Different from previous work, CaSPR learns representations that support spacetime continuity,…

    Submitted 11 November, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: NeurIPS 2020

  32. arXiv:2004.04807  [pdf, other]

    cs.CV cs.LG cs.RO eess.IV

    6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference

    Authors: Mai Bui, Tolga Birdal, Haowen Deng, Shadi Albarqouni, Leonidas Guibas, Slobodan Ilic, Nassir Navab

    Abstract: We present a multimodal camera relocalization framework that captures ambiguities and uncertainties with continuous mixture models defined on the manifold of camera poses. In highly ambiguous environments, which can easily arise due to symmetries and repetitive structures in the scene, computing one plausible solution (what most state-of-the-art methods currently regress) may not be sufficient. In…

    Submitted 16 July, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: Accepted for publication at ECCV 2020. Project page under https://multimodal3dvision.github.io

  33. arXiv:2004.01228  [pdf, other]

    cs.CV cs.GR cs.LG eess.IV

    Deformation-Aware 3D Model Embedding and Retrieval

    Authors: Mikaela Angelina Uy, Jingwei Huang, Minhyuk Sung, Tolga Birdal, Leonidas Guibas

    Abstract: We introduce a new problem of retrieving 3D models that are deformable to a given query shape and present a novel deep deformation-aware embedding to solve this retrieval task. 3D model retrieval is a fundamental operation for recovering a clean and complete 3D model from a noisy and partial 3D scan. However, given a finite collection of 3D shapes, even the closest model to a query may not be sati…

    Submitted 31 July, 2020; v1 submitted 2 April, 2020; originally announced April 2020.

    Comments: Accepted for publication at ECCV 2020. Project page under https://deformscan2cad.github.io

  34. arXiv:2004.00663  [pdf, other]

    cs.CV cs.GR cs.LG cs.RO stat.ML

    Synchronizing Probability Measures on Rotations via Optimal Transport

    Authors: Tolga Birdal, Michael Arbel, Umut Şimşekli, Leonidas Guibas

    Abstract: We introduce a new paradigm, measure synchronization, for synchronizing graphs with measure-valued edges. We formulate this problem as maximization of the cycle-consistency in the space of probability measures over relative rotations. In particular, we aim at estimating marginal distributions of absolute orientations by synchronizing the conditional ones, which are defined on…

    Submitted 1 April, 2020; originally announced April 2020.

    Comments: Accepted for publication at CVPR 2020, includes supplementary material. Project website: https://github.com/SynchInVision/probsync

  35. arXiv:2002.02506  [pdf, other]

    cs.CV cs.CG

    Continuous Geodesic Convolutions for Learning on 3D Shapes

    Authors: Zhangsihao Yang, Or Litany, Tolga Birdal, Srinath Sridhar, Leonidas Guibas

    Abstract: The majority of descriptor-based methods for geometric processing of non-rigid shapes rely on hand-crafted descriptors. Recently, learning-based techniques have been shown to be effective, achieving state-of-the-art results in a variety of tasks. Yet, even though these methods can in principle work directly on raw data, most methods still rely on hand-crafted descriptors at the input layer. In this work,…

    Submitted 6 February, 2020; originally announced February 2020.

  36. From Planes to Corners: Multi-Purpose Primitive Detection in Unorganized 3D Point Clouds

    Authors: Christiane Sommer, Yumin Sun, Leonidas Guibas, Daniel Cremers, Tolga Birdal

    Abstract: We propose a new method for segmentation-free joint estimation of orthogonal planes, their intersection lines, relationship graph and corners lying at the intersection of three orthogonal planes. Such unified scene exploration under orthogonality allows for multitudes of applications such as semantic plane detection or local and global scan alignment, which in turn can aid robot localization or gr…

    Submitted 24 April, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

    Comments: Accepted to IEEE Robotics and Automation Letters 2020 | Video: https://youtu.be/nHWJrA6RcB0 | Code: https://github.com/c-sommer/orthogonal-planes

    Journal ref: IEEE Robotics and Automation Letters 5(2) 2020, 1764-1771

  37. arXiv:2001.05119  [pdf, other]

    cs.CV cs.LG

    Learning multiview 3D point cloud registration

    Authors: Zan Gojcic, Caifa Zhou, Jan D. Wegner, Leonidas J. Guibas, Tolga Birdal

    Abstract: We present a novel, end-to-end learnable, multiview 3D point cloud registration algorithm. Registration of multiple scans typically follows a two-stage pipeline: the initial pairwise alignment and the globally consistent refinement. The former is often ambiguous due to the low overlap of neighboring point clouds, symmetries and repetitive scene parts. Therefore, the latter global refinement aims a…

    Submitted 31 March, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: CVPR2020 - Camera Ready

  38. arXiv:1912.12098  [pdf, other]

    cs.LG cs.CV cs.GR cs.RO stat.ML

    Quaternion Equivariant Capsule Networks for 3D Point Clouds

    Authors: Yongheng Zhao, Tolga Birdal, Jan Eric Lenssen, Emanuele Menegatti, Leonidas Guibas, Federico Tombari

    Abstract: We present a 3D capsule module for processing point clouds that is equivariant to 3D rotations and translations, as well as invariant to permutations of the input points. The operator receives a sparse set of local reference frames, computed from an input point cloud and establishes end-to-end transformation equivariance through a novel dynamic routing procedure on quaternions. Further, we theoret…

    Submitted 23 August, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

    Comments: Oral Presentation at ECCV 2020. Find our video under: https://youtu.be/LHh56snwhTA. We release our sources at: http://tolgabirdal.github.io/qecnetworks

  39. arXiv:1904.05814  [pdf, other]

    cs.CV cs.GR cs.LG cs.RO math.NA

    Probabilistic Permutation Synchronization using the Riemannian Structure of the Birkhoff Polytope

    Authors: Tolga Birdal, Umut Şimşekli

    Abstract: We present an entirely new geometric and probabilistic approach to synchronization of correspondences across multiple sets of objects or images. In particular, we present two algorithms: (1) Birkhoff-Riemannian L-BFGS for optimizing the relaxed version of the combinatorially intractable cycle consistency loss in a principled manner, (2) Birkhoff-Riemannian Langevin Monte Carlo for generating sampl…

    Submitted 11 April, 2019; originally announced April 2019.

    Comments: To appear as oral presentation at CVPR 2019. 20 pages including the supplementary material

  40. arXiv:1904.04281  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    3D Local Features for Direct Pairwise Registration

    Authors: Haowen Deng, Tolga Birdal, Slobodan Ilic

    Abstract: We present a novel, data-driven approach for solving the problem of registration of two point cloud scans. Our approach is direct in the sense that a single pair of corresponding local patches already provides the necessary transformation cue for the global registration. To achieve that, we first endow the state-of-the-art PPF-FoldNet auto-encoder (AE) with a pose-variant sibling, where the discre…

    Submitted 8 April, 2019; originally announced April 2019.

    Comments: To appear in CVPR 2019. 16 pages, identical to the camera ready submission

  41. arXiv:1901.01255  [pdf, other]

    cs.CV cs.CG cs.GR cs.RO

    Generic Primitive Detection in Point Clouds Using Novel Minimal Quadric Fits

    Authors: Tolga Birdal, Benjamin Busam, Nassir Navab, Slobodan Ilic, Peter Sturm

    Abstract: We present a novel and effective method for detecting 3D primitives in cluttered, unorganized point clouds, without auxiliary segmentation or type specification. We consider the quadric surfaces for encapsulating the basic building blocks of our environments - planes, spheres, ellipsoids, cones or cylinders, in a unified fashion. Moreover, quadrics allow us to model higher degree of freedom shapes,…

    Submitted 4 January, 2019; originally announced January 2019.

    Comments: Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI). arXiv admin note: substantial text overlap with arXiv:1803.07191

  42. arXiv:1812.10775  [pdf, other]

    cs.CV cs.LG cs.NE

    3D Point Capsule Networks

    Authors: Yongheng Zhao, Tolga Birdal, Haowen Deng, Federico Tombari

    Abstract: In this paper, we propose 3D point-capsule networks, an auto-encoder designed to process sparse 3D point clouds while preserving spatial arrangements of the input data. 3D capsule networks arise as a direct consequence of our novel unified 3D auto-encoder formulation. Their dynamic routing scheme and the peculiar 2D latent space deployed by our approach bring in improvements for several common poi…

    Submitted 11 July, 2019; v1 submitted 27 December, 2018; originally announced December 2018.

    Comments: As published in CVPR 2019 (camera ready version), with supplementary material

  43. arXiv:1812.00287  [pdf, other]

    cs.CV

    Explaining the Ambiguity of Object Detection and 6D Pose From Visual Data

    Authors: Fabian Manhardt, Diego Martin Arroyo, Christian Rupprecht, Benjamin Busam, Tolga Birdal, Nassir Navab, Federico Tombari

    Abstract: 3D object detection and pose estimation from a single image are two inherently ambiguous problems. Oftentimes, objects appear similar from different viewpoints due to shape symmetries, occlusion and repetitive textures. This ambiguity in both detection and pose estimation means that an object instance can be perfectly described by several different poses and even classes. In this work we propose t…

    Submitted 20 August, 2019; v1 submitted 1 December, 2018; originally announced December 2018.

    Comments: ICCV 2019

  44. arXiv:1808.10322  [pdf, other]

    cs.CV cs.CG cs.LG cs.RO

    PPF-FoldNet: Unsupervised Learning of Rotation Invariant 3D Local Descriptors

    Authors: Haowen Deng, Tolga Birdal, Slobodan Ilic

    Abstract: We present PPF-FoldNet for unsupervised learning of 3D local descriptors on pure point cloud geometry. Based on the folding-based auto-encoding of well-known point pair features, PPF-FoldNet offers many desirable properties: it necessitates neither supervision, nor a sensitive local reference frame, benefits from point-set sparsity, is end-to-end, fast, and can extract powerful rotation invariant…

    Submitted 30 August, 2018; originally announced August 2018.

    Comments: Accepted for publication at ECCV 2018

  45. arXiv:1805.12279  [pdf, other]

    cs.CV cs.AI cs.CG cs.RO stat.ML

    Bayesian Pose Graph Optimization via Bingham Distributions and Tempered Geodesic MCMC

    Authors: Tolga Birdal, Umut Şimşekli, M. Onur Eken, Slobodan Ilic

    Abstract: We introduce the Tempered Geodesic Markov Chain Monte Carlo (TG-MCMC) algorithm for initializing pose graph optimization problems, arising in various scenarios such as SFM (structure from motion) or SLAM (simultaneous localization and mapping). TG-MCMC is the first of its kind as it unites asymptotically global non-convex optimization on the spherical manifold of quaternions with posterior sampling, in or…

    Submitted 30 March, 2019; v1 submitted 30 May, 2018; originally announced May 2018.

    Comments: Published at NeurIPS 2018, 25 pages with supplements

  46. arXiv:1803.07191  [pdf, other]

    cs.CV cs.CG cs.RO

    A Minimalist Approach to Type-Agnostic Detection of Quadrics in Point Clouds

    Authors: Tolga Birdal, Benjamin Busam, Nassir Navab, Slobodan Ilic, Peter Sturm

    Abstract: This paper proposes a segmentation-free, automatic and efficient procedure to detect general geometric quadric forms in point clouds, where clutter and occlusions are inevitable. Our everyday world is dominated by man-made objects which are designed using 3D primitives (such as planes, cones, spheres, cylinders, etc.). These objects are also omnipresent in industrial environments. This gives rise…

    Submitted 19 March, 2018; originally announced March 2018.

    Comments: Accepted for publication at CVPR 2018

  47. arXiv:1802.02669  [pdf, other]

    cs.CV cs.AI

    PPFNet: Global Context Aware Local Features for Robust 3D Point Matching

    Authors: Haowen Deng, Tolga Birdal, Slobodan Ilic

    Abstract: We present PPFNet - Point Pair Feature NETwork for deeply learning a globally informed 3D local feature descriptor to find correspondences in unorganized point clouds. PPFNet learns local descriptors on pure geometry and is highly aware of the global context, an important cue in deep learning. Our 3D representation is computed as a collection of point-pair-features combined with the points and nor…

    Submitted 1 March, 2018; v1 submitted 7 February, 2018; originally announced February 2018.

    Comments: Accepted for publication at CVPR 2018

  48. arXiv:1705.03111  [pdf, other]

    cs.CV

    CAD Priors for Accurate and Flexible Instance Reconstruction

    Authors: Tolga Birdal, Slobodan Ilic

    Abstract: We present an efficient and automatic approach for accurate reconstruction of instances of big 3D objects from multiple, unorganized and unstructured point clouds, in the presence of dynamic clutter and occlusions. In contrast to conventional scanning, where the background is assumed to be rather static, we aim at handling dynamic clutter where the background changes drastically during the object scanning…

    Submitted 16 August, 2017; v1 submitted 8 May, 2017; originally announced May 2017.

    Comments: Published at International Conference on Computer Vision (ICCV) 2017

  49. arXiv:1704.07072  [pdf, other]

    cs.CV

    Camera Pose Filtering with Local Regression Geodesics on the Riemannian Manifold of Dual Quaternions

    Authors: Benjamin Busam, Tolga Birdal, Nassir Navab

    Abstract: Time-varying, smooth trajectory estimation is of great interest to the vision community for accurate and well-behaved 3D systems. In this paper, we propose a novel principal component local regression filter acting directly on the Riemannian manifold of unit dual quaternions $\mathbb{D} \mathbb{H}_1$. We use a numerically stable Lie algebra of the dual quaternions together with $\exp$ and $\log$…

    Submitted 29 August, 2017; v1 submitted 24 April, 2017; originally announced April 2017.

  50. arXiv:1403.0728  [pdf, ps, other]

    cs.CV cs.CG cs.GR

    A Novel Method for Vectorization

    Authors: Tolga Birdal, Emrah Bala

    Abstract: Vectorization of images is a key concern uniting the computer graphics and computer vision communities. In this paper we present a novel idea for efficient, customizable vectorization of raster images, based on Catmull-Rom spline fitting. The algorithm maintains a good balance between photo-realism and photo abstraction, and hence is applicable to applications with artistic concerns or applicat…

    Submitted 4 March, 2014; originally announced March 2014.

    Comments: Prepared in Siggraph format, not published in a conference, 7 pages, 9 figures