Flexible Techniques for Differentiable Rendering with 3D Gaussians

Authors: Leonid Keselman, Martial Hebert

Abstract: Fast, reliable shape reconstruction is an essential ingredient in many computer vision applications. Neural Radiance Fields demonstrated that photorealistic novel view synthesis is within reach, but was gated by performance requirements for fast reconstruction of real scenes and objects. Several recent approaches have built on alternative shape representations, in particular, 3D Gaussians. We develop extensions to these renderers, such as integrating differentiable optical flow, exporting watertight meshes and rendering per-ray normals. Additionally, we show how two of the recent methods are interoperable with each other. These reconstructions are quick, robust, and easily performed on GPU or CPU. For code and visual examples, see https://leonidk.github.io/fmb-plus

Submitted 28 August, 2023; originally announced August 2023.

ACM Class: I.2.10; I.3.7; I.4.0

arXiv:2308.04571

Optimizing Algorithms From Pairwise User Preferences

Authors: Leonid Keselman, Katherine Shih, Martial Hebert, Aaron Steinfeld

Abstract: Typical black-box optimization approaches in robotics focus on learning from metric scores. However, that is not always possible, as not all developers have ground truth available. Learning appropriate robot behavior in human-centric contexts often requires querying users, who typically cannot provide precise metric scores. Existing approaches leverage human feedback in an attempt to model an implicit reward function; however, this reward may be difficult or impossible to effectively capture. In this work, we introduce SortCMA to optimize algorithm parameter configurations in high dimensions based on pairwise user preferences. SortCMA efficiently and robustly leverages user input to find parameter sets without directly modeling a reward. We apply this method to tuning a commercial depth sensor without ground truth, and to robot social navigation, which involves highly complex preferences over robot behavior. We show that our method succeeds in optimizing for the user's goals and perform a user study to evaluate social navigation results.

Submitted 8 August, 2023; originally announced August 2023.

Comments: Accepted at IROS 2023

ACM Class: I.2.9; H.1.2; I.2.8

arXiv:2304.11824

Shape from Shading for Robotic Manipulation

Authors: Arkadeep Narayan Chaudhury, Leonid Keselman, Christopher G. Atkeson

Abstract: Controlling illumination can generate high quality information about object surface normals and depth discontinuities at a low computational cost. In this work we demonstrate a robot workspace-scaled controlled illumination approach that generates high quality information for table top scale objects for robotic manipulation. With our low angle of incidence directional illumination approach, we can precisely capture surface normals and depth discontinuities of monochromatic Lambertian objects. We show that this approach to shape estimation is 1) valuable for general purpose grasping with a single point vacuum gripper, 2) can measure the deformation of known objects, and 3) can estimate pose of known objects and track unknown objects in the robot's workspace.

Submitted 6 November, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

Comments: Project webpage: https://arkadeepnc.github.io/projects/active_workspace/index.html

arXiv:2303.07434

Discovering Multiple Algorithm Configurations

Authors: Leonid Keselman, Martial Hebert

Abstract: Many practitioners in robotics regularly depend on classic, hand-designed algorithms. Often the performance of these algorithms is tuned across a dataset of annotated examples which represent typical deployment conditions. Automatic tuning of these settings is traditionally known as algorithm configuration. In this work, we extend algorithm configuration to automatically discover multiple modes in the tuning dataset. Unlike prior work, these configuration modes represent multiple dataset instances and are detected automatically during the course of optimization. We propose three methods for mode discovery: a post hoc method, a multi-stage method, and an online algorithm using a multi-armed bandit. Our results characterize these methods on synthetic test functions and in multiple robotics application domains: stereoscopic depth estimation, differentiable rendering, motion planning, and visual odometry. We show the clear benefits of detecting multiple modes in algorithm configuration space.

Submitted 13 March, 2023; originally announced March 2023.

Comments: 8 pages, accepted to ICRA 2023

ACM Class: I.2.9; I.2.6; I.2.8

arXiv:2207.10606

Approximate Differentiable Rendering with Algebraic Surfaces

Authors: Leonid Keselman, Martial Hebert

Abstract: Differentiable renderers provide a direct mathematical link between an object's 3D representation and images of that object. In this work, we develop an approximate differentiable renderer for a compact, interpretable representation, which we call Fuzzy Metaballs. Our approximate renderer focuses on rendering shapes via depth maps and silhouettes. It sacrifices fidelity for utility, producing fast runtimes and high-quality gradient information that can be used to solve vision tasks. Compared to mesh-based differentiable renderers, our method has forward passes that are 5x faster and backwards passes that are 30x faster. The depth maps and silhouette images generated by our method are smooth and defined everywhere. In our evaluation of differentiable renderers for pose estimation, we show that our method is the only one comparable to classic techniques. In shape from silhouette, our method performs well using only gradient descent and a per-pixel loss, without any surrogate losses or regularization. These reconstructions work well even on natural video sequences with segmentation artifacts. Project page: https://leonidk.github.io/fuzzy-metaballs

Submitted 21 July, 2022; originally announced July 2022.

Comments: Accepted to the European Conference on Computer Vision (ECCV) 2022

ACM Class: I.2.10; I.3.7; I.4.0

arXiv:1904.12573

Venue Analytics: A Simple Alternative to Citation-Based Metrics

Authors: Leonid Keselman

Abstract: We present a method for automatically organizing and evaluating the quality of different publishing venues in Computer Science. Since this method only requires paper publication data as its input, we can demonstrate our method on a large portion of the DBLP dataset, spanning 50 years, with millions of authors and thousands of publishing venues. By formulating venue authorship as a regression problem and targeting metrics of interest, we obtain venue scores for every conference and journal in our dataset. The obtained scores can also provide a per-year model of conference quality, showing how fields develop and change over time. Additionally, these venue scores can be used to evaluate individual academic authors and academic institutions. We show that using venue scores to evaluate both authors and institutions produces quantitative measures that are comparable to approaches using citations or peer assessment. In contrast to many other existing evaluation metrics, our use of large-scale, openly available data enables this approach to be repeatable and transparent. To help others build upon this work, all of our code and data is available at https://github.com/leonidk/venue_scores

Submitted 5 June, 2019; v1 submitted 29 April, 2019; originally announced April 2019.

Comments: 10 pages, Accepted to ACM/IEEE JCDL 2019

arXiv:1904.05537

Direct Fitting of Gaussian Mixture Models

Authors: Leonid Keselman, Martial Hebert

Abstract: When fitting Gaussian Mixture Models to 3D geometry, the model is typically fit to point clouds, even when the shapes were obtained as 3D meshes. Here we present a formulation for fitting Gaussian Mixture Models (GMMs) directly to a triangular mesh instead of using points sampled from its surface. Part of this work analyzes a general formulation for evaluating likelihood of geometric objects. This modification enables fitting higher-quality GMMs under a wider range of initialization conditions. Additionally, models obtained from this fitting method are shown to produce an improvement in 3D registration for both meshes and RGB-D frames. This result is general and applicable to arbitrary geometric objects, including representing uncertainty from sensor measurements.

Submitted 11 June, 2019; v1 submitted 11 April, 2019; originally announced April 2019.

Comments: Accepted to the Conference on Computer and Robot Vision 2019. 8 pages

arXiv:1705.07640

Dynamics Based 3D Skeletal Hand Tracking

Authors: Stan Melax, Leonid Keselman, Sterling Orsten

Abstract: Tracking the full skeletal pose of the hands and fingers is a challenging problem that has a plethora of applications for user interaction. Existing techniques either require wearable hardware, add restrictions to user pose, or require significant computation resources. This research explores a new approach to tracking hands, or any articulated model, by using an augmented rigid body simulation. This allows us to phrase 3D object tracking as a linear complementarity problem with a well-defined solution. Based on a depth sensor's samples, the system generates constraints that limit motion orthogonal to the rigid body model's surface. These constraints, along with prior motion, collision/contact constraints, and joint mechanics, are resolved with a projected Gauss-Seidel solver. Due to camera noise properties and attachment errors, the numerous surface constraints are impulse capped to avoid overpowering mechanical constraints. To improve tracking accuracy, multiple simulations are spawned at each frame and fed a variety of heuristics, constraints and poses. A 3D error metric selects the best-fit simulation, helping the system handle challenging hand motions. Such an approach enables real-time, robust, and accurate 3D skeletal tracking of a user's hand on a variety of depth cameras, while only utilizing a single x86 CPU core for processing.

Submitted 22 May, 2017; originally announced May 2017.

Comments: Published in Graphics Interface 2013

ACM Class: I.3.7

arXiv:1705.05548

Intel RealSense Stereoscopic Depth Cameras

Authors: Leonid Keselman, John Iselin Woodfill, Anders Grunnet-Jepsen, Achintya Bhowmik

Abstract: We present a comprehensive overview of the stereoscopic Intel RealSense RGBD imaging systems. We discuss these systems' mode-of-operation, functional behavior and include models of their expected performance, shortcomings, and limitations. We provide information about the systems' optical characteristics, their correlation algorithms, and how these properties can affect different applications, including 3D reconstruction and gesture recognition. Our discussion covers the Intel RealSense R200 and the Intel RealSense D400 (formerly RS400).

Submitted 29 October, 2017; v1 submitted 16 May, 2017; originally announced May 2017.

Comments: Accepted to CCD 2017, a CVPR 2017 Workshop

ACM Class: I.4.8

Showing 1–9 of 9 results for author: Keselman, L