-
Direct Sparse Odometry
Authors:
Jakob Engel,
Vladlen Koltun,
Daniel Cremers
Abstract:
We propose a novel direct sparse visual odometry formulation. It combines a fully direct probabilistic model (minimizing a photometric error) with consistent, joint optimization of all model parameters, including geometry -- represented as inverse depth in a reference frame -- and camera motion. This is achieved in real time by omitting the smoothness prior used in other direct methods and instead…
▽ More
We propose a novel direct sparse visual odometry formulation. It combines a fully direct probabilistic model (minimizing a photometric error) with consistent, joint optimization of all model parameters, including geometry -- represented as inverse depth in a reference frame -- and camera motion. This is achieved in real time by omitting the smoothness prior used in other direct methods and instead sampling pixels evenly throughout the images. Since our method does not depend on keypoint detectors or descriptors, it can naturally sample pixels from across all image regions that have intensity gradient, including edges or smooth intensity variations on mostly white walls. The proposed model integrates a full photometric calibration, accounting for exposure time, lens vignetting, and non-linear response functions. We thoroughly evaluate our method on three different datasets comprising several hours of video. The experiments show that the presented approach significantly outperforms state-of-the-art direct and indirect methods in a variety of real-world settings, both in terms of tracking accuracy and robustness.
△ Less
Submitted 7 October, 2016; v1 submitted 9 July, 2016;
originally announced July 2016.
-
Full Flow: Optical Flow Estimation By Global Optimization over Regular Grids
Authors:
Qifeng Chen,
Vladlen Koltun
Abstract:
We present a global optimization approach to optical flow estimation. The approach optimizes a classical optical flow objective over the full space of map**s between discrete grids. No descriptor matching is used. The highly regular structure of the space of map**s enables optimizations that reduce the computational complexity of the algorithm's inner loop from quadratic to linear and support…
▽ More
We present a global optimization approach to optical flow estimation. The approach optimizes a classical optical flow objective over the full space of map**s between discrete grids. No descriptor matching is used. The highly regular structure of the space of map**s enables optimizations that reduce the computational complexity of the algorithm's inner loop from quadratic to linear and support efficient matching of tens of thousands of nodes to tens of thousands of displacements. We show that one-shot global optimization of a classical Horn-Schunck-type objective over regular grids at a single resolution is sufficient to initialize continuous interpolation and achieve state-of-the-art performance on challenging modern benchmarks.
△ Less
Submitted 12 April, 2016;
originally announced April 2016.
-
A Large Dataset of Object Scans
Authors:
Sungjoon Choi,
Qian-Yi Zhou,
Stephen Miller,
Vladlen Koltun
Abstract:
We have created a dataset of more than ten thousand 3D scans of real objects. To create the dataset, we recruited 70 operators, equipped them with consumer-grade mobile 3D scanning setups, and paid them to scan objects in their environments. The operators scanned objects of their choosing, outside the laboratory and without direct supervision by computer vision professionals. The result is a large…
▽ More
We have created a dataset of more than ten thousand 3D scans of real objects. To create the dataset, we recruited 70 operators, equipped them with consumer-grade mobile 3D scanning setups, and paid them to scan objects in their environments. The operators scanned objects of their choosing, outside the laboratory and without direct supervision by computer vision professionals. The result is a large and diverse collection of object scans: from shoes, mugs, and toys to grand pianos, construction vehicles, and large outdoor sculptures. We worked with an attorney to ensure that data acquisition did not violate privacy constraints. The acquired data was irrevocably placed in the public domain and is available freely at http://redwood-data.org/3dscan .
△ Less
Submitted 5 May, 2016; v1 submitted 8 February, 2016;
originally announced February 2016.
-
Multi-Scale Context Aggregation by Dilated Convolutions
Authors:
Fisher Yu,
Vladlen Koltun
Abstract:
State-of-the-art models for semantic segmentation are based on adaptations of convolutional networks that had originally been designed for image classification. However, dense prediction and image classification are structurally different. In this work, we develop a new convolutional network module that is specifically designed for dense prediction. The presented module uses dilated convolutions t…
▽ More
State-of-the-art models for semantic segmentation are based on adaptations of convolutional networks that had originally been designed for image classification. However, dense prediction and image classification are structurally different. In this work, we develop a new convolutional network module that is specifically designed for dense prediction. The presented module uses dilated convolutions to systematically aggregate multi-scale contextual information without losing resolution. The architecture is based on the fact that dilated convolutions support exponential expansion of the receptive field without loss of resolution or coverage. We show that the presented context module increases the accuracy of state-of-the-art semantic segmentation systems. In addition, we examine the adaptation of image classification networks to dense prediction and show that simplifying the adapted network can increase accuracy.
△ Less
Submitted 30 April, 2016; v1 submitted 23 November, 2015;
originally announced November 2015.
-
Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
Authors:
Philipp Krähenbühl,
Vladlen Koltun
Abstract:
Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image regions. While region-level models often feature dense pairwise connectivity, pixel-level models are considerably larger and have only permitted sparse graph structures. In this paper, we consider fully connected CRF models defined on the complete set of pixel…
▽ More
Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image regions. While region-level models often feature dense pairwise connectivity, pixel-level models are considerably larger and have only permitted sparse graph structures. In this paper, we consider fully connected CRF models defined on the complete set of pixels in an image. The resulting graphs have billions of edges, making traditional inference algorithms impractical. Our main contribution is a highly efficient approximate inference algorithm for fully connected CRF models in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels. Our experiments demonstrate that dense connectivity at the pixel level substantially improves segmentation and labeling accuracy.
△ Less
Submitted 20 October, 2012;
originally announced October 2012.
-
Continuous Inverse Optimal Control with Locally Optimal Examples
Authors:
Sergey Levine,
Vladlen Koltun
Abstract:
Inverse optimal control, also known as inverse reinforcement learning, is the problem of recovering an unknown reward function in a Markov decision process from expert demonstrations of the optimal policy. We introduce a probabilistic inverse optimal control algorithm that scales gracefully with task dimensionality, and is suitable for large, continuous domains where even computing a full policy i…
▽ More
Inverse optimal control, also known as inverse reinforcement learning, is the problem of recovering an unknown reward function in a Markov decision process from expert demonstrations of the optimal policy. We introduce a probabilistic inverse optimal control algorithm that scales gracefully with task dimensionality, and is suitable for large, continuous domains where even computing a full policy is impractical. By using a local approximation of the reward function, our method can also drop the assumption that the demonstrations are globally optimal, requiring only local optimality. This allows it to learn from examples that are unsuitable for prior methods.
△ Less
Submitted 18 June, 2012;
originally announced June 2012.
-
Kinetic Stable Delaunay Graphs
Authors:
Pankaj K. Agarwal,
Jie Gao,
Leonidas J. Guibas,
Haim Kaplan,
Vladlen Koltun,
Natan Rubin,
Micha Sharir
Abstract:
We consider the problem of maintaining the Euclidean Delaunay triangulation $\DT$ of a set $P$ of $n$ moving points in the plane, along algebraic trajectories of constant description complexity. Since the best known upper bound on the number of topological changes in the full $\DT$ is nearly cubic, we seek to maintain a suitable portion of it that is less volatile yet retains many useful propertie…
▽ More
We consider the problem of maintaining the Euclidean Delaunay triangulation $\DT$ of a set $P$ of $n$ moving points in the plane, along algebraic trajectories of constant description complexity. Since the best known upper bound on the number of topological changes in the full $\DT$ is nearly cubic, we seek to maintain a suitable portion of it that is less volatile yet retains many useful properties. We introduce the notion of a stable Delaunay graph, which is a dynamic subgraph of the Delaunay triangulation. The stable Delaunay graph (a) is easy to define, (b) experiences only a nearly quadratic number of discrete changes, (c) is robust under small changes of the norm, and (d) possesses certain useful properties.
The stable Delaunay graph ($\SDG$ in short) is defined in terms of a parameter $α>0$, and consists of Delaunay edges $pq$ for which the angles at which $p$ and $q$ see their Voronoi edge $e_{pq}$ are at least $α$. We show that (i) $\SDG$ always contains at least roughly one third of the Delaunay edges; (ii) it contains the $β$-skeleton of $P$, for $β=1+Ω(α^2)$; (iii) it is stable, in the sense that its edges survive for long periods of time, as long as the orientations of the segments connecting (nearby) points of $P$ do not change by much; and (iv) stable Delaunay edges remain stable (with an appropriate redefinition of stability) if we replace the Euclidean norm by any sufficiently close norm.
In particular, we can approximate the Euclidean norm by a polygonal norm (namely, a regular $k$-gon, with $k=Θ(1/α)$), and keep track of a Euclidean $\SDG$ by maintaining the full Delaunay triangulation of $P$ under the polygonal norm.
We describe two kinetic data structures for maintaining $\SDG$. Both structures use $O^*(n)$ storage and process $O^*(n^2)$ events during the motion, each in $O^*(1)$ time.
△ Less
Submitted 4 April, 2011;
originally announced April 2011.