Search | arXiv e-print repository

Direct Sparse Odometry

Authors: Jakob Engel, Vladlen Koltun, Daniel Cremers

Abstract: We propose a novel direct sparse visual odometry formulation. It combines a fully direct probabilistic model (minimizing a photometric error) with consistent, joint optimization of all model parameters, including geometry -- represented as inverse depth in a reference frame -- and camera motion. This is achieved in real time by omitting the smoothness prior used in other direct methods and instead sampling pixels evenly throughout the images. Since our method does not depend on keypoint detectors or descriptors, it can naturally sample pixels from across all image regions that have intensity gradient, including edges or smooth intensity variations on mostly white walls. The proposed model integrates a full photometric calibration, accounting for exposure time, lens vignetting, and non-linear response functions. We thoroughly evaluate our method on three different datasets comprising several hours of video. The experiments show that the presented approach significantly outperforms state-of-the-art direct and indirect methods in a variety of real-world settings, both in terms of tracking accuracy and robustness.

Submitted 7 October, 2016; v1 submitted 9 July, 2016; originally announced July 2016.

Comments: ** Corrected a bug which caused the real-time results for ORB-SLAM (dashed lines in Fig. 10 and 12) to be much worse than they should be ** Added references [12], [13], [19], and Fig. 11. ** Partly re-formulated and extended [5. Conclusion]. ** Fixed typos and minor re-formulations

arXiv:1604.03513 [pdf, other]

Full Flow: Optical Flow Estimation By Global Optimization over Regular Grids

Authors: Qifeng Chen, Vladlen Koltun

Abstract: We present a global optimization approach to optical flow estimation. The approach optimizes a classical optical flow objective over the full space of mappings between discrete grids. No descriptor matching is used. The highly regular structure of the space of mappings enables optimizations that reduce the computational complexity of the algorithm's inner loop from quadratic to linear and support efficient matching of tens of thousands of nodes to tens of thousands of displacements. We show that one-shot global optimization of a classical Horn-Schunck-type objective over regular grids at a single resolution is sufficient to initialize continuous interpolation and achieve state-of-the-art performance on challenging modern benchmarks.

Submitted 12 April, 2016; originally announced April 2016.

Comments: To be presented at CVPR 2016

arXiv:1602.02481 [pdf, other]

A Large Dataset of Object Scans

Authors: Sungjoon Choi, Qian-Yi Zhou, Stephen Miller, Vladlen Koltun

Abstract: We have created a dataset of more than ten thousand 3D scans of real objects. To create the dataset, we recruited 70 operators, equipped them with consumer-grade mobile 3D scanning setups, and paid them to scan objects in their environments. The operators scanned objects of their choosing, outside the laboratory and without direct supervision by computer vision professionals. The result is a large and diverse collection of object scans: from shoes, mugs, and toys to grand pianos, construction vehicles, and large outdoor sculptures. We worked with an attorney to ensure that data acquisition did not violate privacy constraints. The acquired data was irrevocably placed in the public domain and is available freely at http://redwood-data.org/3dscan .

Submitted 5 May, 2016; v1 submitted 8 February, 2016; originally announced February 2016.

Comments: Technical report

arXiv:1511.07122 [pdf, other]

Multi-Scale Context Aggregation by Dilated Convolutions

Authors: Fisher Yu, Vladlen Koltun

Abstract: State-of-the-art models for semantic segmentation are based on adaptations of convolutional networks that had originally been designed for image classification. However, dense prediction and image classification are structurally different. In this work, we develop a new convolutional network module that is specifically designed for dense prediction. The presented module uses dilated convolutions to systematically aggregate multi-scale contextual information without losing resolution. The architecture is based on the fact that dilated convolutions support exponential expansion of the receptive field without loss of resolution or coverage. We show that the presented context module increases the accuracy of state-of-the-art semantic segmentation systems. In addition, we examine the adaptation of image classification networks to dense prediction and show that simplifying the adapted network can increase accuracy.

Submitted 30 April, 2016; v1 submitted 23 November, 2015; originally announced November 2015.

Comments: Published as a conference paper at ICLR 2016
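
The exponential receptive-field growth mentioned in the abstract above can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper's code: it assumes stride-1 layers with 3x3 kernels and dilation factors that double from layer to layer (1, 2, 4, ...), and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(kernel_size, dilations):
    """Side length of the receptive field of a stack of stride-1 convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    size = 1  # a single pixel before any convolution is applied
    for d in dilations:
        size += (kernel_size - 1) * d
    return size


# Assumed setup: five 3x3 layers, dilation doubling per layer.
dilations = [2 ** i for i in range(5)]  # [1, 2, 4, 8, 16]

# Receptive field after each successive layer.
dilated = [receptive_field(3, dilations[:n]) for n in range(1, 6)]
# Same depth without dilation, for comparison.
plain = [receptive_field(3, [1] * n) for n in range(1, 6)]

print(dilated)  # [3, 7, 15, 31, 63]
print(plain)    # [3, 5, 7, 9, 11]
```

Under these assumptions the field roughly doubles per layer while the number of weights per layer stays fixed, whereas the undilated stack grows only linearly with depth.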

arXiv:1210.5644 [pdf, other]

Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials

Authors: Philipp Krähenbühl, Vladlen Koltun

Abstract: Most state-of-the-art techniques for multi-class image segmentation and labeling use conditional random fields defined over pixels or image regions. While region-level models often feature dense pairwise connectivity, pixel-level models are considerably larger and have only permitted sparse graph structures. In this paper, we consider fully connected CRF models defined on the complete set of pixels in an image. The resulting graphs have billions of edges, making traditional inference algorithms impractical. Our main contribution is a highly efficient approximate inference algorithm for fully connected CRF models in which the pairwise edge potentials are defined by a linear combination of Gaussian kernels. Our experiments demonstrate that dense connectivity at the pixel level substantially improves segmentation and labeling accuracy.

Submitted 20 October, 2012; originally announced October 2012.

Comments: NIPS 2011

Journal ref: Advances in Neural Information Processing Systems 24 (2011) 109-117

arXiv:1206.4617 [pdf]

Continuous Inverse Optimal Control with Locally Optimal Examples

Authors: Sergey Levine, Vladlen Koltun

Abstract: Inverse optimal control, also known as inverse reinforcement learning, is the problem of recovering an unknown reward function in a Markov decision process from expert demonstrations of the optimal policy. We introduce a probabilistic inverse optimal control algorithm that scales gracefully with task dimensionality, and is suitable for large, continuous domains where even computing a full policy is impractical. By using a local approximation of the reward function, our method can also drop the assumption that the demonstrations are globally optimal, requiring only local optimality. This allows it to learn from examples that are unsuitable for prior methods.

Submitted 18 June, 2012; originally announced June 2012.

Comments: ICML 2012

arXiv:1104.0622 [pdf, ps, other]

Kinetic Stable Delaunay Graphs

Authors: Pankaj K. Agarwal, Jie Gao, Leonidas J. Guibas, Haim Kaplan, Vladlen Koltun, Natan Rubin, Micha Sharir

Abstract: We consider the problem of maintaining the Euclidean Delaunay triangulation $\mathrm{DT}$ of a set $P$ of $n$ moving points in the plane, along algebraic trajectories of constant description complexity. Since the best known upper bound on the number of topological changes in the full $\mathrm{DT}$ is nearly cubic, we seek to maintain a suitable portion of it that is less volatile yet retains many useful properties. We introduce the notion of a stable Delaunay graph, which is a dynamic subgraph of the Delaunay triangulation. The stable Delaunay graph (a) is easy to define, (b) experiences only a nearly quadratic number of discrete changes, (c) is robust under small changes of the norm, and (d) possesses certain useful properties. The stable Delaunay graph ($\mathrm{SDG}$ in short) is defined in terms of a parameter $\alpha>0$, and consists of Delaunay edges $pq$ for which the angles at which $p$ and $q$ see their Voronoi edge $e_{pq}$ are at least $\alpha$. We show that (i) $\mathrm{SDG}$ always contains at least roughly one third of the Delaunay edges; (ii) it contains the $\beta$-skeleton of $P$, for $\beta=1+\Omega(\alpha^2)$; (iii) it is stable, in the sense that its edges survive for long periods of time, as long as the orientations of the segments connecting (nearby) points of $P$ do not change by much; and (iv) stable Delaunay edges remain stable (with an appropriate redefinition of stability) if we replace the Euclidean norm by any sufficiently close norm.
In particular, we can approximate the Euclidean norm by a polygonal norm (namely, a regular $k$-gon, with $k=\Theta(1/\alpha)$), and keep track of a Euclidean $\mathrm{SDG}$ by maintaining the full Delaunay triangulation of $P$ under the polygonal norm. We describe two kinetic data structures for maintaining $\mathrm{SDG}$. Both structures use $O^*(n)$ storage and process $O^*(n^2)$ events during the motion, each in $O^*(1)$ time.

Submitted 4 April, 2011; originally announced April 2011.

Comments: A preliminary version appeared in Proc. SoCG 2010

ACM Class: F.2.2; G.2.1

Showing 101–107 of 107 results for author: Koltun, V