Search | arXiv e-print repository

Image-based localization using LSTMs for structured feature correlation

Authors: Florian Walch, Caner Hazirbas, Laura Leal-Taixé, Torsten Sattler, Sebastian Hilsenbeck, Daniel Cremers

Abstract: In this work we propose a new CNN+LSTM architecture for camera pose regression for indoor and outdoor scenes. CNNs allow us to learn suitable feature representations for localization that are robust against motion blur and illumination changes. We make use of LSTM units on the CNN output, which play the role of a structured dimensionality reduction on the feature vector, leading to drastic improvements in localization performance. We provide extensive quantitative comparison of CNN-based and SIFT-based localization methods, showing the weaknesses and strengths of each. Furthermore, we present a new large-scale indoor dataset with accurate ground truth from a laser scanner. Experimental results on both indoor and outdoor public datasets show our method outperforms existing deep architectures, and can localize images in hard conditions, e.g., in the presence of mostly textureless surfaces, where classic SIFT-based methods fail.

Submitted 20 August, 2017; v1 submitted 23 November, 2016; originally announced November 2016.

arXiv:1611.07767 [pdf, other]

Multiframe Motion Coupling for Video Super Resolution

Authors: Jonas Geiping, Hendrik Dirks, Daniel Cremers, Michael Moeller

Abstract: The idea of video super resolution is to use different view points of a single scene to enhance the overall resolution and quality. Classical energy minimization approaches first establish a correspondence of the current frame to all its neighbors in some radius and then use this temporal information for enhancement. In this paper, we propose the first variational super resolution approach that computes several super resolved frames in one batch optimization procedure by incorporating motion information between the high-resolution image frames themselves. As a consequence, the number of motion estimation problems grows linearly in the number of frames, opposed to a quadratic growth of classical methods and temporal consistency is enforced naturally. We use infimal convolution regularization as well as an automatic parameter balancing scheme to automatically determine the reliability of the motion information and reweight the regularization locally. We demonstrate that our approach yields state-of-the-art results and even is competitive with machine learning approaches.

Submitted 4 December, 2017; v1 submitted 23 November, 2016; originally announced November 2016.

ACM Class: I.4; G.1.6; G.4

arXiv:1611.06987 [pdf, other]

Sublabel-Accurate Discretization of Nonconvex Free-Discontinuity Problems

Authors: Thomas Möllenhoff, Daniel Cremers

Abstract: In this work we show how sublabel-accurate multilabeling approaches can be derived by approximating a classical label-continuous convex relaxation of nonconvex free-discontinuity problems. This insight allows to extend these sublabel-accurate approaches from total variation to general convex and nonconvex regularizations. Furthermore, it leads to a systematic approach to the discretization of continuous convex relaxations. We study the relationship to existing discretizations and to discrete-continuous MRFs. Finally, we apply the proposed approach to obtain a sublabel-accurate and convex solution to the vectorial Mumford-Shah functional and show in several experiments that it leads to more precise solutions using fewer labels.

Submitted 5 August, 2017; v1 submitted 21 November, 2016; originally announced November 2016.

Comments: ICCV 2017 version

arXiv:1611.05241 [pdf, other]

A Combinatorial Solution to Non-Rigid 3D Shape-to-Image Matching

Authors: Florian Bernard, Frank R. Schmidt, Johan Thunberg, Daniel Cremers

Abstract: We propose a combinatorial solution for the problem of non-rigidly matching a 3D shape to 3D image data. To this end, we model the shape as a triangular mesh and allow each triangle of this mesh to be rigidly transformed to achieve a suitable matching to the image. By penalising the distance and the relative rotation between neighbouring triangles our matching compromises between image and shape information. In this paper, we resolve two major challenges: Firstly, we address the resulting large and NP-hard combinatorial problem with a suitable graph-theoretic approach. Secondly, we propose an efficient discretisation of the unbounded 6-dimensional Lie group SE(3). To our knowledge this is the first combinatorial formulation for non-rigid 3D shape-to-image matching. In contrast to existing local (gradient descent) optimisation methods, we obtain solutions that do not require a good initialisation and that are within a bound of the optimal solution. We evaluate the proposed method on the two problems of non-rigid 3D shape-to-shape and non-rigid 3D shape-to-image registration and demonstrate that it provides promising results.

Submitted 18 May, 2017; v1 submitted 16 November, 2016; originally announced November 2016.

Comments: 10 pages, 7 figures

Journal ref: CVPR 2017

arXiv:1611.05198 [pdf, other]

One-Shot Video Object Segmentation

Authors: Sergi Caelles, Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Laura Leal-Taixé, Daniel Cremers, Luc Van Gool

Abstract: This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame. We present One-Shot Video Object Segmentation (OSVOS), based on a fully-convolutional neural network architecture that is able to successively transfer generic semantic information, learned on ImageNet, to the task of foreground segmentation, and finally to learning the appearance of a single annotated object of the test sequence (hence one-shot). Although all frames are processed independently, the results are temporally coherent and stable. We perform experiments on two annotated video segmentation databases, which show that OSVOS is fast and improves the state of the art by a significant margin (79.8% vs 68.0%).

Submitted 13 April, 2017; v1 submitted 16 November, 2016; originally announced November 2016.

Comments: CVPR 2017 camera ready. Code: http://www.vision.ee.ethz.ch/~cvlsegmentation/osvos/

arXiv:1609.08267 [pdf, other]

De-noising, Stabilizing and Completing 3D Reconstructions On-the-go using Plane Priors

Authors: Maksym Dzitsiuk, Jürgen Sturm, Robert Maier, Lingni Ma, Daniel Cremers

Abstract: Creating 3D maps on robots and other mobile devices has become a reality in recent years. Online 3D reconstruction enables many exciting applications in robotics and AR/VR gaming. However, the reconstructions are noisy and generally incomplete. Moreover, during online reconstruction, the surface changes with every newly integrated depth image, which poses a significant challenge for physics engines and path planning algorithms. This paper presents a novel, fast and robust method for obtaining and using information about planar surfaces, such as walls, floors, and ceilings as a stage in 3D reconstruction based on Signed Distance Fields. Our algorithm recovers clean and accurate surfaces, reduces the movement of individual mesh vertices caused by noise during online reconstruction and fills in the occluded and unobserved regions. We implemented and evaluated two different strategies to generate plane candidates and two strategies for merging them. Our implementation is optimized to run in real-time on mobile devices such as the Tango tablet. In an extensive set of experiments, we validated that our approach works well in a large number of natural environments despite the presence of a significant amount of occlusion, clutter and noise, which occur frequently. We further show that plane fitting enables in many cases a meaningful semantic segmentation of real-world scenes.

Submitted 27 March, 2017; v1 submitted 27 September, 2016; originally announced September 2016.

arXiv:1609.07835 [pdf, other]

doi 10.1109/ECMR.2017.8098709

From Monocular SLAM to Autonomous Drone Exploration

Authors: Lukas von Stumberg, Vladyslav Usenko, Jakob Engel, Jörg Stückler, Daniel Cremers

Abstract: Micro aerial vehicles (MAVs) are strongly limited in their payload and power capacity. In order to implement autonomous navigation, algorithms are therefore desirable that use sensory equipment that is as small, low-weight, and low-power consuming as possible. In this paper, we propose a method for autonomous MAV navigation and exploration using a low-cost consumer-grade quadrocopter equipped with a monocular camera. Our vision-based navigation system builds on LSD-SLAM which estimates the MAV trajectory and a semi-dense reconstruction of the environment in real-time. Since LSD-SLAM only determines depth at high gradient pixels, texture-less areas are not directly observed so that previous exploration methods that assume dense map information cannot directly be applied. We propose an obstacle mapping and exploration approach that takes the properties of our semi-dense monocular SLAM system into account. In experiments, we demonstrate our vision-based autonomous navigation and exploration system with a Parrot Bebop MAV.

Submitted 12 March, 2018; v1 submitted 25 September, 2016; originally announced September 2016.

MSC Class: 68T40 ACM Class: I.2.9

arXiv:1607.03425 [pdf, other]

Bayesian Inference of Bijective Non-Rigid Shape Correspondence

Authors: Matthias Vestner, Roee Litman, Alex Bronstein, Emanuele Rodolà, Daniel Cremers

Abstract: Many algorithms for the computation of correspondences between deformable shapes rely on some variant of nearest neighbor matching in a descriptor space. Such are, for example, various point-wise correspondence recovery algorithms used as a postprocessing stage in the functional correspondence framework. In this paper, we show that such frequently used techniques in practice suffer from lack of accuracy and result in poor surjectivity. We propose an alternative recovery technique guaranteeing a bijective correspondence and producing significantly higher accuracy. We derive the proposed method from a statistical framework of Bayesian inference and demonstrate its performance on several challenging deformable 3D shape matching datasets.

Submitted 12 July, 2016; originally announced July 2016.

arXiv:1607.02565 [pdf, other]

Direct Sparse Odometry

Authors: Jakob Engel, Vladlen Koltun, Daniel Cremers

Abstract: We propose a novel direct sparse visual odometry formulation. It combines a fully direct probabilistic model (minimizing a photometric error) with consistent, joint optimization of all model parameters, including geometry -- represented as inverse depth in a reference frame -- and camera motion. This is achieved in real time by omitting the smoothness prior used in other direct methods and instead sampling pixels evenly throughout the images. Since our method does not depend on keypoint detectors or descriptors, it can naturally sample pixels from across all image regions that have intensity gradient, including edges or smooth intensity variations on mostly white walls. The proposed model integrates a full photometric calibration, accounting for exposure time, lens vignetting, and non-linear response functions. We thoroughly evaluate our method on three different datasets comprising several hours of video. The experiments show that the presented approach significantly outperforms state-of-the-art direct and indirect methods in a variety of real-world settings, both in terms of tracking accuracy and robustness.

Submitted 7 October, 2016; v1 submitted 9 July, 2016; originally announced July 2016.

Comments: ** Corrected a bug which caused the real-time results for ORB-SLAM (dashed lines in Fig. 10 and 12) to be much worse than they should be ** Added references [12], [13],[19], and Fig. 11. ** Partly re-formulated and extended [5. Conclusion]. ** Fixed typos and minor re-formulations

arXiv:1607.02555 [pdf, other]

A Photometrically Calibrated Benchmark For Monocular Visual Odometry

Authors: Jakob Engel, Vladyslav Usenko, Daniel Cremers

Abstract: We present a dataset for evaluating the tracking accuracy of monocular visual odometry and SLAM methods. It contains 50 real-world sequences comprising more than 100 minutes of video, recorded across dozens of different environments -- ranging from narrow indoor corridors to wide outdoor scenes. All sequences contain mostly exploring camera motion, starting and ending at the same position. This allows to evaluate tracking accuracy via the accumulated drift from start to end, without requiring ground truth for the full sequence. In contrast to existing datasets, all sequences are photometrically calibrated. We provide exposure times for each frame as reported by the sensor, the camera response function, and dense lens attenuation factors. We also propose a novel, simple approach to non-parametric vignette calibration, which requires minimal set-up and is easy to reproduce. Finally, we thoroughly evaluate two existing methods (ORB-SLAM and DSO) on the dataset, including an analysis of the effect of image resolution, camera field of view, and the camera motion direction.

Submitted 8 October, 2016; v1 submitted 8 July, 2016; originally announced July 2016.

Comments: * Corrected a bug in the evaluation setup, which caused the real-time results for ORB-SLAM (dashed lines in Figure 8) to be much worse than they should be. * https://vision.in.tum.de/data/datasets/mono-dataset

arXiv:1605.00071 [pdf, other]

The Homotopy Method Revisited: Computing Solution Paths of $\ell_1$-Regularized Problems

Authors: Björn Bringmann, Daniel Cremers, Felix Krahmer, Michael Möller

Abstract: $ \ell_1 $-regularized linear inverse problems are frequently used in signal processing, image analysis, and statistics. The correct choice of the regularization parameter $ t \in \mathbb{R}_{\geq 0} $ is a delicate issue. Instead of solving the variational problem for a fixed parameter, the idea of the homotopy method is to compute a complete solution path $ u(t) $ as a function of $ t $. In a celebrated paper by Osborne, Presnell, and Turlach, it has been shown that the computational cost of this approach is often comparable to the cost of solving the corresponding least squares problem. Their analysis relies on the one-at-a-time condition, which requires that different indices enter or leave the support of the solution at distinct regularization parameters. In this paper, we introduce a generalized homotopy algorithm based on a nonnegative least squares problem, which does not require such a condition, and prove its termination after finitely many steps. At every point of the path, we give a full characterization of all possible directions. To illustrate our results, we discuss examples in which the standard homotopy method either fails or becomes infeasible. To the best of our knowledge, our algorithm is the first to provably compute a full solution path for an arbitrary combination of an input matrix and a data vector.

Submitted 30 April, 2016; originally announced May 2016.

Comments: 19 pages, 4 figures

MSC Class: 90C25 (Primary) 49N45; 62J07 (Secondary)
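
Illustrative note (not part of the listing): the idea of a solution path $ u(t) $ can be seen in the simplest special case, where the input matrix is the identity. Then the problem $ \min_u \frac{1}{2}\|u - f\|_2^2 + t\|u\|_1 $ decouples per component and is solved in closed form by soft thresholding. The sketch below covers only this toy case, not the generalized homotopy algorithm of the paper; the function names `soft_threshold`, `solution_path`, and `breakpoints` are hypothetical.

```python
def soft_threshold(f, t):
    """Closed-form minimizer of 0.5*(u - f)**2 + t*abs(u) for scalar f and t >= 0."""
    if f > t:
        return f - t
    if f < -t:
        return f + t
    return 0.0


def solution_path(f, ts):
    """Evaluate u(t) componentwise for each regularization parameter in ts.

    Assumption: the input matrix is the identity, so components decouple.
    """
    return [[soft_threshold(fi, t) for fi in f] for t in ts]


def breakpoints(f):
    """Values of t at which some component leaves the support (here |f_i|)."""
    return sorted({abs(fi) for fi in f if fi != 0})


if __name__ == "__main__":
    f = [3.0, -1.0, 0.5]
    print(solution_path(f, [0.0, 1.0, 4.0]))
    print(breakpoints(f))
```

In this toy case the path is piecewise linear in $ t $, with kinks at the values $ |f_i| $. If two entries of `f` share the same magnitude, two indices leave the support at the same parameter value, which is the kind of tie that the one-at-a-time condition mentioned in the abstract rules out.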

arXiv:1604.01980 [pdf, other]

Sublabel-Accurate Convex Relaxation of Vectorial Multilabel Energies

Authors: Emanuel Laude, Thomas Möllenhoff, Michael Moeller, Jan Lellmann, Daniel Cremers

Abstract: Convex relaxations of nonconvex multilabel problems have been demonstrated to produce superior (provably optimal or near-optimal) solutions to a variety of classical computer vision problems. Yet, they are of limited practical use as they require a fine discretization of the label space, entailing a huge demand in memory and runtime. In this work, we propose the first sublabel accurate convex relaxation for vectorial multilabel problems. The key idea is that we approximate the dataterm of the vectorial labeling problem in a piecewise convex (rather than piecewise linear) manner. As a result we have a more faithful approximation of the original cost function that provides a meaningful interpretation for the fractional solutions of the relaxed convex problem. In numerous experiments on large-displacement optical flow estimation and on color image denoising we demonstrate that the computed solutions have superior quality while requiring much lower memory and runtime.

Submitted 10 October, 2016; v1 submitted 7 April, 2016; originally announced April 2016.

arXiv:1601.06070 [pdf, other]

Efficient Globally Optimal 2D-to-3D Deformable Shape Matching

Authors: Zorah Lähner, Emanuele Rodolà, Frank R. Schmidt, Michael M. Bronstein, Daniel Cremers

Abstract: We propose the first algorithm for non-rigid 2D-to-3D shape matching, where the input is a 2D shape represented as a planar curve and a 3D shape represented as a surface; the output is a continuous curve on the surface. We cast the problem as finding the shortest circular path on the product 3-manifold of the surface and the curve. We prove that the optimal matching can be computed in polynomial time with a (worst-case) complexity of $O(mn^2\log(n))$, where $m$ and $n$ denote the number of vertices on the template curve and the 3D shape respectively. We also demonstrate that in practice the runtime is essentially linear in $m\!\cdot\! n$ making it an efficient method for shape analysis and shape retrieval. Quantitative evaluation confirms that the method provides excellent results for sketch-based deformable 3D shape retrieval.

Submitted 21 January, 2022; v1 submitted 22 January, 2016; originally announced January 2016.

Comments: Extended chapter of conference paper in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016 to be published in Shape Analysis: Euclidean, Discrete and Algebraic Geometric Methods by Springer

arXiv:1601.02912 [pdf, other]

Spectral Decompositions using One-Homogeneous Functionals

Authors: Martin Burger, Guy Gilboa, Michael Moeller, Lina Eckardt, Daniel Cremers

Abstract: This paper discusses the use of absolutely one-homogeneous regularization functionals in a variational, scale space, and inverse scale space setting to define a nonlinear spectral decomposition of input data. We present several theoretical results that explain the relation between the different definitions. Additionally, results on the orthogonality of the decomposition, a Parseval-type identity and the notion of generalized (nonlinear) eigenvectors closely link our nonlinear multiscale decompositions to the well-known linear filtering theory. Numerical results are used to illustrate our findings.

Submitted 12 January, 2016; originally announced January 2016.

arXiv:1512.02134 [pdf, other]

doi 10.1109/CVPR.2016.438

A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation

Authors: Nikolaus Mayer, Eddy Ilg, Philip Häusser, Philipp Fischer, Daniel Cremers, Alexey Dosovitskiy, Thomas Brox

Abstract: Recent work has shown that optical flow estimation can be formulated as a supervised learning task and can be successfully solved with convolutional networks. Training of the so-called FlowNet was enabled by a large synthetically generated dataset. The present paper extends the concept of optical flow estimation via convolutional networks to disparity and scene flow estimation. To this end, we propose three synthetic stereo video datasets with sufficient realism, variation, and size to successfully train large networks. Our datasets are the first large-scale datasets to enable training and evaluating scene flow methods. Besides the datasets, we present a convolutional network for real-time disparity estimation that provides state-of-the-art results. By combining a flow and disparity estimation network and training it jointly, we demonstrate the first scene flow estimation with a convolutional network.

Submitted 7 December, 2015; originally announced December 2015.

Comments: Includes supplementary material

ACM Class: I.2.6; I.2.10; I.4.8

arXiv:1512.01383 [pdf, other]

Sublabel-Accurate Relaxation of Nonconvex Energies

Authors: Thomas Möllenhoff, Emanuel Laude, Michael Moeller, Jan Lellmann, Daniel Cremers

Abstract: We propose a novel spatially continuous framework for convex relaxations based on functional lifting. Our method can be interpreted as a sublabel-accurate solution to multilabel problems. We show that previously proposed functional lifting methods optimize an energy which is linear between two labels and hence require (often infinitely) many labels for a faithful approximation. In contrast, the proposed formulation is based on a piecewise convex approximation and therefore needs far fewer labels. In comparison to recent MRF-based approaches, our method is formulated in a spatially continuous setting and shows less grid bias. Moreover, in a local sense, our formulation is the tightest possible convex relaxation. It is easy to implement and allows an efficient primal-dual optimization on GPUs. We show the effectiveness of our approach on several computer vision problems.

Submitted 4 December, 2015; originally announced December 2015.

arXiv:1508.01308 [pdf, other]

doi 10.1137/15M102873X

Collaborative Total Variation: A General Framework for Vectorial TV Models

Authors: Joan Duran, Michael Moeller, Catalina Sbert, Daniel Cremers

Abstract: Even after over two decades, the total variation (TV) remains one of the most popular regularizations for image processing problems and has sparked a tremendous amount of research, particularly to move from scalar to vector-valued functions. In this paper, we consider the gradient of a color image as a three dimensional matrix or tensor with dimensions corresponding to the spatial extent, the differences to other pixels, and the spectral channels. The smoothness of this tensor is then measured by taking different norms along the different dimensions. Depending on the type of these norms one obtains very different properties of the regularization, leading to novel models for color images. We call this class of regularizations collaborative total variation (CTV). On the theoretical side, we characterize the dual norm, the subdifferential and the proximal mapping of the proposed regularizers. We further prove, with the help of the generalized concept of singular vectors, that an $\ell^{\infty}$ channel coupling makes the most prior assumptions and has the greatest potential to reduce color artifacts. Our practical contributions consist of an extensive experimental section where we compare the performance of a large number of collaborative TV methods for inverse problems like denoising, deblurring and inpainting.

Submitted 6 August, 2015; originally announced August 2015.

MSC Class: 15A60; 65F22; 65K10; 68U10; 90C25; 90C46; 94A08

Journal ref: SIAM Journal on Imaging Sciences, vol. 9(1), pp. 116-151, 2016

arXiv:1506.05603 [pdf, other]

Point-wise Map Recovery and Refinement from Functional Correspondence

Authors: Emanuele Rodolà, Michael Moeller, Daniel Cremers

Abstract: Since their introduction in the shape analysis community, functional maps have met with considerable success due to their ability to compactly represent dense correspondences between deformable shapes, with applications ranging from shape matching and image segmentation, to exploration of large shape collections. Despite the numerous advantages of such representation, however, the problem of converting a given functional map back to a point-to-point map has received a surprisingly limited interest. In this paper we analyze the general problem of point-wise map recovery from arbitrary functional maps. In doing so, we rule out many of the assumptions required by the currently established approach -- most notably, the limiting requirement of the input shapes being nearly-isometric. We devise an efficient recovery process based on a simple probabilistic model. Experiments confirm that this approach achieves remarkable accuracy improvements in very challenging cases.

Submitted 18 June, 2015; originally announced June 2015.

arXiv:1506.05274 [pdf, other]

Partial Functional Correspondence

Authors: Emanuele Rodolà, Luca Cosmo, Michael M. Bronstein, Andrea Torsello, Daniel Cremers

Abstract: In this paper, we propose a method for computing partial functional correspondence between non-rigid shapes. We use perturbation analysis to show how removal of shape parts changes the Laplace-Beltrami eigenfunctions, and exploit it as a prior on the spectral representation of the correspondence. Corresponding parts are optimization variables in our problem and are used to weight the functional correspondence; we are looking for the largest and most regular (in the Mumford-Shah sense) parts that minimize correspondence distortion. We show that our approach can cope with very challenging correspondence settings.

Submitted 22 December, 2015; v1 submitted 17 June, 2015; originally announced June 2015.

arXiv:1504.06852 [pdf, other]

FlowNet: Learning Optical Flow with Convolutional Networks

Authors: Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox

Abstract: Convolutional neural networks (CNNs) have recently been very successful in a variety of computer vision tasks, especially on those linked to recognition. Optical flow estimation has not been among the tasks where CNNs were successful. In this paper we construct appropriate CNNs which are capable of solving the optical flow estimation problem as a supervised learning task. We propose and compare two architectures: a generic architecture and another one including a layer that correlates feature vectors at different image locations. Since existing ground truth data sets are not sufficiently large to train a CNN, we generate a synthetic Flying Chairs dataset. We show that networks trained on this unrealistic data still generalize very well to existing datasets such as Sintel and KITTI, achieving competitive accuracy at frame rates of 5 to 10 fps.

Submitted 4 May, 2015; v1 submitted 26 April, 2015; originally announced April 2015.

Comments: Added supplementary material

ACM Class: I.2.6; I.4.8

arXiv:1408.0173 [pdf, other]

doi 10.1109/TIP.2015.2479469

Variational Depth from Focus Reconstruction

Authors: Michael Moeller, Martin Benning, Carola Schönlieb, Daniel Cremers

Abstract: This paper deals with the problem of reconstructing a depth map from a sequence of differently focused images, also known as depth from focus or shape from focus. We propose to state the depth from focus problem as a variational problem including a smooth but nonconvex data fidelity term, and a convex nonsmooth regularization, which makes the method robust to noise and leads to more realistic depth maps. Additionally, we propose to solve the nonconvex minimization problem with a linearized alternating directions method of multipliers (ADMM), which allows the energy to be minimized very efficiently. A numerical comparison to classical methods on simulated as well as on real data is presented.

Submitted 5 November, 2014; v1 submitted 1 August, 2014; originally announced August 2014.

arXiv:1407.1723 [pdf, other]

The Primal-Dual Hybrid Gradient Method for Semiconvex Splittings

Authors: Thomas Möllenhoff, Evgeny Strekalovskiy, Michael Moeller, Daniel Cremers

Abstract: This paper deals with the analysis of a recent reformulation of the primal-dual hybrid gradient method [Zhu and Chan 2008, Pock, Cremers, Bischof and Chambolle 2009, Esser, Zhang and Chan 2010, Chambolle and Pock 2011], which allows it to be applied to nonconvex regularizers as first proposed for truncated quadratic penalization in [Strekalovskiy and Cremers 2014]. Particularly, it investigates variational problems for which the energy to be minimized can be written as $G(u) + F(Ku)$, where $G$ is convex, $F$ semiconvex, and $K$ is a linear operator. We study the method and prove convergence in the case where the nonconvexity of $F$ is compensated by the strong convexity of $G$. The convergence proof yields an interesting requirement for the choice of algorithm parameters, which we show to be not only sufficient but also necessary. Additionally, we show boundedness of the iterates under much weaker conditions. Finally, we demonstrate effectiveness and convergence of the algorithm beyond the theoretical guarantees in several numerical experiments.

Submitted 7 July, 2014; originally announced July 2014.

arXiv:1312.4746 [pdf, other]

Co-Sparse Textural Similarity for Image Segmentation

Authors: Claudia Nieuwenhuis, Daniel Cremers, Simon Hawe, Martin Kleinsteuber

Abstract: We propose an algorithm for segmenting natural images based on texture and color information, which leverages the co-sparse analysis model for image segmentation within a convex multilabel optimization framework. As a key ingredient of this method, we introduce a novel textural similarity measure, which builds upon the co-sparse representation of image patches. We propose a Bayesian approach to merge textural similarity with information about color and location. Combined with recently developed convex multilabel optimization methods this leads to an efficient algorithm for both supervised and unsupervised segmentation, which is easily parallelized on graphics hardware. The approach provides competitive results in unsupervised segmentation and outperforms state-of-the-art interactive segmentation methods on the Graz Benchmark.

Submitted 17 December, 2013; originally announced December 2013.

arXiv:1102.3830 [pdf, other]

A linear framework for region-based image segmentation and inpainting involving curvature penalization

Authors: Thomas Schoenemann, Fredrik Kahl, Simon Masnou, Daniel Cremers

Abstract: We present the first method to handle curvature regularity in region-based image segmentation and inpainting that is independent of initialization. To this end we start from a new formulation of length-based optimization schemes, based on surface continuation constraints, and discuss the connections to existing schemes. The formulation is based on a cell complex and considers basic regions and boundary elements. The corresponding optimization problem is cast as an integer linear program. We then show how the method can be extended to include curvature regularity, again cast as an integer linear program. Here, we are considering pairs of boundary elements to reflect curvature. Moreover, a constraint set is derived to ensure that the boundary variables indeed reflect the boundary of the regions described by the region variables. We show that by solving the linear programming relaxation one gets quite close to the global optimum, and that curvature regularity is indeed much better suited in the presence of long and thin objects compared to standard length regularity.

Submitted 18 February, 2011; originally announced February 2011.

arXiv:1101.0777 [pdf, other]

On a linear programming approach to the discrete Willmore boundary value problem and generalizations

Authors: Thomas Schoenemann, Simon Masnou, Daniel Cremers

Abstract: We consider the problem of finding (possibly non-connected) discrete surfaces spanning a finite set of discrete boundary curves in three-dimensional space and minimizing (globally) a discrete energy involving mean curvature. Although we consider a fairly general class of energies, our main focus is on the Willmore energy, i.e. the total squared mean curvature. Our purpose is to address the delicate task of approximating global minimizers of the energy under boundary constraints. The main contribution of this work is to translate the nonlinear boundary value problem into an integer linear program, using a natural formulation involving pairs of elementary triangles chosen in a pre-specified dictionary and allowing self-intersection. Our work focuses essentially on the connection between the integer linear program and its relaxation. We prove that: (i) one cannot guarantee the total unimodularity of the constraint matrix, which is a sufficient condition for the global solution of the relaxed linear program to be always integral, and therefore to be a solution of the integer program as well; (ii) furthermore, there is actually experimental evidence that, in some cases, solving the relaxed problem yields a fractional solution. Due to the very specific structure of the constraint matrix here, we strongly believe that it should be possible in the future to design ad-hoc integer solvers that yield high-definition approximations to solutions of several boundary value problems involving mean curvature, in particular the Willmore boundary value problem.

Submitted 4 January, 2011; originally announced January 2011.

arXiv:quant-ph/9809086 [pdf, ps, other]

doi 10.1016/S0167-2789(98)00267-X

Flow Equations for the Henon-Heiles Hamiltonian

Authors: Daniel Cremers, Andreas Mielke

Abstract: The Henon-Heiles Hamiltonian was introduced in 1964 as a mathematical model to describe the chaotic motion of stars in a galaxy. By canonically transforming the classical Hamiltonian to a Birkhoff-Gustavson normal form, Delos and Swimm obtained a discrete quantum mechanical energy spectrum. The aim of the present work is to first quantize the classical Hamiltonian and to then diagonalize it using different variants of flow equations, a method of continuous unitary transformations introduced by Wegner in 1994. The results of the diagonalization via flow equations are comparable to those obtained by the classical transformation. In the case of commensurate frequencies the transformation turns out to be less lengthy. In addition, the dynamics of the quantum mechanical system are analyzed on the basis of the transformed observables.

Submitted 29 September, 1998; originally announced September 1998.

Comments: latex, 14 pages, 6 figures, uses epsfig, accepted for publication in Physica D

Report number: HD-TVP-98-4
