
Showing 101–121 of 121 results for author: Brox, T

  1. arXiv:1607.06317  [pdf, other]

    cs.CV

    A Multi-cut Formulation for Joint Segmentation and Tracking of Multiple Objects

    Authors: Margret Keuper, Siyu Tang, Yu Zhongjie, Bjoern Andres, Thomas Brox, Bernt Schiele

    Abstract: Recently, Minimum Cost Multicut Formulations have been proposed and proven to be successful in both motion trajectory segmentation and multi-target tracking scenarios. Both tasks benefit from decomposing a graphical model into an optimal number of connected components based on attractive and repulsive pairwise terms. The two tasks are formulated on different levels of granularity and, accordingly,…

    Submitted 21 July, 2016; originally announced July 2016.

  2. arXiv:1606.06650  [pdf, other]

    cs.CV

    3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation

    Authors: Özgün Çiçek, Ahmed Abdulkadir, Soeren S. Lienkamp, Thomas Brox, Olaf Ronneberger

    Abstract: This paper introduces a network for volumetric segmentation that learns from sparsely annotated volumetric images. We outline two attractive use cases of this method: (1) In a semi-automated setup, the user annotates some slices in the volume to be segmented. The network learns from these sparse annotations and provides a dense 3D segmentation. (2) In a fully-automated setup, we assume that a repr…

    Submitted 21 June, 2016; originally announced June 2016.

    Comments: Conditionally accepted for MICCAI 2016

  3. arXiv:1606.02467  [pdf, other]

    cs.CV

    Point-wise mutual information-based video segmentation with high temporal consistency

    Authors: Margret Keuper, Thomas Brox

    Abstract: In this paper, we tackle the problem of temporally consistent boundary detection and hierarchical segmentation in videos. While finding the best high-level reasoning of region assignments in videos is the focus of much recent research, temporal consistency in boundary detection has so far only rarely been tackled. We argue that temporally consistent boundaries are a key component to temporally con…

    Submitted 8 June, 2016; originally announced June 2016.

  4. arXiv:1605.09304  [pdf, other]

    cs.NE cs.AI cs.CV cs.LG

    Synthesizing the preferred inputs for neurons in neural networks via deep generator networks

    Authors: Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, Jeff Clune

    Abstract: Deep neural networks (DNNs) have demonstrated state-of-the-art results on many pattern recognition tasks, especially vision classification problems. Understanding the inner workings of such computational brains is both fascinating basic science that is interesting in its own right - similar to why we study the human brain - and will enable researchers to further improve DNNs. One path to understan…

    Submitted 23 November, 2016; v1 submitted 30 May, 2016; originally announced May 2016.

    Comments: 29 pages, 35 figures, NIPS camera-ready

  5. Artistic style transfer for videos

    Authors: Manuel Ruder, Alexey Dosovitskiy, Thomas Brox

    Abstract: In the past, manually re-drawing an image in a certain artistic style required a professional artist and a long time. Doing this for a video sequence single-handed was beyond imagination. Nowadays computers provide new possibilities. We present an approach that transfers the style from one image (for example, a painting) to a whole video sequence. We make use of recent advances in style transfer i…

    Submitted 19 October, 2016; v1 submitted 28 April, 2016; originally announced April 2016.

    Comments: final version appeared in GCPR-2016; minor changes to improve the clarity

    Journal ref: German Conference on Pattern Recognition (GCPR), LNCS 9796, pp. 26-36 (2016)

  6. arXiv:1604.05096  [pdf, other]

    cs.CV

    Pixel-level Encoding and Depth Layering for Instance-level Semantic Labeling

    Authors: Jonas Uhrig, Marius Cordts, Uwe Franke, Thomas Brox

    Abstract: Recent approaches for instance-aware semantic labeling have augmented convolutional neural networks (CNNs) with complex multi-task architectures or computationally expensive graphical models. We present a method that leverages a fully convolutional network (FCN) to predict semantic labels, depth and an instance-based encoding using each pixel's direction towards its corresponding instance center.…

    Submitted 14 July, 2016; v1 submitted 18 April, 2016; originally announced April 2016.

    Comments: Accepted at GCPR 2016. Includes supplementary material

  7. arXiv:1604.03351  [pdf, other]

    cs.CV cs.NE

    Orientation-boosted Voxel Nets for 3D Object Recognition

    Authors: Nima Sedaghat, Mohammadreza Zolfaghari, Ehsan Amiri, Thomas Brox

    Abstract: Recent work has shown good recognition results in 3D object recognition using 3D convolutional networks. In this paper, we show that the object orientation plays an important role in 3D recognition. More specifically, we argue that objects induce different features in the network under rotation. Thus, we approach the category-level classification task as a multi-task problem, in which the network…

    Submitted 19 October, 2017; v1 submitted 12 April, 2016; originally announced April 2016.

    Comments: BMVC'17 version. Added some experiments + auto-alignment of Modelnet40

  8. arXiv:1602.07080  [pdf, other]

    math.OC

    Techniques for Gradient Based Bilevel Optimization with Nonsmooth Lower Level Problems

    Authors: Peter Ochs, René Ranftl, Thomas Brox, Thomas Pock

    Abstract: We propose techniques for approximating bilevel optimization problems with non-smooth lower level problems that can have a non-unique solution. To this end, we substitute the expression of a minimizer of the lower level minimization problem with an iterative algorithm that is guaranteed to converge to a minimizer of the problem. Using suitable non-linear proximal distance functions, the update map…

    Submitted 26 April, 2016; v1 submitted 23 February, 2016; originally announced February 2016.

  9. arXiv:1602.02644  [pdf, other]

    cs.LG cs.CV cs.NE

    Generating Images with Perceptual Similarity Metrics based on Deep Networks

    Authors: Alexey Dosovitskiy, Thomas Brox

    Abstract: Image-generating machine learning models are typically trained with loss functions based on distance in the image space. This often leads to over-smoothed results. We propose a class of loss functions, which we call deep perceptual similarity metrics (DeePSiM), that mitigate this problem. Instead of computing distances in the image space, we compute distances between image features extracted by de…

    Submitted 9 February, 2016; v1 submitted 8 February, 2016; originally announced February 2016.

    Comments: minor corrections

  10. arXiv:1512.02134  [pdf, other]

    cs.CV cs.LG stat.ML

    A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation

    Authors: Nikolaus Mayer, Eddy Ilg, Philip Häusser, Philipp Fischer, Daniel Cremers, Alexey Dosovitskiy, Thomas Brox

    Abstract: Recent work has shown that optical flow estimation can be formulated as a supervised learning task and can be successfully solved with convolutional networks. Training of the so-called FlowNet was enabled by a large synthetically generated dataset. The present paper extends the concept of optical flow estimation via convolutional networks to disparity and scene flow estimation. To this end, we pro…

    Submitted 7 December, 2015; originally announced December 2015.

    Comments: Includes supplementary material

    ACM Class: I.2.6; I.2.10; I.4.8

  11. arXiv:1511.06702  [pdf, other]

    cs.CV

    Multi-view 3D Models from Single Images with a Convolutional Network

    Authors: Maxim Tatarchenko, Alexey Dosovitskiy, Thomas Brox

    Abstract: We present a convolutional network capable of inferring a 3D representation of a previously unseen object given a single image of this object. Concretely, the network can predict an RGB image and a depth map of the object as seen from an arbitrary view. Several of these depth maps fused together give a full point cloud of the object. The point cloud can in turn be transformed into a surface mesh.…

    Submitted 2 August, 2016; v1 submitted 20 November, 2015; originally announced November 2015.

  12. arXiv:1506.02753  [pdf, other]

    cs.NE cs.CV cs.LG

    Inverting Visual Representations with Convolutional Networks

    Authors: Alexey Dosovitskiy, Thomas Brox

    Abstract: Feature representations, both hand-designed and learned ones, are often hard to analyze and interpret, even when they are extracted from visual data. We propose a new approach to study image representations by inverting them with an up-convolutional neural network. We apply the method to shallow representations (HOG, SIFT, LBP), as well as to deep networks. For shallow representations our approach…

    Submitted 26 April, 2016; v1 submitted 8 June, 2015; originally announced June 2015.

    Comments: Version 4 - final version to appear in CVPR-2016. Visually better results obtained with feature similarity and adversarial training are in a different paper - arXiv:1602.02644

  13. arXiv:1505.06973  [pdf, other]

    cs.CV

    Efficient Decomposition of Image and Mesh Graphs by Lifted Multicuts

    Authors: Margret Keuper, Evgeny Levinkov, Nicolas Bonneel, Guillaume Lavoué, Thomas Brox, Bjoern Andres

    Abstract: Formulations of the Image Decomposition Problem as a Multicut Problem (MP) w.r.t. a superpixel graph have received considerable attention. In contrast, instances of the MP w.r.t. a pixel grid graph have received little attention, firstly, because the MP is NP-hard and instances w.r.t. a pixel grid graph are hard to solve in practice, and, secondly, due to the lack of long-range terms in the object…

    Submitted 11 September, 2015; v1 submitted 26 May, 2015; originally announced May 2015.

  14. arXiv:1505.04597  [pdf, other]

    cs.CV

    U-Net: Convolutional Networks for Biomedical Image Segmentation

    Authors: Olaf Ronneberger, Philipp Fischer, Thomas Brox

    Abstract: There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise…

    Submitted 18 May, 2015; originally announced May 2015.

    Comments: conditionally accepted at MICCAI 2015

  15. arXiv:1504.06852  [pdf, other]

    cs.CV cs.LG

    FlowNet: Learning Optical Flow with Convolutional Networks

    Authors: Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox

    Abstract: Convolutional neural networks (CNNs) have recently been very successful in a variety of computer vision tasks, especially on those linked to recognition. Optical flow estimation has not been among the tasks where CNNs were successful. In this paper we construct appropriate CNNs which are capable of solving the optical flow estimation problem as a supervised learning task. We propose and compare tw…

    Submitted 4 May, 2015; v1 submitted 26 April, 2015; originally announced April 2015.

    Comments: Added supplementary material

    ACM Class: I.2.6; I.4.8

  16. arXiv:1412.6806  [pdf, other]

    cs.LG cs.CV cs.NE

    Striving for Simplicity: The All Convolutional Net

    Authors: Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, Martin Riedmiller

    Abstract: Most modern convolutional neural networks (CNNs) used for object recognition are built using the same principles: Alternating convolution and max-pooling layers followed by a small number of fully connected layers. We re-evaluate the state of the art for object recognition from small images with convolutional networks, questioning the necessity of different components in the pipeline. We find that…

    Submitted 13 April, 2015; v1 submitted 21 December, 2014; originally announced December 2014.

    Comments: accepted to ICLR-2015 workshop track; no changes other than style

  17. arXiv:1411.5928  [pdf, other]

    cs.CV cs.LG cs.NE

    Learning to Generate Chairs, Tables and Cars with Convolutional Networks

    Authors: Alexey Dosovitskiy, Jost Tobias Springenberg, Maxim Tatarchenko, Thomas Brox

    Abstract: We train generative 'up-convolutional' neural networks which are able to generate images of objects given object style, viewpoint, and color. We train the networks on rendered 3D models of chairs, tables, and cars. Our experiments show that the networks do not merely learn all images by heart, but rather find a meaningful representation of 3D models allowing them to assess the similarity of differ…

    Submitted 2 August, 2017; v1 submitted 21 November, 2014; originally announced November 2014.

    Comments: v4: final PAMI version. New architecture figure

  18. arXiv:1406.6909  [pdf, other]

    cs.LG cs.CV cs.NE

    Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks

    Authors: Alexey Dosovitskiy, Philipp Fischer, Jost Tobias Springenberg, Martin Riedmiller, Thomas Brox

    Abstract: Deep convolutional networks have proven to be very successful in learning task specific features that allow for unprecedented performance on various computer vision tasks. Training of such networks follows mostly the supervised learning paradigm, where sufficiently many input-output pairs are required for training. Acquisition of large training sets is one of the key challenges, when approaching a…

    Submitted 19 June, 2015; v1 submitted 26 June, 2014; originally announced June 2014.

    Comments: PAMI submission. Includes matching experiments as in arXiv:1405.5769v1. Also includes new network architectures, experiments on Caltech-256, experiment on combining Exemplar-CNN with clustering

  19. arXiv:1405.5769   

    cs.CV cs.LG

    Descriptor Matching with Convolutional Neural Networks: a Comparison to SIFT

    Authors: Philipp Fischer, Alexey Dosovitskiy, Thomas Brox

    Abstract: Latest results indicate that features learned via convolutional neural networks outperform previous descriptors on classification tasks by a large margin. It has been shown that these networks still work well when they are applied to datasets or recognition tasks different from those they were trained on. However, descriptors like SIFT are not only used in recognition but also for many corresponde…

    Submitted 24 June, 2015; v1 submitted 22 May, 2014; originally announced May 2014.

    Comments: This paper has been merged with arXiv:1406.6909

    ACM Class: I.2.6; I.4.7; I.4.8

  20. arXiv:1404.4805  [pdf, other]

    cs.CV math.OC

    iPiano: Inertial Proximal Algorithm for Non-Convex Optimization

    Authors: Peter Ochs, Yunjin Chen, Thomas Brox, Thomas Pock

    Abstract: In this paper we study an algorithm for solving a minimization problem composed of a differentiable (possibly non-convex) and a convex (possibly non-differentiable) function. The algorithm iPiano combines forward-backward splitting with an inertial force. It can be seen as a non-smooth split version of the Heavy-ball method from Polyak. A rigorous analysis of the algorithm for the proposed class o…

    Submitted 18 April, 2014; originally announced April 2014.

    Comments: 32 pages, 7 figures, to appear in SIAM Journal on Imaging Sciences

  21. arXiv:1312.5242  [pdf, other]

    cs.CV cs.LG cs.NE

    Unsupervised feature learning by augmenting single images

    Authors: Alexey Dosovitskiy, Jost Tobias Springenberg, Thomas Brox

    Abstract: When deep learning is applied to visual object recognition, data augmentation is often used to generate additional training data without extra labeling cost. It helps to reduce overfitting and increase the performance of the algorithm. In this paper we investigate if it is possible to use data augmentation as the main component of an unsupervised feature learning architecture. To that end we sampl…

    Submitted 16 February, 2014; v1 submitted 18 December, 2013; originally announced December 2013.

    Comments: ICLR 2014 workshop track submission (7 pages, 4 figures, 1 table)