Showing 51–68 of 68 results for author: Torresani, L

  1. arXiv:1709.09582  [pdf, other]

    cs.LG cs.CV

    Connectivity Learning in Multi-Branch Networks

    Authors: Karim Ahmed, Lorenzo Torresani

    Abstract: While much of the work in the design of convolutional networks over the last five years has revolved around the empirical investigation of the importance of depth, filter sizes, and number of feature channels, recent studies have shown that branching, i.e., splitting the computation along parallel but distinct threads and then aggregating their outputs, represents a new promising dimension for sig…

    Submitted 7 December, 2017; v1 submitted 27 September, 2017; originally announced September 2017.

  2. arXiv:1709.08855  [pdf, other]

    cs.CV

    Learning to Inpaint for Image Compression

    Authors: Mohammad Haris Baig, Vladlen Koltun, Lorenzo Torresani

    Abstract: We study the design of deep architectures for lossy image compression. We present two architectural recipes in the context of multi-stage progressive encoders and empirically demonstrate their importance on compression performance. Specifically, we show that: (a) predicting the original image data from residuals in a multi-stage progressive architecture facilitates learning and leads to improved p…

    Submitted 10 November, 2017; v1 submitted 26 September, 2017; originally announced September 2017.

    Comments: Published in Advances in Neural Information Processing Systems (NIPS 2017)

  3. BranchConnect: Large-Scale Visual Recognition with Learned Branch Connections

    Authors: Karim Ahmed, Lorenzo Torresani

    Abstract: We introduce an architecture for large-scale image categorization that enables the end-to-end learning of separate visual features for the different classes to distinguish. The proposed model consists of a deep CNN shaped like a tree. The stem of the tree includes a sequence of convolutional layers common to all classes. The stem then splits into multiple branches implementing parallel feature ext…

    Submitted 29 July, 2018; v1 submitted 20 April, 2017; originally announced April 2017.

    Comments: WACV 2018

    Journal ref: IEEE Winter Conference on Applications of Computer Vision (WACV) 2018, pp. 1244-1253

  4. arXiv:1703.01550  [pdf, other]

    cs.CV

    Deep-Learning for Classification of Colorectal Polyps on Whole-Slide Images

    Authors: Bruno Korbar, Andrea M. Olofson, Allen P. Miraflor, Katherine M. Nicka, Matthew A. Suriawinata, Lorenzo Torresani, Arief A. Suriawinata, Saeed Hassanpour

    Abstract: Histopathological characterization of colorectal polyps is an important principle for determining the risk of colorectal cancer and future rates of surveillance for patients. This characterization is time-intensive, requires years of specialized training, and suffers from significant inter-observer and intra-observer variability. In this work, we built an automatic image-understanding method that…

    Submitted 12 April, 2017; v1 submitted 4 March, 2017; originally announced March 2017.

  5. arXiv:1606.07373  [pdf, other]

    cs.CV

    VideoMCC: a New Benchmark for Video Comprehension

    Authors: Du Tran, Maksim Bolonkin, Manohar Paluri, Lorenzo Torresani

    Abstract: While there is overall agreement that future technology for organizing, browsing and searching videos hinges on the development of methods for high-level semantic understanding of video, so far no consensus has been reached on the best way to train and assess models for this task. Casting video understanding as a form of action or event categorization is problematic as it is not fully clear what t…

    Submitted 16 June, 2017; v1 submitted 23 June, 2016; originally announced June 2016.

  6. arXiv:1606.06314  [pdf, other]

    cs.CV

    Multiple Hypothesis Colorization

    Authors: Mohammad Haris Baig, Lorenzo Torresani

    Abstract: In this work we focus on the problem of colorization for image compression. Since color information occupies a large proportion of the total storage size of an image, a method that can predict accurate color from its grayscale version can produce a dramatic reduction in image file size. But colorization for compression poses several challenges. First, while colorization for artistic purposes simply…

    Submitted 1 March, 2017; v1 submitted 20 June, 2016; originally announced June 2016.

    Comments: 16 Pages (including Appendices)

  7. arXiv:1605.07686  [pdf, other]

    cs.CV

    Local Perturb-and-MAP for Structured Prediction

    Authors: Gedas Bertasius, Qiang Liu, Lorenzo Torresani, Jianbo Shi

    Abstract: Conditional random fields (CRFs) provide a powerful tool for structured prediction, but pose significant challenges in both the learning and inference steps. Approximation techniques are widely used in both steps, which should be considered jointly to guarantee good performance (a.k.a. "inferning"). Perturb-and-MAP models provide a promising alternative to CRFs, but require global combinatorial op…

    Submitted 13 October, 2016; v1 submitted 24 May, 2016; originally announced May 2016.

  8. arXiv:1605.07681  [pdf, other]

    cs.CV

    Convolutional Random Walk Networks for Semantic Image Segmentation

    Authors: Gedas Bertasius, Lorenzo Torresani, Stella X. Yu, Jianbo Shi

    Abstract: Most current semantic segmentation methods rely on fully convolutional networks (FCNs). However, their use of large receptive fields and many pooling layers causes low spatial resolution inside the deep layers. This leads to predictions with poor localization around the boundaries. Prior work has attempted to address this issue by post-processing predictions with CRFs or MRFs. But such models often…

    Submitted 8 May, 2017; v1 submitted 24 May, 2016; originally announced May 2016.

  9. arXiv:1604.06119  [pdf, other]

    cs.CV

    Network of Experts for Large-Scale Image Categorization

    Authors: Karim Ahmed, Mohammad Haris Baig, Lorenzo Torresani

    Abstract: We present a tree-structured network architecture for large scale image classification. The trunk of the network contains convolutional layers optimized over all classes. At a given depth, the trunk splits into separate branches, each dedicated to discriminate a different subset of classes. Each branch acts as an expert classifying a set of categories that are difficult to tell apart, while the tr…

    Submitted 19 April, 2017; v1 submitted 20 April, 2016; originally announced April 2016.

    Comments: ECCV 2016

  10. arXiv:1603.08199  [pdf, other]

    cs.CV

    Recurrent Mixture Density Network for Spatiotemporal Visual Attention

    Authors: Loris Bazzani, Hugo Larochelle, Lorenzo Torresani

    Abstract: In many computer vision tasks, the relevant information to solve the problem at hand is mixed with irrelevant, distracting information. This has motivated researchers to design attentional models that can dynamically focus on parts of images or videos that are salient, e.g., by down-weighting irrelevant pixels. In this work, we propose a spatiotemporal attentional model that learns where to look in…

    Submitted 11 February, 2017; v1 submitted 27 March, 2016; originally announced March 2016.

    Comments: ICLR 2017

  11. arXiv:1511.06681  [pdf, other]

    cs.CV

    Deep End2End Voxel2Voxel Prediction

    Authors: Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

    Abstract: Over the last few years deep learning methods have emerged as one of the most prominent approaches for video analysis. However, so far their most successful applications have been in the area of video classification and detection, i.e., problems involving the prediction of a single class label or a handful of output variables per video. Furthermore, while deep networks are commonly recognized as t…

    Submitted 20 November, 2015; originally announced November 2015.

  12. arXiv:1511.02674  [pdf, other]

    cs.CV

    Semantic Segmentation with Boundary Neural Fields

    Authors: Gedas Bertasius, Jianbo Shi, Lorenzo Torresani

    Abstract: The state-of-the-art in semantic segmentation is currently represented by fully convolutional networks (FCNs). However, FCNs use large receptive fields and many pooling layers, both of which cause blurring and low spatial resolution in the deep layers. As a result FCNs tend to produce segmentations that are poorly localized around object boundaries. Prior work has attempted to address this issue i…

    Submitted 24 May, 2016; v1 submitted 9 November, 2015; originally announced November 2015.

  13. arXiv:1504.06201  [pdf, other]

    cs.CV

    High-for-Low and Low-for-High: Efficient Boundary Detection from Deep Object Features and its Applications to High-Level Vision

    Authors: Gedas Bertasius, Jianbo Shi, Lorenzo Torresani

    Abstract: Most of the current boundary detection systems rely exclusively on low-level features, such as color and texture. However, perception studies suggest that humans employ object-level reasoning when judging if a particular pixel is a boundary. Inspired by this observation, in this work we show how to predict boundaries by exploiting object-level features from a pretrained object-classification netwo…

    Submitted 21 September, 2015; v1 submitted 23 April, 2015; originally announced April 2015.

  14. arXiv:1501.04537  [pdf, other]

    cs.CV

    Coupled Depth Learning

    Authors: Mohammad Haris Baig, Lorenzo Torresani

    Abstract: In this paper we propose a method for estimating depth from a single image using a coarse to fine approach. We argue that modeling the fine depth details is easier after a coarse depth map has been computed. We express a global (coarse) depth map of an image as a linear combination of a depth basis learned from training examples. The depth basis captures spatial and statistical regularities and re…

    Submitted 9 February, 2016; v1 submitted 19 January, 2015; originally announced January 2015.

    Comments: 10 pages, 3 Figures, 4 Tables with quantitative evaluations

  15. arXiv:1412.1123  [pdf, other]

    cs.CV

    DeepEdge: A Multi-Scale Bifurcated Deep Network for Top-Down Contour Detection

    Authors: Gedas Bertasius, Jianbo Shi, Lorenzo Torresani

    Abstract: Contour detection has been a fundamental component in many image segmentation and object detection systems. Most previous work utilizes low-level features such as texture or saliency to detect contours and then uses them as cues for a higher-level task such as object detection. However, we claim that recognizing objects and predicting contours are two mutually related tasks. Contrary to traditional…

    Submitted 23 April, 2015; v1 submitted 2 December, 2014; originally announced December 2014.

    Comments: Accepted to CVPR 2015

  16. arXiv:1412.0767  [pdf, other]

    cs.CV

    Learning Spatiotemporal Features with 3D Convolutional Networks

    Authors: Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, Manohar Paluri

    Abstract: We propose a simple, yet effective approach for spatiotemporal feature learning using deep 3-dimensional convolutional networks (3D ConvNets) trained on a large scale supervised video dataset. Our findings are three-fold: 1) 3D ConvNets are more suitable for spatiotemporal feature learning compared to 2D ConvNets; 2) A homogeneous architecture with small 3x3x3 convolution kernels in all layers is…

    Submitted 6 October, 2015; v1 submitted 1 December, 2014; originally announced December 2014.

  17. arXiv:1409.3964  [pdf, other]

    cs.CV

    Self-taught Object Localization with Deep Networks

    Authors: Loris Bazzani, Alessandro Bergamo, Dragomir Anguelov, Lorenzo Torresani

    Abstract: This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional human supervision, i.e., without using any ground-truth bounding boxes for training. The key idea is to analyze the change in the recognition scores when artificially masking out different regions of…

    Submitted 2 February, 2016; v1 submitted 13 September, 2014; originally announced September 2014.

    Comments: WACV 2016

  18. arXiv:1312.5785  [pdf, other]

    cs.CV

    EXMOVES: Classifier-based Features for Scalable Action Recognition

    Authors: Du Tran, Lorenzo Torresani

    Abstract: This paper introduces EXMOVES, learned exemplar-based features for efficient recognition of actions in videos. The entries in our descriptor are produced by evaluating a set of movement classifiers over spatial-temporal volumes of the input sequence. Each movement classifier is a simple exemplar-SVM trained on low-level features, i.e., an SVM learned using a single annotated positive space-time vo…

    Submitted 27 March, 2014; v1 submitted 19 December, 2013; originally announced December 2013.

    Comments: In Proceedings of the 2nd International Conference on Learning Representations, Banff, Canada, 2014