Showing 151–200 of 207 results for author: Torr, P H S

  1. arXiv:1810.02340  [pdf, ps, other]

    cs.CV cs.LG

    SNIP: Single-shot Network Pruning based on Connection Sensitivity

    Authors: Namhoon Lee, Thalaiyasingam Ajanthan, Philip H. S. Torr

    Abstract: Pruning large neural networks while maintaining their performance is often desirable due to the reduced space and time complexity. In existing methods, pruning is done within an iterative optimization procedure with either heuristically designed pruning schedules or additional hyperparameters, undermining their utility. In this work, we present a new approach that prunes a given network once at in…

    Submitted 23 February, 2019; v1 submitted 4 October, 2018; originally announced October 2018.

    Comments: ICLR 2019

  2. arXiv:1808.03575  [pdf, other]

    cs.CV

    Weakly- and Semi-Supervised Panoptic Segmentation

    Authors: Qizhu Li, Anurag Arnab, Philip H. S. Torr

    Abstract: We present a weakly supervised model that jointly performs both semantic- and instance-segmentation -- a particularly relevant problem given the substantial cost of obtaining pixel-perfect annotation for these tasks. In contrast to many popular instance segmentation approaches based on object detectors, our method does not predict any overlapping instances. Moreover, we are able to segment both "t…

    Submitted 12 January, 2019; v1 submitted 10 August, 2018; originally announced August 2018.

    Comments: ECCV 2018. The first two authors contributed equally

  3. arXiv:1807.04200  [pdf, other]

    cs.CV

    With Friends Like These, Who Needs Adversaries?

    Authors: Saumya Jetley, Nicholas A. Lord, Philip H. S. Torr

    Abstract: The vulnerability of deep image classification networks to adversarial attack is now well known, but less well understood. Via a novel experimental analysis, we illustrate some facts about deep convolutional networks for image classification that shed new light on their behaviour and how it connects to the problem of adversaries. In short, the celebrated performance of these networks and their vul…

    Submitted 8 January, 2019; v1 submitted 11 July, 2018; originally announced July 2018.

    Comments: Published in this form at NeurIPS 2018

  4. arXiv:1805.11199  [pdf, other]

    cs.AI cs.LG

    Value Propagation Networks

    Authors: Nantas Nardelli, Gabriel Synnaeve, Zeming Lin, Pushmeet Kohli, Philip H. S. Torr, Nicolas Usunier

    Abstract: We present Value Propagation (VProp), a set of parameter-efficient differentiable planning modules built on Value Iteration which can successfully be trained using reinforcement learning to solve unseen tasks, has the capability to generalize to larger map sizes, and can learn to navigate in dynamic environments. We show that the modules enable learning to plan when the environment also includes s…

    Submitted 25 March, 2019; v1 submitted 28 May, 2018; originally announced May 2018.

    Comments: Updated to match ICLR 2019 OpenReview's version

  5. arXiv:1805.09028  [pdf, other]

    cs.CV

    Efficient Relaxations for Dense CRFs with Sparse Higher Order Potentials

    Authors: Thomas Joy, Alban Desmaison, Thalaiyasingam Ajanthan, Rudy Bunel, Mathieu Salzmann, Pushmeet Kohli, Philip H. S. Torr, M. Pawan Kumar

    Abstract: Dense conditional random fields (CRFs) have become a popular framework for modelling several problems in computer vision such as stereo correspondence and multi-class semantic segmentation. By modelling long-range interactions, dense CRFs provide a labelling that captures finer detail than their sparse counterparts. Currently, the state-of-the-art algorithm performs mean-field inference using a fi…

    Submitted 26 October, 2018; v1 submitted 23 May, 2018; originally announced May 2018.

  6. arXiv:1805.08136  [pdf, other]

    cs.CV cs.LG stat.ML

    Meta-learning with differentiable closed-form solvers

    Authors: Luca Bertinetto, João F. Henriques, Philip H. S. Torr, Andrea Vedaldi

    Abstract: Adapting deep networks to new concepts from a few examples is challenging, due to the high computational requirements of standard fine-tuning procedures. Most work on few-shot learning has thus focused on simple learning techniques for adaptation, such as nearest neighbours or gradient descent. Nonetheless, the machine learning literature contains a wealth of methods that learn non-deep models ver…

    Submitted 24 July, 2019; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: Published at ICLR'19. Code and data available at http://www.robots.ox.ac.uk/~luca/r2d2.html

  7. arXiv:1804.07090  [pdf, other]

    cs.LG cs.AI stat.ML

    Robustness via Deep Low-Rank Representations

    Authors: Amartya Sanyal, Varun Kanade, Philip H. S. Torr, Puneet K. Dokania

    Abstract: We investigate the effect of the dimensionality of the representations learned in Deep Neural Networks (DNNs) on their robustness to input perturbations, both adversarial and random. To achieve low dimensionality of learned representations, we propose an easy-to-use, end-to-end trainable, low-rank regularizer (LR) that can be applied to any intermediate layer representation of a DNN. This regulari…

    Submitted 19 February, 2020; v1 submitted 19 April, 2018; originally announced April 2018.

  8. arXiv:1804.02391  [pdf, other]

    cs.CV cs.AI

    Learn To Pay Attention

    Authors: Saumya Jetley, Nicholas A. Lord, Namhoon Lee, Philip H. S. Torr

    Abstract: We propose an end-to-end-trainable attention module for convolutional neural network (CNN) architectures built for image classification. The module takes as input the 2D feature vector maps which form the intermediate representations of the input image at different stages in the CNN pipeline, and outputs a 2D matrix of scores for each map. Standard CNN architectures are modified through the incorp…

    Submitted 26 April, 2018; v1 submitted 6 April, 2018; originally announced April 2018.

    Comments: International Conference on Learning Representations 2018

  9. arXiv:1803.09860  [pdf, other]

    cs.CV

    Three Birds One Stone: A General Architecture for Salient Object Segmentation, Edge Detection and Skeleton Extraction

    Authors: Qibin Hou, Jiang-Jiang Liu, Ming-Ming Cheng, Ali Borji, Philip H. S. Torr

    Abstract: In this paper, we aim at solving pixel-wise binary problems, including salient object segmentation, skeleton extraction, and edge detection, by introducing a unified architecture. Previous works have proposed tailored methods for solving each of the three tasks independently. Here, we show that these tasks share some similarities that can be exploited for developing a unified framework. In particu…

    Submitted 5 April, 2019; v1 submitted 26 March, 2018; originally announced March 2018.

  10. arXiv:1803.09859  [pdf, other]

    cs.CV

    WebSeg: Learning Semantic Segmentation from Web Searches

    Authors: Qibin Hou, Ming-Ming Cheng, Jiangjiang Liu, Philip H. S. Torr

    Abstract: In this paper, we improve semantic segmentation by automatically learning from Flickr images associated with a particular keyword, without relying on any explicit user annotations, thus substantially alleviating the dependence on accurate annotations when compared to previous weakly supervised methods. To solve such a challenging problem, we leverage several low-level cues (such as saliency, edg…

    Submitted 26 March, 2018; originally announced March 2018.

    Comments: Submitted to ECCV2018

  11. arXiv:1802.07351  [pdf, other]

    cs.CV

    Devon: Deformable Volume Network for Learning Optical Flow

    Authors: Yao Lu, Jack Valmadre, Heng Wang, Juho Kannala, Mehrtash Harandi, Philip H. S. Torr

    Abstract: State-of-the-art neural network models estimate large displacement optical flow in multi-resolution and use warping to propagate the estimation between two resolutions. Despite their impressive results, it is known that there are two problems with the approach. First, the multi-resolution estimation of optical flow fails in situations where small objects move fast. Second, warping creates artifact…

    Submitted 4 March, 2019; v1 submitted 20 February, 2018; originally announced February 2018.

  12. Real-Time Dense Stereo Matching With ELAS on FPGA Accelerated Embedded Devices

    Authors: Oscar Rahnama, Duncan Frost, Ondrej Miksik, Philip H. S. Torr

    Abstract: For many applications in low-power real-time robotics, stereo cameras are the sensors of choice for depth perception as they are typically cheaper and more versatile than their active counterparts. Their biggest drawback, however, is that they do not directly sense depth maps; instead, these must be estimated through data-intensive processes. Therefore, appropriate algorithm selection plays an imp…

    Submitted 20 February, 2018; originally announced February 2018.

    Comments: 8 pages, 7 figures, 2 tables

    Journal ref: IEEE Robotics and Automation Letters, vol. 3, no. 3, pp. 2008-2015, July 2018

  13. arXiv:1802.03803  [pdf, other]

    cs.CV

    FlipDial: A Generative Model for Two-Way Visual Dialogue

    Authors: Daniela Massiceti, N. Siddharth, Puneet K. Dokania, Philip H. S. Torr

    Abstract: We present FlipDial, a generative model for visual dialogue that simultaneously plays the role of both participants in a visually-grounded dialogue. Given context in the form of an image and an associated caption summarising the contents of the image, FlipDial learns both to answer questions and put forward questions, capable of generating entire sequences of dialogue (question-answer pairs) which…

    Submitted 3 April, 2018; v1 submitted 11 February, 2018; originally announced February 2018.

  14. Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence

    Authors: Arslan Chaudhry, Puneet K. Dokania, Thalaiyasingam Ajanthan, Philip H. S. Torr

    Abstract: Incremental learning (IL) has received a lot of attention recently, however, the literature lacks a precise problem definition, proper evaluation settings, and metrics tailored specifically for the IL problem. One of the main objectives of this work is to fill these gaps so as to provide a common ground for better understanding of IL. The main challenge for an IL algorithm is to update the classif…

    Submitted 14 August, 2018; v1 submitted 30 January, 2018; originally announced January 2018.

  15. Collaborative Large-Scale Dense 3D Reconstruction with Online Inter-Agent Pose Optimisation

    Authors: Stuart Golodetz, Tommaso Cavallari, Nicholas A Lord, Victor A Prisacariu, David W Murray, Philip H S Torr

    Abstract: Reconstructing dense, volumetric models of real-world 3D scenes is important for many tasks, but capturing large scenes can take significant time, and the risk of transient changes to the scene goes up as the capture time increases. These are good reasons to want instead to capture several smaller sub-scenes that can be joined to make the whole scene. Achieving this has traditionally been difficul…

    Submitted 2 July, 2019; v1 submitted 25 January, 2018; originally announced January 2018.

    Comments: Stuart Golodetz, Tommaso Cavallari and Nicholas Lord assert joint first authorship

    MSC Class: 68T45

    Journal ref: IEEE Transactions on Visualization and Computer Graphics 24(11):2895-2905, 2018

  16. arXiv:1711.09856  [pdf, other]

    cs.CV

    On the Robustness of Semantic Segmentation Models to Adversarial Attacks

    Authors: Anurag Arnab, Ondrej Miksik, Philip H. S. Torr

    Abstract: Deep Neural Networks (DNNs) have demonstrated exceptional performance on most recognition tasks such as image classification and segmentation. However, they have also been shown to be vulnerable to adversarial examples. This phenomenon has recently attracted a lot of attention but it has not been extensively studied on multiple, large-scale datasets and structured prediction tasks such as semantic…

    Submitted 8 July, 2018; v1 submitted 27 November, 2017; originally announced November 2017.

    Comments: CVPR 2018 extended version

  17. arXiv:1711.06025  [pdf, other]

    cs.CV

    Learning to Compare: Relation Network for Few-Shot Learning

    Authors: Flood Sung, Yongxin Yang, Li Zhang, Tao Xiang, Philip H. S. Torr, Timothy M. Hospedales

    Abstract: We present a conceptually simple, flexible, and general framework for few-shot learning, where a classifier must learn to recognise new classes given only few examples from each. Our method, called the Relation Network (RN), is trained end-to-end from scratch. During meta-learning, it learns to learn a deep distance metric to compare a small number of images within episodes, each of which is desig…

    Submitted 27 March, 2018; v1 submitted 16 November, 2017; originally announced November 2017.

    Comments: To appear in CVPR2018

  18. arXiv:1711.00455  [pdf, ps, other]

    cs.AI cs.LG

    A Unified View of Piecewise Linear Neural Network Verification

    Authors: Rudy Bunel, Ilker Turkaslan, Philip H. S. Torr, Pushmeet Kohli, M. Pawan Kumar

    Abstract: The success of Deep Learning and its potential use in many safety-critical applications has motivated research on formal verification of Neural Network (NN) models. Despite the reputation of learned NN models to behave as black boxes and the theoretical hardness of proving their properties, researchers have been successful in verifying some classes of models by exploiting their piecewise linear st…

    Submitted 22 May, 2018; v1 submitted 1 November, 2017; originally announced November 2017.

    Comments: Updated version of "Piecewise Linear Neural Network verification: A comparative study"

  19. arXiv:1709.03612  [pdf, other]

    cs.CV

    Holistic, Instance-Level Human Parsing

    Authors: Qizhu Li, Anurag Arnab, Philip H. S. Torr

    Abstract: Object parsing -- the task of decomposing an object into its semantic parts -- has traditionally been formulated as a category-level segmentation problem. Consequently, when there are multiple objects in an image, current methods cannot count the number of objects in the scene, nor can they determine which part belongs to which object. We address this problem by segmenting the parts of objects at…

    Submitted 11 September, 2017; originally announced September 2017.

    Comments: Poster at BMVC 2017

  20. arXiv:1708.00783  [pdf, other]

    cs.CV

    InfiniTAM v3: A Framework for Large-Scale 3D Reconstruction with Loop Closure

    Authors: Victor Adrian Prisacariu, Olaf Kähler, Stuart Golodetz, Michael Sapienza, Tommaso Cavallari, Philip H S Torr, David W Murray

    Abstract: Volumetric models have become a popular representation for 3D scenes in recent years. One breakthrough leading to their popularity was KinectFusion, which focuses on 3D reconstruction using RGB-D sensors. However, monocular SLAM has since also been tackled with very similar approaches. Representing the reconstruction volumetrically as a TSDF leads to most of the simplicity and efficiency that can…

    Submitted 2 August, 2017; originally announced August 2017.

    Comments: This article largely supersedes arxiv:1410.0925 (it describes version 3 of the InfiniTAM framework)

  21. arXiv:1707.07213  [pdf, other]

    cs.CV

    Spatio-temporal Human Action Localisation and Instance Segmentation in Temporally Untrimmed Videos

    Authors: Suman Saha, Gurkirt Singh, Michael Sapienza, Philip H. S. Torr, Fabio Cuzzolin

    Abstract: Current state-of-the-art human action recognition is focused on the classification of temporally trimmed videos in which only one action occurs per frame. In this work we address the problem of action localisation and instance segmentation in which multiple concurrent actions of the same class may be segmented out of an image sequence. We cast the action tube extraction as an energy maximisation p…

    Submitted 6 August, 2017; v1 submitted 22 July, 2017; originally announced July 2017.

    Comments: Typos corrected

  22. arXiv:1707.05821  [pdf, other]

    cs.CV

    Discovering Class-Specific Pixels for Weakly-Supervised Semantic Segmentation

    Authors: Arslan Chaudhry, Puneet K. Dokania, Philip H. S. Torr

    Abstract: We propose an approach to discover class-specific pixels for the weakly-supervised semantic segmentation task. We show that properly combining saliency and attention maps allows us to obtain reliable cues capable of significantly boosting the performance. First, we propose a simple yet powerful hierarchical approach to discover the class-agnostic salient regions, obtained using a salient object de…

    Submitted 18 July, 2017; originally announced July 2017.

    Journal ref: 28th British Machine Vision Conference (BMVC), 2017

  23. arXiv:1706.00400  [pdf, other]

    stat.ML cs.AI cs.LG

    Learning Disentangled Representations with Semi-Supervised Deep Generative Models

    Authors: N. Siddharth, Brooks Paige, Jan-Willem van de Meent, Alban Desmaison, Noah D. Goodman, Pushmeet Kohli, Frank Wood, Philip H. S. Torr

    Abstract: Variational autoencoders (VAEs) learn representations of data by jointly training a probabilistic encoder and decoder network. Typically these models encode all features of the data into a single variable. Here we are interested in learning disentangled representations that encode distinct aspects of the data into separate variables. We propose to learn such representations using model architectur…

    Submitted 13 November, 2017; v1 submitted 1 June, 2017; originally announced June 2017.

    Comments: Accepted for publication at NIPS 2017

  24. arXiv:1704.06036  [pdf, other]

    cs.CV cs.LG

    End-to-end representation learning for Correlation Filter based tracking

    Authors: Jack Valmadre, Luca Bertinetto, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr

    Abstract: The Correlation Filter is an algorithm that trains a linear template to discriminate between images and their translations. It is well suited to object tracking because its formulation in the Fourier domain provides a fast solution, enabling the detector to be re-trained once per frame. Previous works that use the Correlation Filter, however, have adopted features that were either manually designe…

    Submitted 20 April, 2017; originally announced April 2017.

    Comments: To appear at CVPR 2017

  25. arXiv:1704.04394  [pdf, other]

    cs.CV

    DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

    Authors: Namhoon Lee, Wongun Choi, Paul Vernaza, Christopher B. Choy, Philip H. S. Torr, Manmohan Chandraker

    Abstract: We introduce a Deep Stochastic IOC RNN Encoder-decoder framework, DESIRE, for the task of future predictions of multiple interacting agents in dynamic scenes. DESIRE effectively predicts future locations of objects in multiple scenes by 1) accounting for the multi-modal nature of the future prediction (i.e., given the same context, future may vary), 2) foreseeing the potential future outcomes and m…

    Submitted 14 April, 2017; originally announced April 2017.

    Comments: Accepted at CVPR 2017

  26. arXiv:1704.02906  [pdf, other]

    cs.CV cs.AI cs.GR cs.LG stat.ML

    Multi-Agent Diverse Generative Adversarial Networks

    Authors: Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri, Philip H. S. Torr, Puneet K. Dokania

    Abstract: We propose MAD-GAN, an intuitive generalization to the Generative Adversarial Networks (GANs) and its conditional variants to address the well known problem of mode collapse. First, MAD-GAN is a multi-agent GAN architecture incorporating multiple generators and one discriminator. Second, to enforce that different generators capture diverse high probability modes, the discriminator of MAD-GAN is de…

    Submitted 16 July, 2018; v1 submitted 10 April, 2017; originally announced April 2017.

    Comments: This is an updated version of our CVPR'18 paper with the same title. In this version, we also introduce MAD-GAN-Sim in Appendix B

  27. arXiv:1704.02386  [pdf, other]

    cs.CV

    Pixelwise Instance Segmentation with a Dynamically Instantiated Network

    Authors: Anurag Arnab, Philip H. S. Torr

    Abstract: Semantic segmentation and object detection research have recently achieved rapid progress. However, the former task has no notion of different instances of the same object, and the latter operates at a coarse, bounding-box level. We propose an Instance Segmentation system that produces a segmentation map where each pixel is assigned an object class and instance identity label. Most approaches adap…

    Submitted 7 April, 2017; originally announced April 2017.

    Comments: CVPR 2017

  28. arXiv:1704.01358  [pdf, other]

    cs.CV

    Incremental Tube Construction for Human Action Detection

    Authors: Harkirat Singh Behl, Michael Sapienza, Gurkirt Singh, Suman Saha, Fabio Cuzzolin, Philip H. S. Torr

    Abstract: Current state-of-the-art action detection systems are tailored for offline batch-processing applications. However, for online applications like human-robot interaction, current systems fall short, either because they only detect one action per video, or because they assume that the entire video is available ahead of time. In this work, we introduce a real-time and online joint-labelling and associ…

    Submitted 23 July, 2018; v1 submitted 5 April, 2017; originally announced April 2017.

    Comments: British Machine Vision Conference (BMVC) 2018

  29. arXiv:1702.08887  [pdf, other]

    cs.AI cs.LG cs.MA

    Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning

    Authors: Jakob Foerster, Nantas Nardelli, Gregory Farquhar, Triantafyllos Afouras, Philip H. S. Torr, Pushmeet Kohli, Shimon Whiteson

    Abstract: Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A major stumbling block is that indep…

    Submitted 21 May, 2018; v1 submitted 28 February, 2017; originally announced February 2017.

    Comments: Camera-ready version, International Conference of Machine Learning 2017; updated to fix print-breaking image

  30. arXiv:1702.02779  [pdf, other]

    cs.CV

    On-the-Fly Adaptation of Regression Forests for Online Camera Relocalisation

    Authors: Tommaso Cavallari, Stuart Golodetz, Nicholas A. Lord, Julien Valentin, Luigi Di Stefano, Philip H. S. Torr

    Abstract: Camera relocalisation is an important problem in computer vision, with applications in simultaneous localisation and mapping, virtual/augmented reality and navigation. Common techniques either match the current image against keyframes with known poses coming from a tracker, or establish 2D-to-3D correspondences between keypoints in the current image and points in the scene in order to estimate the…

    Submitted 26 June, 2017; v1 submitted 9 February, 2017; originally announced February 2017.

    Comments: To appear in the proceedings of CVPR 2017

  31. arXiv:1612.01495  [pdf, other]

    cs.CV

    ROAM: a Rich Object Appearance Model with Application to Rotoscoping

    Authors: Ondrej Miksik, Juan-Manuel Pérez-Rúa, Philip H. S. Torr, Patrick Pérez

    Abstract: Rotoscoping, the detailed delineation of scene elements through a video shot, is a painstaking task of tremendous importance in professional post-production pipelines. While pixel-wise segmentation techniques can help for this task, professional rotoscoping tools rely on parametric curves that offer the artists a much better interactive control on the definition, editing and manipulation of the se…

    Submitted 5 December, 2016; originally announced December 2016.

  32. arXiv:1612.01094  [pdf, other]

    cs.LG

    Learning to superoptimize programs - Workshop Version

    Authors: Rudy Bunel, Alban Desmaison, M. Pawan Kumar, Philip H. S. Torr, Pushmeet Kohli

    Abstract: Superoptimization requires the estimation of the best program for a given computational task. In order to deal with large programs, superoptimization techniques perform a stochastic search. This involves proposing a modification of the current program, which is accepted or rejected based on the improvement achieved. The state of the art method uses uniform proposal distributions, which fails to ex…

    Submitted 4 December, 2016; originally announced December 2016.

    Comments: Workshop version for the NIPS NAMPI Workshop. Extended version at arXiv:1611.01787

  33. arXiv:1612.00380  [pdf, other]

    cs.AI cs.CV stat.ML

    Playing Doom with SLAM-Augmented Deep Reinforcement Learning

    Authors: Shehroze Bhatti, Alban Desmaison, Ondrej Miksik, Nantas Nardelli, N. Siddharth, Philip H. S. Torr

    Abstract: A number of recent approaches to policy learning in 2D game domains have been successful going directly from raw input images to actions. However when employed in complex 3D environments, they typically suffer from challenges related to partial observability, combinatorial exploration spaces, path planning, and a scarcity of rewarding scenarios. Inspired from prior work in human cognition that ind…

    Submitted 1 December, 2016; originally announced December 2016.

  34. arXiv:1611.09718  [pdf, other]

    cs.CV

    Efficient Linear Programming for Dense CRFs

    Authors: Thalaiyasingam Ajanthan, Alban Desmaison, Rudy Bunel, Mathieu Salzmann, Philip H. S. Torr, M. Pawan Kumar

    Abstract: The fully connected conditional random field (CRF) with Gaussian pairwise potentials has proven popular and effective for multi-class semantic segmentation. While the energy of a dense CRF can be minimized accurately using a linear programming (LP) relaxation, the state-of-the-art algorithm is too slow to be useful in practice. To alleviate this deficiency, we introduce an efficient LP minimizatio…

    Submitted 14 February, 2017; v1 submitted 29 November, 2016; originally announced November 2016.

    Comments: 24 pages, 10 figures and 4 tables

    ACM Class: G.1.6; I.4.6

  35. arXiv:1611.07932  [pdf, other]

    cs.CV

    Straight to Shapes: Real-time Detection of Encoded Shapes

    Authors: Saumya Jetley, Michael Sapienza, Stuart Golodetz, Philip H. S. Torr

    Abstract: Current object detection approaches predict bounding boxes, but these provide little instance-specific information beyond location, scale and aspect ratio. In this work, we propose to directly regress to objects' shapes in addition to their bounding boxes and categories. It is crucial to find an appropriate shape representation that is compact and decodable, and in which objects can be compared fo…

    Submitted 5 July, 2017; v1 submitted 23 November, 2016; originally announced November 2016.

    Comments: 16 pages including appendix; Published at CVPR 2017

  36. arXiv:1611.07492  [pdf, other]

    stat.ML cs.CV cs.LG

    Inducing Interpretable Representations with Variational Autoencoders

    Authors: N. Siddharth, Brooks Paige, Alban Desmaison, Jan-Willem Van de Meent, Frank Wood, Noah D. Goodman, Pushmeet Kohli, Philip H. S. Torr

    Abstract: We develop a framework for incorporating structured graphical models in the encoders of variational autoencoders (VAEs) that allows us to induce interpretable representations through approximate variational inference. This allows us to both perform reasoning (e.g. classification) under the structural constraints of a given graphical model, and use deep generative models to deal with messy,…

    Submitted 22 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  37. arXiv:1611.01787  [pdf, other]

    cs.LG

    Learning to superoptimize programs

    Authors: Rudy Bunel, Alban Desmaison, M. Pawan Kumar, Philip H. S. Torr, Pushmeet Kohli

    Abstract: Code super-optimization is the task of transforming any given program to a more efficient version while preserving its input-output behaviour. In some sense, it is similar to the paraphrase problem from natural language processing where the intention is to change the syntax of an utterance without changing its semantics. Code-optimization has been the subject of years of research that has resulted…

    Submitted 28 June, 2017; v1 submitted 6 November, 2016; originally announced November 2016.

    Comments: Accepted to ICLR 2017

  38. arXiv:1609.05797  [pdf, other]

    cs.CV cs.RO

    Random Forests versus Neural Networks - What's Best for Camera Localization?

    Authors: Daniela Massiceti, Alexander Krull, Eric Brachmann, Carsten Rother, Philip H. S. Torr

    Abstract: This work addresses the task of camera localization in a known 3D scene given a single input RGB image. State-of-the-art approaches accomplish this in two steps: firstly, regressing for every pixel in the image its 3D scene coordinate and subsequently, using these coordinates to estimate the final 6D camera pose via RANSAC. To solve the first step, Random Forests (RFs) are typically used. On the o…

    Submitted 13 July, 2017; v1 submitted 19 September, 2016; originally announced September 2016.

    Comments: 8 pages, 4 figures

  39. arXiv:1609.03532  [pdf, other]

    cs.CV

    Fully-Trainable Deep Matching

    Authors: James Thewlis, Shuai Zheng, Philip H. S. Torr, Andrea Vedaldi

    Abstract: Deep Matching (DM) is a popular high-quality method for quasi-dense image matching. Despite its name, however, the original DM formulation does not yield a deep neural network that can be trained end-to-end via backpropagation. In this paper, we remove this limitation by rewriting the complete DM algorithm as a convolutional neural network. This results in a novel deep architecture for image match…

    Submitted 12 September, 2016; originally announced September 2016.

    Comments: British Machine Vision Conference (BMVC) 2016

  40. arXiv:1609.02583  [pdf, other]

    cs.CV

    Bottom-up Instance Segmentation using Deep Higher-Order CRFs

    Authors: Anurag Arnab, Philip H. S. Torr

    Abstract: Traditional Scene Understanding problems such as Object Detection and Semantic Segmentation have made breakthroughs in recent years due to the adoption of deep learning. However, the former task is not able to localise objects at a pixel level, and the latter task has no notion of different instances of objects of the same class. We focus on the task of Instance Segmentation which recognises and l…

    Submitted 8 September, 2016; originally announced September 2016.

    Comments: British Machine Vision Conference (BMVC) 2016

  41. arXiv:1608.06192  [pdf, other]

    cs.CV

    Efficient Continuous Relaxations for Dense CRF

    Authors: Alban Desmaison, Rudy Bunel, Pushmeet Kohli, Philip H. S. Torr, M. Pawan Kumar

    Abstract: Dense conditional random fields (CRF) with Gaussian pairwise potentials have emerged as a popular framework for several computer vision applications such as stereo correspondence and semantic segmentation. By modeling long-range interactions, dense CRFs provide a more detailed labelling compared to their sparse counterparts. Variational inference in these dense models is performed using a filterin…

    Submitted 22 August, 2016; originally announced August 2016.

  42. arXiv:1608.01529  [pdf, ps, other]

    cs.CV

    Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos

    Authors: Suman Saha, Gurkirt Singh, Michael Sapienza, Philip H. S. Torr, Fabio Cuzzolin

    Abstract: In this work, we propose an approach to the spatiotemporal localisation (detection) and classification of multiple concurrent actions within temporally untrimmed videos. Our framework is composed of three stages. In stage 1, appearance and motion detection networks are employed to localise and score actions from colour images and optical flow. In stage 2, the appearance network detections are boos…

    Submitted 4 August, 2016; originally announced August 2016.

    Comments: Accepted by British Machine Vision Conference 2016

  43. arXiv:1606.09549  [pdf, other]

    cs.CV

    Fully-Convolutional Siamese Networks for Object Tracking

    Authors: Luca Bertinetto, Jack Valmadre, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr

    Abstract: The problem of arbitrary object tracking has traditionally been tackled by learning a model of the object's appearance exclusively online, using as sole training data the video itself. Despite the success of these methods, their online-only approach inherently limits the richness of the model they can learn. Recently, several attempts have been made to exploit the expressive power of deep convolut…

    Submitted 1 December, 2021; v1 submitted 30 June, 2016; originally announced June 2016.

    Comments: The first two authors contributed equally, and are listed in alphabetical order. Code available at http://www.robots.ox.ac.uk/~luca/siamese-fc.html

  44. arXiv:1606.05233  [pdf, other]

    cs.CV cs.LG

    Learning feed-forward one-shot learners

    Authors: Luca Bertinetto, João F. Henriques, Jack Valmadre, Philip H. S. Torr, Andrea Vedaldi

    Abstract: One-shot learning is usually tackled by using generative models or discriminative embeddings. Discriminative methods based on deep learning, which are very effective in other learning scenarios, are ill-suited for one-shot learning as they need large amounts of training data. In this paper, we propose a method to learn the parameters of a deep model in one shot. We construct the learner as a secon…

    Submitted 16 June, 2016; originally announced June 2016.

    Comments: The first three authors contributed equally, and are listed in alphabetical order

  45. arXiv:1605.07969  [pdf, other]

    cs.AI cs.LG

    Adaptive Neural Compilation

    Authors: Rudy Bunel, Alban Desmaison, Pushmeet Kohli, Philip H. S. Torr, M. Pawan Kumar

    Abstract: This paper proposes an adaptive neural-compilation framework to address the problem of efficient program learning. Traditional code optimisation strategies used in compilers are based on applying a pre-specified set of transformations that make the code faster to execute without changing its semantics. In contrast, our work involves adapting programs to make them more efficient while considering cor…

    Submitted 26 May, 2016; v1 submitted 25 May, 2016; originally announced May 2016.

    Comments: Submitted to NIPS 2016, code and supplementary materials will be available on author's page

  46. arXiv:1511.08250  [pdf, other]

    cs.CV cs.AI

    Recurrent Instance Segmentation

    Authors: Bernardino Romera-Paredes, Philip H. S. Torr

    Abstract: Instance segmentation is the problem of detecting and delineating each distinct object of interest appearing in an image. Current instance segmentation approaches consist of ensembles of modules that are trained independently of each other, thus missing opportunities for joint learning. Here we propose a new instance segmentation paradigm consisting in an end-to-end method that learns how to segme…

    Submitted 24 October, 2016; v1 submitted 25 November, 2015; originally announced November 2015.

    Comments: 14 pages (main paper). 24 pages including references and appendix

    Journal ref: ECCV 2016. 14th European Conference on Computer Vision

  47. arXiv:1511.05067  [pdf, other]

    cs.CV

    Joint Training of Generic CNN-CRF Models with Stochastic Optimization

    Authors: Alexander Kirillov, Dmitrij Schlesinger, Shuai Zheng, Bogdan Savchynskyy, Philip H. S. Torr, Carsten Rother

    Abstract: We propose a new CNN-CRF end-to-end learning framework, which is based on joint stochastic optimization with respect to both Convolutional Neural Network (CNN) and Conditional Random Field (CRF) parameters. While stochastic gradient descent is a standard technique for CNN training, it has not been used for joint models so far. We show that our learning method is (i) general, i.e. it applies to arbitrar…

    Submitted 14 September, 2016; v1 submitted 16 November, 2015; originally announced November 2015.

    Comments: ACCV 2016

  48. arXiv:1511.04511  [pdf, other]

    cs.CV

    Sequential Optimization for Efficient High-Quality Object Proposal Generation

    Authors: Ziming Zhang, Yun Liu, Xi Chen, Yanjun Zhu, Ming-Ming Cheng, Venkatesh Saligrama, Philip H. S. Torr

    Abstract: We are motivated by the need for a generic object proposal generation algorithm which achieves a good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING++, which inherits the virtue of good computational efficiency of BING but significantly improves its proposal localization quality. At a high level we…

    Submitted 22 May, 2017; v1 submitted 14 November, 2015; originally announced November 2015.

    Comments: Accepted by TPAMI

  49. SemanticPaint: A Framework for the Interactive Segmentation of 3D Scenes

    Authors: Stuart Golodetz, Michael Sapienza, Julien P. C. Valentin, Vibhav Vineet, Ming-Ming Cheng, Anurag Arnab, Victor A. Prisacariu, Olaf Kähler, Carl Yuheng Ren, David W. Murray, Shahram Izadi, Philip H. S. Torr

    Abstract: We present an open-source, real-time implementation of SemanticPaint, a system for geometric reconstruction, object-class segmentation and learning of 3D scenes. Using our system, a user can walk into a room wearing a depth camera and a virtual reality headset, and both densely reconstruct the 3D scene and interactively segment the environment into object classes such as 'chair', 'floor' and 'tabl…

    Submitted 13 October, 2015; originally announced October 2015.

    Comments: 33 pages, Project: http://www.semantic-paint.com, Code: https://github.com/torrvision/spaint

    ACM Class: I.2.10

  50. Conditional Random Fields as Recurrent Neural Networks

    Authors: Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, Philip H. S. Torr

    Abstract: Pixel-level labelling tasks, such as semantic segmentation, play a central role in image understanding. Recent approaches have attempted to harness the capabilities of deep learning techniques for image recognition to tackle pixel-level labelling tasks. One central issue in this methodology is the limited capacity of deep learning techniques to delineate visual objects. To solve this problem, we i…

    Submitted 13 April, 2016; v1 submitted 11 February, 2015; originally announced February 2015.

    Comments: This paper is published in IEEE ICCV 2015