Showing 101–150 of 207 results for author: Torr, P H S

  1. arXiv:2003.06709  [pdf, other]

    cs.LG cs.AI stat.ML

    FACMAC: Factored Multi-Agent Centralised Policy Gradients

    Authors: Bei Peng, Tabish Rashid, Christian A. Schroeder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson

    Abstract: We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces. Like MADDPG, a popular multi-agent actor-critic method, our approach uses deep deterministic policy gradients to learn policies. However, FACMAC learns a centralised but factored critic, which combines per-agent utilit…

    Submitted 7 May, 2021; v1 submitted 14 March, 2020; originally announced March 2020.

  2. arXiv:2003.01663  [pdf, other]

    cs.CV

    Holistically-Attracted Wireframe Parsing

    Authors: Nan Xue, Tianfu Wu, Song Bai, Fu-Dong Wang, Gui-Song Xia, Liangpei Zhang, Philip H. S. Torr

    Abstract: This paper presents a fast and parsimonious parsing method to accurately and robustly detect a vectorized wireframe in an input image with a single forward pass. The proposed method is end-to-end trainable, consisting of three components: (i) line segment and junction proposal generation, (ii) line segment and junction matching, and (iii) line segment and junction verification. For computing line…

    Submitted 3 March, 2020; originally announced March 2020.

    Comments: Accepted by CVPR 2020

  3. arXiv:2002.10410  [pdf, other]

    cs.LG stat.ML

    Lagrangian Decomposition for Neural Network Verification

    Authors: Rudy Bunel, Alessandro De Palma, Alban Desmaison, Krishnamurthy Dvijotham, Pushmeet Kohli, Philip H. S. Torr, M. Pawan Kumar

    Abstract: A fundamental component of neural network verification is the computation of bounds on the values their outputs can take. Previous methods have either used off-the-shelf solvers, discarding the problem structure, or relaxed the problem even further, making the bounds unnecessarily loose. We propose a novel approach based on Lagrangian Decomposition. Our formulation admits an efficient supergradien…

    Submitted 17 June, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: UAI 2020 conference paper

  4. arXiv:2002.09437  [pdf, other]

    cs.LG cs.CV stat.ML

    Calibrating Deep Neural Networks using Focal Loss

    Authors: Jishnu Mukhoti, Viveka Kulharia, Amartya Sanyal, Stuart Golodetz, Philip H. S. Torr, Puneet K. Dokania

    Abstract: Miscalibration - a mismatch between a model's confidence and its correctness - of Deep Neural Networks (DNNs) makes their predictions hard to rely on. Ideally, we want networks to be accurate, calibrated and confident. We show that, as opposed to the standard cross-entropy loss, focal loss [Lin et al., 2017] allows us to learn models that are already very well calibrated. When combined with tempe…

    Submitted 26 October, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: This paper was accepted at NeurIPS 2020

  5. arXiv:2002.05235  [pdf, other]

    cs.CV cs.CL cs.LG

    Image-to-Image Translation with Text Guidance

    Authors: Bowen Li, Xiaojuan Qi, Philip H. S. Torr, Thomas Lukasiewicz

    Abstract: The goal of this paper is to embed controllable factors, i.e., natural language descriptions, into image-to-image translation with generative adversarial networks, which allows text descriptions to determine the visual attributes of synthetic images. We propose four key components: (1) the implementation of part-of-speech tagging to filter out non-semantic words in the given description, (2) the a…

    Submitted 12 February, 2020; originally announced February 2020.

  6. arXiv:2002.01048  [pdf, other]

    cs.CV cs.LG eess.IV

    Multi-Channel Attention Selection GANs for Guided Image-to-Image Translation

    Authors: Hao Tang, Philip H. S. Torr, Nicu Sebe

    Abstract: We propose a novel model named Multi-Channel Attention Selection Generative Adversarial Network (SelectionGAN) for guided image-to-image translation, where we translate an input image into another while respecting an external semantic guidance. The proposed SelectionGAN explicitly utilizes the semantic guidance information and consists of two stages. In the first stage, the input image and the con…

    Submitted 6 October, 2022; v1 submitted 3 February, 2020; originally announced February 2020.

    Comments: Accepted to TPAMI, an extended version of a paper published in CVPR2019. arXiv admin note: substantial text overlap with arXiv:1904.06807

  7. arXiv:2001.04982  [pdf, other]

    cs.CV

    Unifying Training and Inference for Panoptic Segmentation

    Authors: Qizhu Li, Xiaojuan Qi, Philip H. S. Torr

    Abstract: We present an end-to-end network to bridge the gap between training and inference pipeline for panoptic segmentation, a task that seeks to partition an image into semantic regions for "stuff" and object instances for "things". In contrast to recent works, our network exploits a parametrised, yet lightweight panoptic segmentation submodule, powered by an end-to-end learnt dense instance affinity, t…

    Submitted 26 May, 2020; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: CVPR 2020

  8. arXiv:2001.03919  [pdf, other]

    cs.CV

    Rethinking Class Relations: Absolute-relative Supervised and Unsupervised Few-shot Learning

    Authors: Hongguang Zhang, Piotr Koniusz, Songlei Jian, Hongdong Li, Philip H. S. Torr

    Abstract: The majority of existing few-shot learning methods describe image relations with binary labels. However, such binary relations are insufficient to teach the network complicated real-world relations, due to the lack of decision smoothness. Furthermore, current few-shot learning models capture only the similarity via relation labels, but they are not exposed to class concepts associated with objects…

    Submitted 9 June, 2021; v1 submitted 12 January, 2020; originally announced January 2020.

    Comments: IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021

  9. arXiv:2001.03905  [pdf, other]

    cs.CV

    Few-shot Action Recognition with Permutation-invariant Attention

    Authors: Hongguang Zhang, Li Zhang, Xiaojuan Qi, Hongdong Li, Philip H. S. Torr, Piotr Koniusz

    Abstract: Many few-shot learning models focus on recognising images. In contrast, we tackle a challenging task of few-shot action recognition from videos. We build on a C3D encoder for spatio-temporal video blocks to capture short-range action patterns. Such encoded blocks are aggregated by permutation-invariant pooling to make our approach robust to varying action lengths and long-range temporal dependenci…

    Submitted 3 August, 2020; v1 submitted 12 January, 2020; originally announced January 2020.

    Comments: ECCV2020 Spotlight

  10. arXiv:2001.01600  [pdf, other]

    cs.CV

    Improving Few-shot Learning by Spatially-aware Matching and CrossTransformer

    Authors: Hongguang Zhang, Philip H. S. Torr, Piotr Koniusz

    Abstract: Current few-shot learning models capture visual object relations in the so-called meta-learning setting under a fixed-resolution input. However, such models have a limited generalization ability under the scale and location mismatch between objects, as only few samples from target classes are provided. Therefore, the lack of a mechanism to match the scale and location between pairs of compared ima…

    Submitted 8 October, 2022; v1 submitted 6 January, 2020; originally announced January 2020.

    Comments: Asian Conference on Computer Vision 2022

  11. arXiv:1912.12215  [pdf, other]

    cs.CV cs.LG eess.IV

    Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

    Authors: Hao Tang, Dan Xu, Yan Yan, Philip H. S. Torr, Nicu Sebe

    Abstract: In this paper, we address the task of semantic-guided scene generation. One open challenge in scene generation is the difficulty of the generation of small objects and detailed local texture, which has been widely observed in global image-level generation methods. To tackle this issue, in this work we consider learning the scene generation in a local context, and correspondingly design a local cla…

    Submitted 30 March, 2020; v1 submitted 27 December, 2019; originally announced December 2019.

    Comments: Accepted to CVPR 2020, camera ready (10 pages) + supplementary (18 pages)

  12. Learning Regional Attraction for Line Segment Detection

    Authors: Nan Xue, Song Bai, Fu-Dong Wang, Gui-Song Xia, Tianfu Wu, Liangpei Zhang, Philip H. S. Torr

    Abstract: This paper presents regional attraction of line segment maps, and hereby poses the problem of line segment detection (LSD) as a problem of region coloring. Given a line segment map, the proposed regional attraction first establishes the relationship between line segments and regions in the image lattice. Based on this, the line segment map is equivalently transformed to an attraction field map (AF…

    Submitted 17 December, 2019; originally announced December 2019.

    Comments: Accepted to IEEE TPAMI. arXiv admin note: text overlap with arXiv:1812.02122

  13. arXiv:1912.06615  [pdf, other]

    q-bio.NC

    Lessons from reinforcement learning for biological representations of space

    Authors: Alex Muryy, N. Siddharth, Nantas Nardelli, Philip H. S. Torr, Andrew Glennerster

    Abstract: Neuroscientists postulate 3D representations in the brain in a variety of different coordinate frames (e.g. 'head-centred', 'hand-centred' and 'world-based'). Recent advances in reinforcement learning demonstrate a quite different approach that may provide a more promising model for biological representations underlying spatial perception and navigation. In this paper, we focus on reinforcement le…

    Submitted 6 July, 2020; v1 submitted 13 December, 2019; originally announced December 2019.

    Comments: 40 pages including Appendix, 6 figures plus 3 figures in Appendix. Accepted for publication in Vision Research

  14. arXiv:1912.06203  [pdf, other]

    cs.CV cs.CL cs.LG

    ManiGAN: Text-Guided Image Manipulation

    Authors: Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H. S. Torr

    Abstract: The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text. To achieve this, we propose a novel generative adversarial network (ManiGAN), which contains two key components: text-image affine combination module (ACM) and detail correct…

    Submitted 30 March, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

    Comments: CVPR 2020

  15. arXiv:1911.13270  [pdf, other]

    cs.LG cs.CV stat.ML

    Transflow Learning: Repurposing Flow Models Without Retraining

    Authors: Andrew Gambardella, Atılım Güneş Baydin, Philip H. S. Torr

    Abstract: It is well known that deep generative models have a rich latent space, and that it is possible to smoothly manipulate their outputs by traversing this latent space. Recently, architectures have emerged that allow for more complex manipulations, such as making an image look as though it were from a different class, or painted in a certain style. These methods typically require large amounts of trai…

    Submitted 5 December, 2019; v1 submitted 29 November, 2019; originally announced November 2019.

  16. arXiv:1911.12836  [pdf, other]

    cs.CV

    Siam R-CNN: Visual Tracking by Re-Detection

    Authors: Paul Voigtlaender, Jonathon Luiten, Philip H. S. Torr, Bastian Leibe

    Abstract: We present Siam R-CNN, a Siamese re-detection architecture which unleashes the full power of two-stage object detection approaches for visual object tracking. We combine this with a novel tracklet-based dynamic programming algorithm, which takes advantage of re-detections of both the first-frame template and previous-frame predictions, to model the full history of both the object to be tracked and…

    Submitted 2 April, 2020; v1 submitted 28 November, 2019; originally announced November 2019.

    Comments: CVPR 2020 camera-ready version

  17. arXiv:1911.11897  [pdf, other]

    cs.CV cs.LG eess.IV

    AttentionGAN: Unpaired Image-to-Image Translation using Attention-Guided Generative Adversarial Networks

    Authors: Hao Tang, Hong Liu, Dan Xu, Philip H. S. Torr, Nicu Sebe

    Abstract: State-of-the-art methods in image-to-image translation are capable of learning a mapping from a source domain to a target domain with unpaired image data. Though the existing methods have achieved promising results, they still produce visual artifacts, being able to translate low-level information but not high-level semantics of input images. One possible reason is that generators do not have the…

    Submitted 16 August, 2021; v1 submitted 26 November, 2019; originally announced November 2019.

    Comments: Accepted to TNNLS, an extended version of a paper published in IJCNN2019. arXiv admin note: substantial text overlap with arXiv:1903.12296

  18. arXiv:1911.03393  [pdf, other]

    stat.ML cs.LG

    Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models

    Authors: Yuge Shi, N. Siddharth, Brooks Paige, Philip H. S. Torr

    Abstract: Learning generative models that span multiple data modalities, such as vision and language, is often motivated by the desire to learn more useful, generalisable representations that faithfully capture common underlying factors between the modalities. In this work, we characterise successful learning of such models as the fulfillment of four criteria: i) implicit latent decomposition into shared an…

    Submitted 8 November, 2019; originally announced November 2019.

  19. arXiv:1910.10895  [pdf, other]

    cs.CV

    Anchor Diffusion for Unsupervised Video Object Segmentation

    Authors: Zhao Yang, Qiang Wang, Luca Bertinetto, Weiming Hu, Song Bai, Philip H. S. Torr

    Abstract: Unsupervised video object segmentation has often been tackled by methods based on recurrent neural networks and optical flow. Despite their complexity, these kinds of approaches tend to favour short-term temporal dependencies and are thus prone to accumulating inaccuracies, which cause drift over time. Moreover, simple (static) image segmentation models, alone, can perform competitively against th…

    Submitted 23 October, 2019; originally announced October 2019.

    Comments: To appear in ICCV 2019

  20. arXiv:1910.09056  [pdf, other]

    cs.LG cs.AI stat.ML

    Amortized Rejection Sampling in Universal Probabilistic Programming

    Authors: Saeid Naderiparizi, Adam Ścibior, Andreas Munk, Mehrdad Ghadiri, Atılım Güneş Baydin, Bradley Gram-Hansen, Christian Schroeder de Witt, Robert Zinkov, Philip H. S. Torr, Tom Rainforth, Yee Whye Teh, Frank Wood

    Abstract: Naive approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance. This is particularly true of importance sampling inference in programs that explicitly include rejection sampling as part of the user-programmed generative procedure. In this paper we develop a new and efficient amortized importance sampling estimator. We prove fini…

    Submitted 28 March, 2022; v1 submitted 20 October, 2019; originally announced October 2019.

    Comments: AISTATS 2022 camera ready

  21. arXiv:1910.08237  [pdf, other]

    cs.LG cs.CV stat.ML

    Mirror Descent View for Neural Network Quantization

    Authors: Thalaiyasingam Ajanthan, Kartik Gupta, Philip H. S. Torr, Richard Hartley, Puneet K. Dokania

    Abstract: Quantizing large Neural Networks (NN) while maintaining the performance is highly desirable for resource-limited devices due to reduced memory and time complexity. It is usually formulated as a constrained optimization problem and optimized via a modified version of gradient descent. In this work, by interpreting the continuous parameters (unconstrained) as the dual of the quantized ones, we intro…

    Submitted 2 March, 2021; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: This paper was accepted at AISTATS 2021

  22. arXiv:1909.11081  [pdf, other]

    cs.CV cs.LG eess.IV

    Interactive Sketch & Fill: Multiclass Sketch-to-Image Translation

    Authors: Arnab Ghosh, Richard Zhang, Puneet K. Dokania, Oliver Wang, Alexei A. Efros, Philip H. S. Torr, Eli Shechtman

    Abstract: We propose an interactive GAN-based sketch-to-image translation method that helps novice users create images of simple objects. As the user starts to draw a sketch of a desired object type, the network interactively recommends plausible completions, and shows a corresponding synthesized image to the user. This enables a feedback loop, where the user can edit their sketch based on the network's rec…

    Submitted 25 September, 2019; v1 submitted 24 September, 2019; originally announced September 2019.

    Comments: ICCV 2019. Video available at https://youtu.be/T9xtpAMUDps

  23. arXiv:1909.07083  [pdf, other]

    cs.CV cs.CL cs.LG

    Controllable Text-to-Image Generation

    Authors: Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H. S. Torr

    Abstract: In this paper, we propose a novel controllable text-to-image generative adversarial network (ControlGAN), which can effectively synthesise high-quality images and also control parts of the image generation according to natural language descriptions. To achieve this, we introduce a word-level spatial and channel-wise attention-driven generator that can disentangle different visual attributes, and a…

    Submitted 19 December, 2019; v1 submitted 16 September, 2019; originally announced September 2019.

    Comments: NeurIPS 2019

  24. arXiv:1909.06588  [pdf, other]

    cs.LG cs.LO stat.ML

    Branch and Bound for Piecewise Linear Neural Network Verification

    Authors: Rudy Bunel, Jingyue Lu, Ilker Turkaslan, Philip H. S. Torr, Pushmeet Kohli, M. Pawan Kumar

    Abstract: The success of Deep Learning and its potential use in many safety-critical applications has motivated research on formal verification of Neural Network (NN) models. In this context, verification involves proving or disproving that an NN model satisfies certain input-output properties. Despite the reputation of learned NN models as black boxes, and the theoretical hardness of proving useful propert…

    Submitted 26 October, 2020; v1 submitted 14 September, 2019; originally announced September 2019.

  25. arXiv:1909.06121  [pdf, other]

    cs.CV

    Dual Graph Convolutional Network for Semantic Segmentation

    Authors: Li Zhang, Xiangtai Li, Anurag Arnab, Kuiyuan Yang, Yunhai Tong, Philip H. S. Torr

    Abstract: Exploiting long-range contextual information is key for pixel-wise prediction tasks such as semantic segmentation. In contrast to previous work that uses multi-scale feature fusion or dilated convolutions, we propose a novel graph-convolutional network (GCN) to address this problem. Our Dual Graph Convolutional Network (DGCNet) models the global context of the input feature by modelling two orthog…

    Submitted 26 August, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

    Comments: BMVC 2019. Code is available at https://github.com/lxtGH/GALD-DGCNet

  26. arXiv:1908.06955  [pdf, other]

    cs.CV cs.LG

    Dynamic Graph Message Passing Networks

    Authors: Li Zhang, Dan Xu, Anurag Arnab, Philip H. S. Torr

    Abstract: Modelling long-range dependencies is critical for scene understanding tasks in computer vision. Although CNNs have excelled in many vision tasks, they are still limited in capturing long-range structured relationships as they typically consist of layers of local kernels. A fully-connected graph is beneficial for such modelling, however, its computational overhead is prohibitive. We propose a dynam…

    Submitted 14 September, 2022; v1 submitted 19 August, 2019; originally announced August 2019.

    Comments: CVPR 2020 Oral

  27. arXiv:1907.07745  [pdf, other]

    cs.CV eess.IV eess.SP

    Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC

    Authors: Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Alessio Tonioni, Thomas Joy, Luigi Di Stefano, Simon Walker, Philip H. S. Torr

    Abstract: Obtaining highly accurate depth from stereo images in real time has many applications across computer vision and robotics, but in some contexts, upper bounds on power consumption constrain the feasible hardware to embedded platforms such as FPGAs. Whilst various stereo algorithms have been deployed on these platforms, usually cut down to better match the embedded architecture, certain key parts of…

    Submitted 17 July, 2019; originally announced July 2019.

    Comments: 6 pages, 7 figures, 2 tables, journal

    Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 66, no. 5, pp. 773-777, May 2019

  28. arXiv:1906.06307  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    A Signal Propagation Perspective for Pruning Neural Networks at Initialization

    Authors: Namhoon Lee, Thalaiyasingam Ajanthan, Stephen Gould, Philip H. S. Torr

    Abstract: Network pruning is a promising avenue for compressing deep neural networks. A typical approach to pruning starts by training a model and then removing redundant parameters while minimizing the impact on what is learned. Alternatively, a recent approach shows that pruning can be done at initialization prior to training, based on a saliency criterion called connection sensitivity. However, it remain…

    Submitted 16 February, 2020; v1 submitted 14 June, 2019; originally announced June 2019.

    Comments: ICLR 2020

  29. arXiv:1906.04659  [pdf, other]

    stat.ML cs.LG

    Stable Rank Normalization for Improved Generalization in Neural Networks and GANs

    Authors: Amartya Sanyal, Philip H. S. Torr, Puneet K. Dokania

    Abstract: Exciting new work on the generalization bounds for neural networks (NN), given by Neyshabur et al. and Bartlett et al., closely depends on two parameter-dependent quantities: the Lipschitz constant upper-bound and the stable rank (a softer version of the rank operator). This leads to an interesting question of whether controlling these quantities might improve the generalization behaviour of NNs. To t…

    Submitted 20 February, 2020; v1 submitted 11 June, 2019; originally announced June 2019.

    Comments: Accepted at the International Conference on Learning Representations, 2020, Addis Ababa, Ethiopia

  30. arXiv:1905.12432  [pdf, other]

    stat.ML cs.LG

    Hijacking Malaria Simulators with Probabilistic Programming

    Authors: Bradley Gram-Hansen, Christian Schröder de Witt, Tom Rainforth, Philip H. S. Torr, Yee Whye Teh, Atılım Güneş Baydin

    Abstract: Epidemiology simulations have become a fundamental tool in the fight against the epidemics of various infectious diseases like AIDS and malaria. However, the complicated and stochastic nature of these simulators can mean their output is difficult to interpret, which reduces their usefulness to policymakers. In this paper, we introduce an approach that allows one to treat a large class of populatio…

    Submitted 29 May, 2019; originally announced May 2019.

    Comments: 6 pages, 3 figures, Accepted at the International Conference on Machine Learning AI for Social Good Workshop, Long Beach, United States, 2019

    Journal ref: ICML Workshop on AI for Social Good, 2018

  31. arXiv:1905.11358  [pdf, other]

    cs.CV cs.AI cs.LG

    Straight to Shapes++: Real-time Instance Segmentation Made More Accurate

    Authors: Laurynas Miksys, Saumya Jetley, Michael Sapienza, Stuart Golodetz, Philip H. S. Torr

    Abstract: Instance segmentation is an important problem in computer vision, with applications in autonomous driving, drone navigation and robotic manipulation. However, most existing methods are not real-time, complicating their deployment in time-sensitive contexts. In this work, we extend an existing approach to real-time instance segmentation, called 'Straight to Shapes' (STS), which makes use of low-dim…

    Submitted 30 July, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: Technical report, 27 pages (12 main, 15 supplementary), 17 figures, 14 tables

    Report number: STS-2018

  32. arXiv:1905.07435  [pdf, other]

    cs.LG cs.AI stat.ML

    Alpha MAML: Adaptive Model-Agnostic Meta-Learning

    Authors: Harkirat Singh Behl, Atılım Güneş Baydin, Philip H. S. Torr

    Abstract: Model-agnostic meta-learning (MAML) is a meta-learning technique to train a model on a multitude of learning tasks in a way that primes the model for few-shot learning of new tasks. The MAML algorithm performs well on few-shot learning problems in classification, regression, and fine-tuning of policy gradients in reinforcement learning, but comes with the need for costly hyperparameter tuning for…

    Submitted 17 May, 2019; originally announced May 2019.

    Comments: 6th ICML Workshop on Automated Machine Learning (2019)

    Journal ref: ICML Workshop on Automated Machine Learning (2019)

  33. arXiv:1904.06587  [pdf, other]

    cs.CV

    GA-Net: Guided Aggregation Net for End-to-end Stereo Matching

    Authors: Feihu Zhang, Victor Prisacariu, Ruigang Yang, Philip H. S. Torr

    Abstract: In the stereo matching task, matching cost aggregation is crucial in both traditional methods and deep neural network models in order to accurately estimate disparities. We propose two novel neural net layers, aimed at capturing local and the whole-image cost dependencies respectively. The first is a semi-global aggregation layer which is a differentiable approximation of the semi-global matching,…

    Submitted 13 April, 2019; originally announced April 2019.

    Comments: CVPR 2019 (Oral Presentation)

  34. arXiv:1904.04562  [pdf, other]

    cs.CV cs.LG

    Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks

    Authors: Eunwoo Kim, Chanho Ahn, Philip H. S. Torr, Songhwai Oh

    Abstract: Deep networks consume a large amount of memory by their nature. A natural question arises: can we reduce that memory requirement whilst maintaining performance? In particular, in this work we address the problem of memory efficient learning for multiple tasks. To this end, we propose a novel network architecture producing multiple networks of different configurations, termed deep virtual networks (…

    Submitted 9 April, 2019; originally announced April 2019.

    Comments: CVPR 2019

  35. arXiv:1904.02957  [pdf, other]

    cs.CV

    Learning to Adapt for Stereo

    Authors: Alessio Tonioni, Oscar Rahnama, Thomas Joy, Luigi Di Stefano, Thalaiyasingam Ajanthan, Philip H. S. Torr

    Abstract: Real world applications of stereo depth estimation require models that are robust to dynamic variations in the environment. Even though deep learning based stereo methods are successful, they often fail to generalize to unseen variations in the environment, making them less suitable for practical applications such as autonomous driving. In this work, we introduce a "learning-to-adapt" framework th…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: Accepted at CVPR2019. Code available at https://github.com/CVLAB-Unibo/Learning2AdaptForStereo

    Journal ref: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 9661-9670

  36. arXiv:1902.10486  [pdf, other]

    cs.LG stat.ML

    On Tiny Episodic Memories in Continual Learning

    Authors: Arslan Chaudhry, Marcus Rohrbach, Mohamed Elhoseiny, Thalaiyasingam Ajanthan, Puneet K. Dokania, Philip H. S. Torr, Marc'Aurelio Ranzato

    Abstract: In continual learning (CL), an agent learns from a stream of tasks leveraging prior experience to transfer knowledge to future tasks. It is an ideal framework to decrease the amount of supervision in the existing learning algorithms. But for a successful knowledge transfer, the learner needs to remember how to perform previous tasks. One way to endow the learner with the ability to perform tasks seen i…

    Submitted 4 June, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: Making the main point of the paper more clear

  37. arXiv:1902.08134  [pdf, other]

    cs.LG cs.CV stat.ML

    Domain Partitioning Network

    Authors: Botos Csaba, Adnane Boukhayma, Viveka Kulharia, András Horváth, Philip H. S. Torr

    Abstract: Standard adversarial training involves two agents, namely a generator and a discriminator, playing a mini-max game. However, even if the players converge to an equilibrium, the generator may only recover a part of the target data distribution, in a situation commonly referred to as mode collapse. In this work, we present the Domain Partitioning Network (DoPaNet), a new approach to deal with mode c…

    Submitted 21 February, 2019; originally announced February 2019.

    Comments: 18 pages, 13 figures

  38. arXiv:1902.04043  [pdf, other]

    cs.LG cs.MA stat.ML

    The StarCraft Multi-Agent Challenge

    Authors: Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob Foerster, Shimon Whiteson

    Abstract: In the last few years, deep multi-agent reinforcement learning (RL) has become a highly active area of research. A particularly challenging class of problems in this area is partially observable, cooperative, multi-agent learning, in which teams of agents must learn to coordinate their behaviour while conditioning only on their private observations. This is an attractive research area since such p…

    Submitted 9 December, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

  39. arXiv:1902.03451  [pdf, other]

    cs.CV cs.AI cs.LG

    3D Hand Shape and Pose from Images in the Wild

    Authors: Adnane Boukhayma, Rodrigo de Bem, Philip H. S. Torr

    Abstract: We present in this work the first end-to-end deep learning based method that predicts both 3D hand shape and pose from RGB images in the wild. Our network consists of the concatenation of a deep convolutional encoder, and a fixed model-based decoder. Given an input image, and optionally 2D joint detections obtained from an independent CNN, the encoder predicts a set of hand and view parameters. Th…

    Submitted 9 February, 2019; originally announced February 2019.

  40. arXiv:1901.10650  [pdf, other]

    cs.CV

    Adversarial Metric Attack and Defense for Person Re-identification

    Authors: Song Bai, Yingwei Li, Yuyin Zhou, Qizhu Li, Philip H. S. Torr

    Abstract: Person re-identification (re-ID) has attracted much attention recently due to its great importance in video surveillance. In general, distance metrics used to identify two person images are expected to be robust under various appearance changes. However, our work observes the extreme vulnerability of existing distance metrics to adversarial examples, generated by simply adding human-imperceptible…

    Submitted 10 October, 2020; v1 submitted 29 January, 2019; originally announced January 2019.

    Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  41. arXiv:1901.08150  [pdf, other]

    cs.LG cs.CV stat.ML

    Hypergraph Convolution and Hypergraph Attention

    Authors: Song Bai, Feihu Zhang, Philip H. S. Torr

    Abstract: Recently, graph neural networks have attracted great attention and achieved prominent performance in various research fields. Most of those algorithms have assumed pairwise relationships of objects of interest. However, in many real applications, the relationships between objects are in higher-order, beyond a pairwise formulation. To efficiently learn deep embeddings on the high-order graph-struct… ▽ More

    Submitted 10 October, 2020; v1 submitted 23 January, 2019; originally announced January 2019.

    Comments: Accepted by Pattern Recognition

  42. arXiv:1812.11276  [pdf, other]

    cs.LG stat.ML

    Learn to Interpret Atari Agents

    Authors: Zhao Yang, Song Bai, Li Zhang, Philip H. S. Torr

    Abstract: Deep reinforcement learning (DeepRL) agents surpass human-level performance in many tasks. However, the direct mapping from states to actions makes it hard to interpret the rationale behind the decision-making of the agents. In contrast to previous a-posteriori methods for visualizing DeepRL policies, in this work, we propose to equip the DeepRL model with an innate visualization ability. Our prop… ▽ More

    Submitted 5 April, 2023; v1 submitted 28 December, 2018; originally announced December 2018.

    Comments: An old report. Uploaded for archival purposes only

  43. arXiv:1812.06417  [pdf, other]

    cs.CV cs.CL cs.LG

    Visual Dialogue without Vision or Dialogue

    Authors: Daniela Massiceti, Puneet K. Dokania, N. Siddharth, Philip H. S. Torr

    Abstract: We characterise some of the quirks and shortcomings in the exploration of Visual Dialogue - a sequential question-answering task where the questions and corresponding answers are related through given visual stimuli. To do so, we develop an embarrassingly simple method based on Canonical Correlation Analysis (CCA) that, on the standard dataset, achieves near state-of-the-art performance on mean ra… ▽ More

    Submitted 22 October, 2019; v1 submitted 16 December, 2018; originally announced December 2018.

    Comments: 2018 NeurIPS Workshop on Critiquing and Correcting Trends in Machine Learning

  44. arXiv:1812.05050  [pdf, other]

    cs.CV

    Fast Online Object Tracking and Segmentation: A Unifying Approach

    Authors: Qiang Wang, Li Zhang, Luca Bertinetto, Weiming Hu, Philip H. S. Torr

    Abstract: In this paper we illustrate how to perform both visual object tracking and semi-supervised video object segmentation, in real-time, with a single simple approach. Our method, dubbed SiamMask, improves the offline training procedure of popular fully-convolutional Siamese approaches for object tracking by augmenting their loss with a binary segmentation task. Once trained, SiamMask solely relies on… ▽ More

    Submitted 4 May, 2019; v1 submitted 12 December, 2018; originally announced December 2018.

    Comments: CVPR 2019 camera ready. Code available at https://github.com/foolwood/SiamMask

  45. arXiv:1812.04353  [pdf, other]

    cs.CV cs.LG

    Proximal Mean-field for Neural Network Quantization

    Authors: Thalaiyasingam Ajanthan, Puneet K. Dokania, Richard Hartley, Philip H. S. Torr

    Abstract: Compressing large Neural Networks (NN) by quantizing the parameters, while maintaining the performance is highly desirable due to reduced memory and time complexity. In this work, we cast NN quantization as a discrete labelling problem, and by examining relaxations, we design an efficient iterative optimization procedure that involves stochastic gradient descent followed by a projection. We prove… ▽ More

    Submitted 19 August, 2019; v1 submitted 11 December, 2018; originally announced December 2018.

    Journal ref: ICCV, 2019

  46. arXiv:1812.01397  [pdf, other]

    cs.CV

    Meta Learning Deep Visual Words for Fast Video Object Segmentation

    Authors: Harkirat Singh Behl, Mohammad Najafi, Anurag Arnab, Philip H. S. Torr

    Abstract: Personal robots and driverless cars need to be able to operate in novel environments and thus quickly and efficiently learn to recognise new object classes. We address this problem by considering the task of video object segmentation. Previous accurate methods for this task finetune a model using the first annotated frame, and/or use additional inputs such as optical flow and complex post-processi… ▽ More

    Submitted 16 August, 2020; v1 submitted 4 December, 2018; originally announced December 2018.

    Journal ref: In Proceedings of International Conference on Intelligent Robots and Systems (IROS) 2020

  47. arXiv:1811.07807  [pdf, other]

    cs.CV

    Deeper Interpretability of Deep Networks

    Authors: Tian Xu, Jiayu Zhan, Oliver G. B. Garrod, Philip H. S. Torr, Song-Chun Zhu, Robin A. A. Ince, Philippe G. Schyns

    Abstract: Deep Convolutional Neural Networks (CNNs) have been one of the most influential recent developments in computer vision, particularly for categorization. There is an increasing demand for explainable AI as these systems are deployed in the real world. However, understanding the information represented and processed in CNNs remains in most cases challenging. Within this paper, we explore the use of… ▽ More

    Submitted 20 November, 2018; v1 submitted 19 November, 2018; originally announced November 2018.

  48. R$^3$SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems

    Authors: Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Simon Walker, Philip H. S. Torr

    Abstract: Stereo depth estimation is used for many computer vision applications. Though many popular methods strive solely for depth quality, for real-time mobile applications (e.g. prosthetic glasses or micro-UAVs), speed and power efficiency are equally, if not more, important. Many real-world systems rely on Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but power efficiency is… ▽ More

    Submitted 30 October, 2018; originally announced October 2018.

    Comments: Accepted in FPT 2018 as Oral presentation, 8 pages, 6 figures, 4 tables

    Journal ref: 2018 International Conference on Field-Programmable Technology (FPT)

  49. Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade

    Authors: Tommaso Cavallari, Stuart Golodetz, Nicholas A. Lord, Julien Valentin, Victor A. Prisacariu, Luigi Di Stefano, Philip H. S. Torr

    Abstract: Camera pose estimation is an important problem in computer vision. Common techniques either match the current image against keyframes with known poses, directly regress the pose, or establish correspondences between keypoints in the image and points in the scene to estimate the pose. In recent years, regression forests have become a popular alternative to establish such correspondences. They achie… ▽ More

    Submitted 2 July, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Tommaso Cavallari, Stuart Golodetz, Nicholas Lord and Julien Valentin assert joint first authorship

    MSC Class: 68T45

  50. arXiv:1810.11702  [pdf, other]

    cs.MA cs.AI cs.GT cs.LG

    Multi-Agent Common Knowledge Reinforcement Learning

    Authors: Christian A. Schroeder de Witt, Jakob N. Foerster, Gregory Farquhar, Philip H. S. Torr, Wendelin Boehmer, Shimon Whiteson

    Abstract: Cooperative multi-agent reinforcement learning often requires decentralised policies, which severely limit the agents' ability to coordinate their behaviour. In this paper, we show that common knowledge between agents allows for complex decentralised coordination. Common knowledge arises naturally in a large number of decentralised cooperative multi-agent tasks, for example, when agents can recons… ▽ More

    Submitted 11 January, 2020; v1 submitted 27 October, 2018; originally announced October 2018.

    Comments: Advances in Neural Information Processing Systems, 9924-9935