
Showing 51–100 of 170 results for author: Hengel, A V D

  1. arXiv:2006.00753  [pdf, other]

    cs.CV

    Structured Multimodal Attentions for TextVQA

    Authors: Chenyu Gao, Qi Zhu, Peng Wang, Hui Li, Yuliang Liu, Anton van den Hengel, Qi Wu

    Abstract: In this paper, we propose an end-to-end structured multimodal attention (SMA) neural network to mainly solve the first two issues above. SMA first uses a structural graph representation to encode the object-object, object-text and text-text relationships appearing in the image, and then designs a multimodal graph attention network to reason over it. Finally, the outputs from the above modules are…

    Submitted 25 November, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: winner of TextVQA Challenge 2020, Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence

  2. arXiv:2005.09241  [pdf, other]

    cs.CV cs.LG

    On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law

    Authors: Damien Teney, Kushal Kafle, Robik Shrestha, Ehsan Abbasnejad, Christopher Kanan, Anton van den Hengel

    Abstract: Out-of-distribution (OOD) testing is increasingly popular for evaluating a machine learning system's ability to generalize beyond the biases of a training set. OOD benchmarks are designed to present a different joint distribution of data and labels between training and test time. VQA-CP has become the standard OOD benchmark for visual question answering, but we discovered three troubling practices…

    Submitted 19 May, 2020; originally announced May 2020.

  3. arXiv:2005.01239  [pdf, other]

    cs.CV cs.LG

    Visual Question Answering with Prior Class Semantics

    Authors: Violetta Shevchenko, Damien Teney, Anthony Dick, Anton van den Hengel

    Abstract: We present a novel mechanism to embed prior knowledge in a model for visual question answering. The open-set nature of the task is at odds with the ubiquitous approach of training of a fixed classifier. We show how to exploit additional information pertaining to the semantics of candidate answers. We extend the answer prediction process with a regression objective in a semantic space, in which we…

    Submitted 3 May, 2020; originally announced May 2020.

  4. arXiv:2004.09034  [pdf, other]

    cs.CV cs.LG

    Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision

    Authors: Damien Teney, Ehsan Abbasnejad, Anton van den Hengel

    Abstract: One of the primary challenges limiting the applicability of deep learning is its susceptibility to learning spurious correlations rather than the underlying mechanisms of the task of interest. The resulting failure to generalise cannot be addressed by simply using more data from the same distribution. We propose an auxiliary training objective that improves the generalization capabilities of neura…

    Submitted 19 April, 2020; originally announced April 2020.

  5. arXiv:2003.06780  [pdf, other]

    cs.CV

    Self-trained Deep Ordinal Regression for End-to-End Video Anomaly Detection

    Authors: Guansong Pang, Cheng Yan, Chunhua Shen, Anton van den Hengel, Xiao Bai

    Abstract: Video anomaly detection is of critical practical importance to a variety of real applications because it allows human attention to be focused on events that are likely to be of interest, in spite of an otherwise overwhelming volume of video. We show that applying self-trained deep ordinal regression to video anomaly detection overcomes two key limitations of existing methods, namely, 1) being high…

    Submitted 15 March, 2020; originally announced March 2020.

    Comments: Accepted to Proc. IEEE Conf. Computer Vision and Pattern Recognition 2020

  6. arXiv:2002.11894  [pdf, other]

    cs.CV

    Unshuffling Data for Improved Generalization

    Authors: Damien Teney, Ehsan Abbasnejad, Anton van den Hengel

    Abstract: Generalization beyond the training distribution is a core challenge in machine learning. The common practice of mixing and shuffling examples when training neural networks may not be optimal in this regard. We show that partitioning the data into well-chosen, non-i.i.d. subsets treated as multiple training environments can guide the learning of models with better out-of-distribution generalization…

    Submitted 20 November, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

  7. arXiv:2002.10215  [pdf, other]

    cs.CV

    On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering

    Authors: Xinyu Wang, Yuliang Liu, Chunhua Shen, Chun Chet Ng, Canjie Luo, Lianwen Jin, Chee Seng Chan, Anton van den Hengel, Liangwei Wang

    Abstract: Visual Question Answering (VQA) methods have made incredible progress, but suffer from a failure to generalize. This is visible in the fact that they are vulnerable to learning coincidental correlations in the data rather than deeper relations between image content and ideas expressed in language. We present a dataset that takes a step towards addressing this problem in that it contains questions…

    Submitted 25 February, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: Accepted to Proc. IEEE Conf. Computer Vision and Pattern Recognition 2020

  8. arXiv:2001.02381  [pdf, other]

    eess.IV cs.CV

    Learning to Zoom-in via Learning to Zoom-out: Real-world Super-resolution by Generating and Adapting Degradation

    Authors: Dong Gong, Wei Sun, Qinfeng Shi, Anton van den Hengel, Yanning Zhang

    Abstract: Most learning-based super-resolution (SR) methods aim to recover high-resolution (HR) image from a given low-resolution (LR) image via learning on LR-HR image pairs. The SR methods learned on synthetic data do not perform well in real-world, due to the domain gap between the artificially synthesized and real LR images. Some efforts are thus taken to capture real-world image pairs. The captured LR-…

    Submitted 8 January, 2020; originally announced January 2020.

  9. arXiv:1911.08623  [pdf, other]

    cs.LG stat.ML

    Deep Anomaly Detection with Deviation Networks

    Authors: Guansong Pang, Chunhua Shen, Anton van den Hengel

    Abstract: Although deep learning has been applied to successfully address many data mining problems, relatively limited work has been done on deep learning for anomaly detection. Existing deep anomaly detection methods, which focus on learning new feature representations to enable downstream anomaly detection methods, perform indirect optimization of anomaly scores, leading to data-inefficient learning and…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: 10 Pages, Published in KDD19

  10. arXiv:1910.13601  [pdf, other]

    cs.LG stat.ML

    Deep Weakly-supervised Anomaly Detection

    Authors: Guansong Pang, Chunhua Shen, Huidong Jin, Anton van den Hengel

    Abstract: Recent semi-supervised anomaly detection methods that are trained using small labeled anomaly examples and large unlabeled data (mostly normal data) have shown largely improved performance over unsupervised methods. However, these methods often focus on fitting abnormalities illustrated by the given anomaly examples only (i.e., seen anomalies), and consequently they fail to generalize to those th…

    Submitted 5 June, 2023; v1 submitted 29 October, 2019; originally announced October 2019.

    Comments: Accepted to KDD 2023

  11. REFUGE Challenge: A Unified Framework for Evaluating Automated Methods for Glaucoma Assessment from Fundus Photographs

    Authors: José Ignacio Orlando, Huazhu Fu, João Barbosa Breda, Karel van Keer, Deepti R. Bathula, Andrés Diaz-Pinto, Ruogu Fang, Pheng-Ann Heng, Jeyoung Kim, JoonHo Lee, Joonseok Lee, Xiaoxiao Li, Peng Liu, Shuai Lu, Balamurali Murugesan, Valery Naranjo, Sai Samarth R. Phaye, Sharath M. Shankaranarayana, Apoorva Sikka, Jaemin Son, Anton van den Hengel, Shujun Wang, Junyan Wu, Zifeng Wu, Guanghui Xu, et al. (7 additional authors not shown)

    Abstract: Glaucoma is one of the leading causes of irreversible but preventable blindness in working age populations. Color fundus photography (CFP) is the most cost-effective imaging modality to screen for retinal disorders. However, its application to glaucoma has been limited to the computation of a few related biomarkers such as the vertical cup-to-disc ratio. Deep learning approaches, although widely a…

    Submitted 8 October, 2019; originally announced October 2019.

    Comments: Accepted for publication in Medical Image Analysis

  12. arXiv:1909.13471  [pdf, other]

    cs.CV cs.LG

    On Incorporating Semantic Prior Knowledge in Deep Learning Through Embedding-Space Constraints

    Authors: Damien Teney, Ehsan Abbasnejad, Anton van den Hengel

    Abstract: The knowledge that humans hold about a problem often extends far beyond a set of training data and output labels. While the success of deep learning mostly relies on supervised training, important properties cannot be inferred efficiently from end-to-end annotations alone, for example causal relations or domain-specific invariances. We present a general technique to supplement supervised training…

    Submitted 16 November, 2019; v1 submitted 30 September, 2019; originally announced September 2019.

  13. arXiv:1907.12271  [pdf, other]

    cs.CV

    V-PROM: A Benchmark for Visual Reasoning Using Visual Progressive Matrices

    Authors: Damien Teney, Peng Wang, Jiewei Cao, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

    Abstract: One of the primary challenges faced by deep learning is the degree to which current methods exploit superficial statistics and dataset bias, rather than learning to generalise over the specific representations they have experienced. This is a critical concern because generalisation enables robust reasoning over unseen data, whereas leveraging superficial statistics is fragile to even small changes…

    Submitted 29 July, 2019; originally announced July 2019.

  14. arXiv:1905.05404  [pdf, other]

    cs.CV

    An Effective Two-Branch Model-Based Deep Network for Single Image Deraining

    Authors: Yinglong Wang, Dong Gong, Jie Yang, Qinfeng Shi, Anton van den Hengel, Dehua Xie, Bing Zeng

    Abstract: Removing rain effects from an image is of importance for various applications such as autonomous driving, drone piloting, and photo editing. Conventional methods rely on some heuristics to handcraft various priors to remove or separate the rain effects from an image. Recent deep learning models are proposed to learn end-to-end methods to complete this task. However, they often fail to obtain satis…

    Submitted 17 September, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: 10 pages, 9 figures, 3 tables

  15. Show, Price and Negotiate: A Negotiator with Online Value Look-Ahead

    Authors: Amin Parvaneh, Ehsan Abbasnejad, Qi Wu, Javen Qinfeng Shi, Anton van den Hengel

    Abstract: Negotiation, as an essential and complicated aspect of online shopping, is still challenging for an intelligent agent. To that end, we propose the Price Negotiator, a modular deep neural network that addresses the unsolved problems in recent studies by (1) considering images of the items as a crucial, though neglected, source of information in a negotiation, (2) heuristically finding the most simi…

    Submitted 12 March, 2021; v1 submitted 7 May, 2019; originally announced May 2019.

    Comments: published in IEEE Transactions on Multimedia

  16. arXiv:1904.10293  [pdf, other]

    cs.CV

    Attention-guided Network for Ghost-free High Dynamic Range Imaging

    Authors: Qingsen Yan, Dong Gong, Qinfeng Shi, Anton van den Hengel, Chunhua Shen, Ian Reid, Yanning Zhang

    Abstract: Ghosting artifacts caused by moving objects or misalignments is a key challenge in high dynamic range (HDR) imaging for dynamic scenes. Previous methods first register the input low dynamic range (LDR) images using optical flow before merging them, which are error-prone and cause ghosts in results. A very recent work tries to bypass optical flows via a deep network with skip-connections, however,…

    Submitted 23 April, 2019; originally announced April 2019.

    Comments: Accepted to appear at CVPR 2019

  17. arXiv:1904.10151  [pdf, other]

    cs.CV cs.CL

    REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments

    Authors: Yuankai Qi, Qi Wu, Peter Anderson, Xin Wang, William Yang Wang, Chunhua Shen, Anton van den Hengel

    Abstract: One of the long-term challenges of robotics is to enable robots to interact with humans in the visual world via natural language, as humans are visual animals that communicate through language. Overcoming this challenge requires the ability to perform a wide variety of complex tasks in response to multifarious instructions from humans. In the hope that it might drive progress towards more flexible…

    Submitted 5 January, 2020; v1 submitted 23 April, 2019; originally announced April 2019.

  18. arXiv:1904.03367  [pdf, other]

    cs.LG stat.ML

    Reinforcement Learning with Attention that Works: A Self-Supervised Approach

    Authors: Anthony Manchin, Ehsan Abbasnejad, Anton van den Hengel

    Abstract: Attention models have had a significant positive impact on deep learning across a range of tasks. However previous attempts at integrating attention with reinforcement learning have failed to produce significant improvements. We propose the first combination of self attention and reinforcement learning that is capable of producing significant improvements, including new state of the art results in…

    Submitted 6 April, 2019; originally announced April 2019.

  19. arXiv:1904.02865  [pdf, other]

    cs.CV

    Actively Seeking and Learning from Live Data

    Authors: Damien Teney, Anton van den Hengel

    Abstract: One of the key limitations of traditional machine learning methods is their requirement for training data that exemplifies all the information to be learned. This is a particular problem for visual question answering methods, which may be asked questions about virtually anything. The approach we propose is a step toward overcoming this limitation by searching for the information required at test t…

    Submitted 5 April, 2019; originally announced April 2019.

  20. arXiv:1904.02639  [pdf, other]

    cs.CV

    Memorizing Normality to Detect Anomaly: Memory-augmented Deep Autoencoder for Unsupervised Anomaly Detection

    Authors: Dong Gong, Lingqiao Liu, Vuong Le, Budhaditya Saha, Moussa Reda Mansour, Svetha Venkatesh, Anton van den Hengel

    Abstract: Deep autoencoder has been extensively used for anomaly detection. Training on the normal data, the autoencoder is expected to produce higher reconstruction error for the abnormal inputs than the normal ones, which is adopted as a criterion for identifying anomalies. However, this assumption does not always hold in practice. It has been observed that sometimes the autoencoder "generalizes" so well…

    Submitted 6 August, 2019; v1 submitted 4 April, 2019; originally announced April 2019.

    Comments: Accepted to appear at ICCV 2019

  21. arXiv:1812.06401  [pdf, other]

    cs.AI cs.CL cs.LG

    What's to know? Uncertainty as a Guide to Asking Goal-oriented Questions

    Authors: Ehsan Abbasnejad, Qi Wu, Javen Shi, Anton van den Hengel

    Abstract: One of the core challenges in Visual Dialogue problems is asking the question that will provide the most useful information towards achieving the required objective. Encouraging an agent to ask the right questions is difficult because we don't know a-priori what information the agent will need to achieve its task, and we don't have an explicit model of what it knows already. We propose a solution…

    Submitted 16 December, 2018; originally announced December 2018.

  22. arXiv:1812.06398  [pdf, other]

    cs.LG stat.ML

    Gold Seeker: Information Gain from Policy Distributions for Goal-oriented Vision-and-Language Reasoning

    Authors: Ehsan Abbasnejad, Iman Abbasnejad, Qi Wu, Javen Shi, Anton van den Hengel

    Abstract: As Computer Vision moves from a passive analysis of pixels to active analysis of semantics, the breadth of information algorithms need to reason over has expanded significantly. One of the key challenges in this vein is the ability to identify the information required to make a decision, and select an action that will recover it. We propose a reinforcement-learning approach that maintains a distri…

    Submitted 29 March, 2020; v1 submitted 16 December, 2018; originally announced December 2018.

  23. arXiv:1812.04794  [pdf, other]

    cs.CV

    Neighbourhood Watch: Referring Expression Comprehension via Language-guided Graph Attention Networks

    Authors: Peng Wang, Qi Wu, Jiewei Cao, Chunhua Shen, Lianli Gao, Anton van den Hengel

    Abstract: The task in referring expression comprehension is to localise the object instance in an image described by a referring expression phrased in natural language. As a language-to-vision matching task, the key to this problem is to learn a discriminative object feature that can adapt to the expression used. To avoid ambiguity, the expression normally tends to describe not only the properties of the re…

    Submitted 11 December, 2018; originally announced December 2018.

  24. arXiv:1811.11903  [pdf, other]

    cs.CV

    Visual Question Answering as Reading Comprehension

    Authors: Hui Li, Peng Wang, Chunhua Shen, Anton van den Hengel

    Abstract: Visual question answering (VQA) demands simultaneous comprehension of both the image visual content and natural language questions. In some cases, the reasoning needs the help of common sense or general knowledge which usually appear in the form of text. Current methods jointly embed both the visual information and the textual feature into the same space. However, how to model the complex interact…

    Submitted 28 November, 2018; originally announced November 2018.

  25. arXiv:1810.10117  [pdf, other]

    cs.CV

    End-to-End Diagnosis and Segmentation Learning from Cardiac Magnetic Resonance Imaging

    Authors: Gerard Snaauw, Dong Gong, Gabriel Maicas, Anton van den Hengel, Wiro J. Niessen, Johan Verjans, Gustavo Carneiro

    Abstract: Cardiac magnetic resonance (CMR) is used extensively in the diagnosis and management of cardiovascular disease. Deep learning methods have proven to deliver segmentation results comparable to human experts in CMR imaging, but there have been no convincing results for the problem of end-to-end segmentation and diagnosis from CMR. This is in part due to a lack of sufficiently large datasets required…

    Submitted 23 October, 2018; originally announced October 2018.

    Comments: submitted to 2019 IEEE International Symposium on Biomedical Imaging (ISBI)

  26. MPTV: Matching Pursuit Based Total Variation Minimization for Image Deconvolution

    Authors: Dong Gong, Mingkui Tan, Qinfeng Shi, Anton van den Hengel, Yanning Zhang

    Abstract: Total variation (TV) regularization has proven effective for a range of computer vision tasks through its preferential weighting of sharp image edges. Existing TV-based methods, however, often suffer from the over-smoothing issue and solution bias caused by the homogeneous penalization. In this paper, we consider addressing these issues by applying inhomogeneous regularization on different image c…

    Submitted 12 October, 2018; originally announced October 2018.

  27. arXiv:1808.10075  [pdf, other]

    cs.CV

    Towards Effective Deep Embedding for Zero-Shot Learning

    Authors: Lei Zhang, Peng Wang, Lingqiao Liu, Chunhua Shen, Wei Wei, Yanning Zhang, Anton van den Hengel

    Abstract: Zero-shot learning (ZSL) can be formulated as a cross-domain matching problem: after being projected into a joint embedding space, a visual sample will match against all candidate class-level semantic descriptions and be assigned to the nearest class. In this process, the embedding space underpins the success of such matching and is crucial for ZSL. In this paper, we conduct an in-depth study on t…

    Submitted 10 December, 2018; v1 submitted 29 August, 2018; originally announced August 2018.

    Comments: Work in progress

  28. arXiv:1806.01576  [pdf, other]

    cs.CV

    Adaptive Importance Learning for Improving Lightweight Image Super-resolution Network

    Authors: Lei Zhang, Peng Wang, Chunhua Shen, Lingqiao Liu, Wei Wei, Yanning Zhang, Anton van den Hengel

    Abstract: Deep neural networks have achieved remarkable success in single image super-resolution (SISR). The computing and memory requirements of these methods have hindered their application to broad classes of real devices with limited computing power, however. One approach to this problem has been lightweight network architectures that balance the super-resolution performance and the computation burden…

    Submitted 5 June, 2018; originally announced June 2018.

    Comments: 16 pages

  29. arXiv:1804.03368  [pdf, other]

    cs.CV

    Learning Deep Gradient Descent Optimization for Image Deconvolution

    Authors: Dong Gong, Zhen Zhang, Qinfeng Shi, Anton van den Hengel, Chunhua Shen, Yanning Zhang

    Abstract: As an integral component of blind image deblurring, non-blind deconvolution removes image blur with a given blur kernel, which is essential but difficult due to the ill-posed nature of the inverse problem. The predominant approach is based on optimization subject to regularization functions that are either manually designed, or learned from examples. Existing learning based methods have shown supe…

    Submitted 17 February, 2020; v1 submitted 10 April, 2018; originally announced April 2018.

  30. arXiv:1712.00213  [pdf, other]

    cs.CV

    Real-time Semantic Image Segmentation via Spatial Sparsity

    Authors: Zifeng Wu, Chunhua Shen, Anton van den Hengel

    Abstract: We propose an approach to semantic (image) segmentation that reduces the computational costs by a factor of 25 with limited impact on the quality of results. Semantic segmentation has a number of practical applications, and for most such applications the computational costs are critical. The method follows a typical two-column network structure, where one column accepts an input image, while the o…

    Submitted 1 December, 2017; originally announced December 2017.

  31. arXiv:1711.08105  [pdf, other]

    cs.CV

    Visual Question Answering as a Meta Learning Task

    Authors: Damien Teney, Anton van den Hengel

    Abstract: The predominant approach to Visual Question Answering (VQA) demands that the model represents within its weights all of the information required to answer any question about any image. Learning this information from any real training set seems unlikely, and representing it in a reasonable number of weights doubly so. We propose instead to approach VQA as a meta learning task, thus separating the q…

    Submitted 21 November, 2017; originally announced November 2017.

  32. arXiv:1711.07614  [pdf, other]

    cs.CV cs.AI cs.CL

    Asking the Difficult Questions: Goal-Oriented Visual Question Generation via Intermediate Rewards

    Authors: Junjie Zhang, Qi Wu, Chunhua Shen, Jian Zhang, Jianfeng Lu, Anton van den Hengel

    Abstract: Despite significant progress in a variety of vision-and-language problems, developing a method capable of asking intelligent, goal-oriented questions about images is proven to be an inscrutable challenge. Towards this end, we propose a Deep Reinforcement Learning framework based on three new intermediate rewards, namely goal-achieved, progressive and informativeness that encourage the generation o…

    Submitted 20 November, 2017; originally announced November 2017.

  33. arXiv:1711.07613  [pdf, other]

    cs.CV cs.AI cs.CL

    Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning

    Authors: Qi Wu, Peng Wang, Chunhua Shen, Ian Reid, Anton van den Hengel

    Abstract: The Visual Dialogue task requires an agent to engage in a conversation about an image with a human. It represents an extension of the Visual Question Answering task in that the agent needs to answer a question about an image, but it needs to do so in light of the previous dialogue that has taken place. The key challenge in Visual Dialogue is thus maintaining a consistent, and natural dialogue whil…

    Submitted 20 November, 2017; originally announced November 2017.

  34. arXiv:1711.07280  [pdf, other]

    cs.CV cs.AI cs.CL cs.RO

    Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments

    Authors: Peter Anderson, Qi Wu, Damien Teney, Jake Bruce, Mark Johnson, Niko Sünderhauf, Ian Reid, Stephen Gould, Anton van den Hengel

    Abstract: A robot that can carry out a natural-language instruction has been a dream since before the Jetsons cartoon series imagined a life of leisure mediated by a fleet of attentive robot helpers. It is a dream that remains stubbornly distant. However, recent advances in vision and language methods have made incredible progress in closely related areas. This is significant because a robot interpreting a…

    Submitted 5 April, 2018; v1 submitted 20 November, 2017; originally announced November 2017.

    Comments: CVPR 2018 Spotlight presentation

  35. arXiv:1711.06370  [pdf, other]

    cs.CV

    Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries

    Authors: Bohan Zhuang, Qi Wu, Chunhua Shen, Ian Reid, Anton van den Hengel

    Abstract: Recognising objects according to a pre-defined fixed set of class labels has been well studied in the Computer Vision. There are a great many practical applications where the subjects that may be of interest are not known beforehand, or so easily delineated, however. In many of these cases natural language dialog is a natural way to specify the subject of interest, and the task achieving this capa…

    Submitted 16 November, 2017; originally announced November 2017.

    Comments: 11 pages

  36. arXiv:1708.02711  [pdf, other]

    cs.CV cs.CL

    Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge

    Authors: Damien Teney, Peter Anderson, Xiaodong He, Anton van den Hengel

    Abstract: This paper presents a state-of-the-art model for visual question answering (VQA), which won the first place in the 2017 VQA Challenge. VQA is a task of significant importance for research in artificial intelligence, given its multimodal nature, clear evaluation protocol, and potential real-world applications. The performance of deep neural networks for VQA is very dependent on choices of architect…

    Submitted 9 August, 2017; originally announced August 2017.

    Comments: Winner of the 2017 Visual Question Answering (VQA) Challenge at CVPR

  37. arXiv:1708.01008  [pdf, other]

    cs.CV

    Beyond Low Rank: A Data-Adaptive Tensor Completion Method

    Authors: Lei Zhang, Wei Wei, Qinfeng Shi, Chunhua Shen, Anton van den Hengel, Yanning Zhang

    Abstract: Low rank tensor representation underpins much of recent progress in tensor completion. In real applications, however, this approach is confronted with two challenging problems, namely (1) tensor rank determination; (2) handling real tensor data which only approximately fulfils the low-rank requirement. To address these two issues, we develop a data-adaptive tensor completion model which explicitly…

    Submitted 3 August, 2017; originally announced August 2017.

    Comments: 14 pages, 5 figures

  38. arXiv:1707.05956  [pdf, other]

    cs.CV

    When Unsupervised Domain Adaptation Meets Tensor Representations

    Authors: Hao Lu, Lei Zhang, Zhiguo Cao, Wei Wei, Ke Xian, Chunhua Shen, Anton van den Hengel

    Abstract: Domain adaption (DA) allows machine learning methods trained on data sampled from one distribution to be applied to data sampled from another. It is thus of great practical importance to the application of such methods. Despite the fact that tensor representations are widely used in Computer Vision to capture multi-linear relationships that affect the data, most existing DA methods are applicable…

    Submitted 19 July, 2017; originally announced July 2017.

    Comments: 16 pages. Accepted to Proc. Int. Conf. Computer Vision (ICCV 2017)

  39. arXiv:1707.05427  [pdf, other]

    cs.CV

    Visually Aligned Word Embeddings for Improving Zero-shot Learning

    Authors: Ruizhi Qiao, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

    Abstract: Zero-shot learning (ZSL) highly depends on a good semantic embedding to connect the seen and unseen classes. Recently, distributed word embeddings (DWE) pre-trained from large text corpus have become a popular choice to draw such a connection. Compared with human defined attributes, DWEs are more scalable and easier to obtain. However, they are designed to reflect semantic similarity rather than v…

    Submitted 17 July, 2017; originally announced July 2017.

    Comments: Appearing in Proc. British Mach. Vis. Conf. (BMVC) 2017

  40. arXiv:1707.04968  [pdf, other]

    cs.CV cs.CL

    Visual Question Answering with Memory-Augmented Networks

    Authors: Chao Ma, Chunhua Shen, Anthony Dick, Qi Wu, Peng Wang, Anton van den Hengel, Ian Reid

    Abstract: In this paper, we exploit a memory-augmented neural network to predict accurate answers to visual questions, even when those answers occur rarely in the training set. The memory network incorporates both internal and external memory blocks and selectively pays attention to each training exemplar. We show that memory-augmented neural networks are able to maintain a relatively long-term memory of sc…

    Submitted 25 March, 2018; v1 submitted 16 July, 2017; originally announced July 2017.

    Comments: CVPR 2018

  41. arXiv:1706.05477  [pdf, other]

    cs.LG cs.AI stat.ML

    Bayesian Conditional Generative Adversarial Networks

    Authors: M. Ehsan Abbasnejad, Qinfeng Shi, Iman Abbasnejad, Anton van den Hengel, Anthony Dick

    Abstract: Traditional GANs use a deterministic generator function (typically a neural network) to transform a random noise input $z$ to a sample $\mathbf{x}$ that the discriminator seeks to distinguish. We propose a new GAN called Bayesian Conditional Generative Adversarial Networks (BC-GANs) that use a random generator function to transform a deterministic input $y'$ to a sample $\mathbf{x}$. Our BC-GANs e…

    Submitted 17 June, 2017; originally announced June 2017.

  42. arXiv:1705.09892  [pdf, other]

    cs.CV

    Care about you: towards large-scale human-centric visual relationship detection

    Authors: Bohan Zhuang, Qi Wu, Chunhua Shen, Ian Reid, Anton van den Hengel

    Abstract: Visual relationship detection aims to capture interactions between pairs of objects in images. Relationships between objects and humans represent a particularly important subset of this problem, with implications for challenges such as understanding human behaviour, and identifying affordances, amongst others. In addressing this problem we first construct a large-scale human-centric visual relatio…

    Submitted 28 May, 2017; originally announced May 2017.

  43. arXiv:1612.05386  [pdf, other]

    cs.CV

    The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions

    Authors: Peng Wang, Qi Wu, Chunhua Shen, Anton van den Hengel

    Abstract: One of the most intriguing features of the Visual Question Answering (VQA) challenge is the unpredictability of the questions. Extracting the information required to answer them demands a variety of image operations from detection and counting, to segmentation and reconstruction. To train a method to perform even one of these operations accurately from {image,question,answer} tuples would be chall…

    Submitted 16 December, 2016; originally announced December 2016.

  44. arXiv:1612.02583  [pdf, other]

    cs.CV

    From Motion Blur to Motion Flow: a Deep Learning Solution for Removing Heterogeneous Motion Blur

    Authors: Dong Gong, Jie Yang, Lingqiao Liu, Yanning Zhang, Ian Reid, Chunhua Shen, Anton van den Hengel, Qinfeng Shi

    Abstract: Removing pixel-wise heterogeneous motion blur is challenging due to the ill-posed nature of the problem. The predominant solution is to estimate the blur kernel by adding a prior, but the extensive literature on the subject indicates the difficulty in identifying a prior which is suitably informative, and general. Rather than imposing a prior based on theory, we propose instead to learn one from t…

    Submitted 8 December, 2016; originally announced December 2016.

  45. arXiv:1611.10080  [pdf, other]

    cs.CV

    Wider or Deeper: Revisiting the ResNet Model for Visual Recognition

    Authors: Zifeng Wu, Chunhua Shen, Anton van den Hengel

    Abstract: The trend towards increasingly deep neural networks has been driven by a general observation that increasing depth increases the performance of a network. Recently, however, evidence has been amassing that simply increasing depth may not be the best way to increase performance, particularly given other limitations. Investigations into deep residual networks have also suggested that they may not in…

    Submitted 30 November, 2016; originally announced November 2016.

    Comments: Code available at: https://github.com/itijyou/ademxapp

  46. arXiv:1611.09967  [pdf, other]

    cs.CV

    Sequential Person Recognition in Photo Albums with a Recurrent Network

    Authors: Yao Li, Guosheng Lin, Bohan Zhuang, Lingqiao Liu, Chunhua Shen, Anton van den Hengel

    Abstract: Recognizing the identities of people in everyday photos is still a very challenging problem for machine vision, due to non-frontal faces, changes in clothing, location, lighting and similar. Recent studies have shown that rich relational information between people in the same photo can help in recognizing their identities. In this work, we propose to model the relational information between people…

    Submitted 29 November, 2016; originally announced November 2016.

  47. arXiv:1611.07800  [pdf, other]

    cs.LG stat.ML

    Infinite Variational Autoencoder for Semi-Supervised Learning

    Authors: Ehsan Abbasnejad, Anthony Dick, Anton van den Hengel

    Abstract: This paper presents an infinite variational autoencoder (VAE) whose capacity adapts to suit the input data. This is achieved using a mixture model where the mixing coefficients are modeled by a Dirichlet process, allowing us to integrate over the coefficients when performing inference. Critically, this then allows us to automatically vary the number of autoencoders in the mixture based on the data…

    Submitted 23 November, 2016; v1 submitted 23 November, 2016; originally announced November 2016.

  48. arXiv:1611.05546  [pdf, other]

    cs.CV cs.AI cs.CL

    Zero-Shot Visual Question Answering

    Authors: Damien Teney, Anton van den Hengel

    Abstract: Part of the appeal of Visual Question Answering (VQA) is its promise to answer new questions about previously unseen images. Most current methods demand training questions that illustrate every possible concept, and will therefore never achieve this capability, since the volume of required training data would be prohibitive. Answering general questions about images requires methods capable of Zero…

    Submitted 20 November, 2016; v1 submitted 16 November, 2016; originally announced November 2016.

  49. arXiv:1611.01773  [pdf, other]

    cs.CV

    The Shallow End: Empowering Shallower Deep-Convolutional Networks through Auxiliary Outputs

    Authors: Yong Guo, Jian Chen, Qing Du, Anton Van Den Hengel, Qinfeng Shi, Mingkui Tan

    Abstract: Depth is one of the key factors behind the success of convolutional neural networks (CNNs). Since ResNet, we are able to train very deep CNNs as the gradient vanishing issue has been largely addressed by the introduction of skip connections. However, we observe that, when the depth is very large, the intermediate layers (especially shallow layers) may fail to receive sufficient supervision from th…

    Submitted 15 February, 2020; v1 submitted 6 November, 2016; originally announced November 2016.

  50. arXiv:1609.05600  [pdf, other]

    cs.CV cs.AI cs.CL

    Graph-Structured Representations for Visual Question Answering

    Authors: Damien Teney, Lingqiao Liu, Anton van den Hengel

    Abstract: This paper proposes to improve visual question answering (VQA) with structured representations of both scene contents and questions. A key challenge in VQA is to require joint reasoning over the visual and text domains. The predominant CNN/LSTM-based approach to VQA is limited by monolithic vector representations that largely ignore structure in the scene and in the form of the question. CNN featu…

    Submitted 30 March, 2017; v1 submitted 19 September, 2016; originally announced September 2016.