
Showing 1–50 of 168 results for author: Hengel, A V D

  1. arXiv:2407.02880  [pdf, other]

    cs.LG cs.AI cs.CV

    Knowledge Composition using Task Vectors with Learned Anisotropic Scaling

    Authors: Frederic Z. Zhang, Paul Albert, Cristian Rodriguez-Opazo, Anton van den Hengel, Ehsan Abbasnejad

    Abstract: Pre-trained models produce strong generic representations that can be adapted via fine-tuning. The learned weight difference relative to the pre-trained model, known as a task vector, characterises the direction and stride of fine-tuning. The significance of task vectors is such that simple arithmetic operations on them can be used to combine diverse representations from different domains. This pa…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2405.17139  [pdf, other]

    cs.CV cs.AI cs.LG

    Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling

    Authors: Cristian Rodriguez-Opazo, Ehsan Abbasnejad, Damien Teney, Edison Marrese-Taylor, Hamed Damirchi, Anton van den Hengel

    Abstract: Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning. Various architectures, from vision transformers (ViTs) to convolutional networks (ResNets), have been trained with CLIP to serve as general solutions to diverse vision tasks. This paper explores the differences across various CLIP-trained vision backbones. Despite using the same data an…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2312.14400

  3. arXiv:2404.13298  [pdf, other]

    cs.IR eess.SY

    MARec: Metadata Alignment for cold-start Recommendation

    Authors: Julien Monteil, Volodymyr Vaskovych, Wentao Lu, Anirban Majumder, Anton van den Hengel

    Abstract: For many recommender systems the primary data source is a historical record of user clicks. The associated click matrix is often very sparse, however, as the number of users x products can be far larger than the number of clicks, and such sparsity is accentuated in cold-start settings. The sparsity of the click matrix is the reason matrix factorization and autoencoder techniques remain high…

    Submitted 20 April, 2024; originally announced April 2024.

  4. arXiv:2404.02388  [pdf, other]

    cs.CV

    CAPE: CAM as a Probabilistic Ensemble for Enhanced DNN Interpretation

    Authors: Townim Faisal Chowdhury, Kewen Liao, Vu Minh Hieu Phan, Minh-Son To, Yutong Xie, Kevin Hung, David Ross, Anton van den Hengel, Johan W. Verjans, Zhibin Liao

    Abstract: Deep Neural Networks (DNNs) are widely used for visual classification tasks, but their complex computation process and black-box nature hinder decision transparency and interpretability. Class activation maps (CAMs) and recent variants provide ways to visually explain the DNN decision-making process by displaying 'attention' heatmaps of the DNNs. Nevertheless, the CAM explanation only offers relat…

    Submitted 4 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  5. arXiv:2403.15711  [pdf, other]

    cs.LG stat.ME stat.ML

    Identifiable Latent Neural Causal Models

    Authors: Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

    Abstract: Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data. It is particularly good at predictions under unseen distribution shifts, because these shifts can generally be interpreted as consequences of interventions. Hence leveraging seen distribution shifts becomes a natural strategy to help identify causal representations, which in…

    Submitted 23 March, 2024; originally announced March 2024.

  6. arXiv:2403.07356  [pdf, other]

    cs.CV cs.LG

    Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning

    Authors: Mark D. McDonnell, Dong Gong, Ehsan Abbasnejad, Anton van den Hengel

    Abstract: Continual learning requires a model to adapt to ongoing changes in the data distribution, and often to the set of tasks to be performed. It is rare, however, that the data and task changes are completely unpredictable. Given a description of an overarching goal or data theme, which we call a realm, humans can often guess what concepts are associated with it. We show here that the combination of a…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 31 pages total (14 main paper, 5 references, 12 appendices)

  7. arXiv:2402.18842  [pdf, other]

    cs.CV

    ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

    Authors: Xianghui Yang, Yan Zuo, Sameera Ramasinghe, Loris Bazzani, Gil Avraham, Anton van den Hengel

    Abstract: Novel-view synthesis through diffusion models has demonstrated remarkable potential for generating diverse and high-quality images. Yet, the independent process of image generation in these prevailing methods leads to challenges in maintaining multiple-view consistency. To address this, we introduce ViewFusion, a novel, training-free algorithm that can be seamlessly integrated into existing pre-tr…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: CVPR 2024, homepage: https://wi-sc.github.io/ViewFusion.github.io/

  8. arXiv:2402.12636  [pdf, other]

    cs.CL

    StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing

    Authors: Gaoxiang Cong, Yuankai Qi, Liang Li, Amin Beheshti, Zhedong Zhang, Anton van den Hengel, Ming-Hsuan Yang, Chenggang Yan, Qingming Huang

    Abstract: Given a script, the challenge in Movie Dubbing (Visual Voice Cloning, V2C) is to generate speech that aligns well with the video in both time and emotion, based on the tone of a reference audio track. Existing state-of-the-art V2C models break the phonemes in the script according to the divisions between video frames, which solves the temporal alignment problem but leads to incomplete phoneme pron…

    Submitted 1 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  9. arXiv:2402.06223  [pdf, other]

    cs.LG cs.CV stat.ML

    Revealing Multimodal Contrastive Representation Learning through Latent Partial Causal Models

    Authors: Yuhang Liu, Zhen Zhang, Dong Gong, Biwei Huang, Mingming Gong, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

    Abstract: Multimodal contrastive representation learning methods have proven successful across a range of domains, partly due to their ability to generate meaningful shared representations of complex phenomena. To enhance the depth of analysis and understanding of these acquired representations, we introduce a unified causal model specifically designed for multimodal data. By examining this model, we show t…

    Submitted 9 February, 2024; originally announced February 2024.

  10. arXiv:2402.01157  [pdf, other]

    cs.CV

    Source-Free Unsupervised Domain Adaptation with Hypothesis Consolidation of Prediction Rationale

    Authors: Yangyang Shu, Xiaofeng Cao, Qi Chen, Bowen Zhang, Ziqin Zhou, Anton van den Hengel, Lingqiao Liu

    Abstract: Source-Free Unsupervised Domain Adaptation (SFUDA) is a challenging task where a model needs to be adapted to a new domain without access to target domain labels or source domain data. The primary difficulty in this task is that the model's predictions may be inaccurate, and using these inaccurate predictions for model adaptation can lead to misleading results. To address this issue, this paper pr…

    Submitted 2 February, 2024; originally announced February 2024.

  11. arXiv:2312.14400  [pdf, other]

    cs.CV

    Unveiling Backbone Effects in CLIP: Exploring Representational Synergies and Variances

    Authors: Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Ehsan Abbasnejad, Hamed Damirchi, Ignacio M. Jara, Felipe Bravo-Marquez, Anton van den Hengel

    Abstract: Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning. Various neural architectures, spanning Transformer-based models like Vision Transformers (ViTs) to Convolutional Networks (ConvNets) like ResNets, are trained with CLIP and serve as universal backbones across diverse vision tasks. Despite utilizing the same data and training objectives…

    Submitted 21 December, 2023; originally announced December 2023.

  12. arXiv:2312.05923  [pdf, other]

    cs.CV

    Weakly Supervised Video Individual Counting

    Authors: Xinyan Liu, Guorong Li, Yuankai Qi, Ziheng Yan, Zhenjun Han, Anton van den Hengel, Ming-Hsuan Yang, Qingming Huang

    Abstract: Video Individual Counting (VIC) aims to predict the number of unique individuals in a single video. Existing methods learn representations based on trajectory labels for individuals, which are annotation-expensive. To provide a more realistic reflection of the underlying practical challenge, we introduce a weakly supervised VIC task, wherein trajectory labels are not provided. Instead, two typ…

    Submitted 10 December, 2023; originally announced December 2023.

  13. arXiv:2311.17949  [pdf, other]

    cs.CV

    Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines

    Authors: Hamed Damirchi, Cristian Rodríguez-Opazo, Ehsan Abbasnejad, Damien Teney, Javen Qinfeng Shi, Stephen Gould, Anton van den Hengel

    Abstract: Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box. The Web likely contains the information necessary to excel on any specific application, but identifying the right data a priori is challenging. This paper shows how to leverage recent advances in NLP and multi-modal le…

    Submitted 29 November, 2023; originally announced November 2023.

  14. arXiv:2310.15580  [pdf, other]

    cs.LG

    Identifiable Latent Polynomial Causal Models Through the Lens of Change

    Authors: Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

    Abstract: Causal representation learning aims to unveil latent high-level causal representations from observed low-level data. One of its primary tasks is to provide reliable assurance of identifying these latent causal models, known as identifiability. A recent breakthrough explores identifiability by leveraging the change of causal influences among latent causal variables across multiple environments \cit…

    Submitted 24 October, 2023; originally announced October 2023.

  15. arXiv:2308.11158  [pdf, other]

    cs.CV

    Domain Generalization via Rationale Invariance

    Authors: Liang Chen, Yong Zhang, Yibing Song, Anton van den Hengel, Lingqiao Liu

    Abstract: This paper offers a new perspective to ease the challenge of domain generalization, which involves maintaining robust results even in unseen environments. Our design focuses on the decision-making process in the final classifier layer. Specifically, we propose treating the element-wise contributions to the final results as the rationale for making a decision and representing the rationale for each…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted in ICCV 2023

  16. arXiv:2307.02251  [pdf, other]

    cs.LG cs.CV

    RanPAC: Random Projections and Pre-trained Models for Continual Learning

    Authors: Mark D. McDonnell, Dong Gong, Amin Parveneh, Ehsan Abbasnejad, Anton van den Hengel

    Abstract: Continual learning (CL) aims to incrementally learn different tasks (such as classification) in a non-stationary data stream without forgetting old ones. Most CL works focus on tackling catastrophic forgetting under a learning-from-scratch paradigm. However, with the increasing prominence of foundation models, pre-trained models equipped with informative representations have become available for v…

    Submitted 15 January, 2024; v1 submitted 5 July, 2023; originally announced July 2023.

    Comments: 32 pages, 11 figures

    Journal ref: 37th Annual Conference on Neural Information Processing Systems (NeurIPS 2023), Dec 2023, New Orleans, United States

  17. arXiv:2306.06844  [pdf, other]

    stat.ML cs.LG

    Provably Efficient Bayesian Optimization with Unknown Gaussian Process Hyperparameter Estimation

    Authors: Huong Ha, Vu Nguyen, Hung Tran-The, Hongyu Zhang, Xiuzhen Zhang, Anton van den Hengel

    Abstract: Gaussian process (GP) based Bayesian optimization (BO) is a powerful method for optimizing black-box functions efficiently. The practical performance and theoretical guarantees of this approach depend on having the correct GP hyperparameter values, which are usually unknown in advance and need to be estimated from the observed data. However, in practice, these estimations could be incorrect due to…

    Submitted 6 June, 2024; v1 submitted 11 June, 2023; originally announced June 2023.

    Comments: 25 pages, 5 figures

  18. arXiv:2304.02199  [pdf, other]

    cs.CV

    Knowledge Combination to Learn Rotated Detection Without Rotated Annotation

    Authors: Tianyu Zhu, Bryce Ferenczi, Pulak Purkait, Tom Drummond, Hamid Rezatofighi, Anton van den Hengel

    Abstract: Rotated bounding boxes drastically reduce output ambiguity of elongated objects, making them superior to axis-aligned bounding boxes. Despite the effectiveness, rotated detectors are not widely employed. Annotating rotated bounding boxes is such a laborious process that they are not provided in many detection datasets where axis-aligned annotations are used instead. In this paper, we propose a frame…

    Submitted 4 May, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: 10 pages, 5 figures, Accepted by CVPR 2023

  19. arXiv:2303.17127  [pdf, other]

    cs.CV

    Adaptive Cross Batch Normalization for Metric Learning

    Authors: Thalaiyasingam Ajanthan, Matt Ma, Anton van den Hengel, Stephen Gould

    Abstract: Metric learning is a fundamental problem in computer vision whereby a model is trained to learn a semantically useful embedding space via ranking losses. Traditionally, the effectiveness of a ranking loss depends on the minibatch size, and is, therefore, inherently limited by the memory constraints of the underlying hardware. While simply accumulating the embeddings across minibatches has proved u…

    Submitted 29 March, 2023; originally announced March 2023.

  20. arXiv:2303.01669  [pdf, other]

    cs.CV cs.LG

    Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems

    Authors: Yangyang Shu, Anton van den Hengel, Lingqiao Liu

    Abstract: Self-supervised learning (SSL) strategies have demonstrated remarkable performance in various recognition tasks. However, both our preliminary investigation and recent studies suggest that they may be less effective in learning representations for fine-grained visual recognition (FGVR) since many features helpful for optimizing SSL objectives are not suitable for characterizing the subtle differen…

    Submitted 27 July, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: CVPR 2023

  21. arXiv:2302.13543  [pdf, other]

    cs.CV

    BLiRF: Bandlimited Radiance Fields for Dynamic Scene Modeling

    Authors: Sameera Ramasinghe, Violetta Shevchenko, Gil Avraham, Anton Van Den Hengel

    Abstract: Reasoning about the 3D structure of a non-rigid dynamic scene from a single moving camera is an under-constrained problem. Inspired by the remarkable progress of neural radiance fields (NeRFs) in photo-realistic novel view synthesis of static scenes, extensions have been proposed for dynamic settings. These methods heavily rely on neural priors in order to regularize the problem. In this work, we take a…

    Submitted 24 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

  22. arXiv:2302.00178  [pdf, other]

    cs.CV cs.CL cs.LG

    Program Generation from Diverse Video Demonstrations

    Authors: Anthony Manchin, Jamie Sherrah, Qi Wu, Anton van den Hengel

    Abstract: The ability to use inductive reasoning to extract general rules from multiple observations is a vital indicator of intelligence. As humans, we use this ability to not only interpret the world around us, but also to predict the outcomes of the various interactions we experience. Generalising over multiple observations is a task that has historically presented difficulties for machines to grasp, esp…

    Submitted 31 January, 2023; originally announced February 2023.

  23. arXiv:2212.11491  [pdf, other]

    cs.LG cs.CV

    Understanding and Improving the Role of Projection Head in Self-Supervised Learning

    Authors: Kartik Gupta, Thalaiyasingam Ajanthan, Anton van den Hengel, Stephen Gould

    Abstract: Self-supervised learning (SSL) aims to produce useful feature representations without access to any human-labeled data annotations. Due to the success of recent SSL methods based on contrastive learning, such as SimCLR, this problem has gained popularity. Most current contrastive learning approaches append a parametrized projection head to the end of some backbone network to optimize the InfoNCE o…

    Submitted 22 December, 2022; originally announced December 2022.

  24. arXiv:2208.14161  [pdf, other]

    cs.LG stat.ML

    Identifiable Latent Causal Content for Domain Adaptation under Latent Covariate Shift

    Authors: Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

    Abstract: Multi-source domain adaptation (MSDA) addresses the challenge of learning a label prediction function for an unlabeled target domain by leveraging both the labeled data from multiple source domains and the unlabeled data from the target domain. Conventional MSDA approaches often rely on covariate shift or conditional shift paradigms, which assume a consistent label distribution across domains. How…

    Submitted 31 March, 2024; v1 submitted 30 August, 2022; originally announced August 2022.

  25. arXiv:2208.14153  [pdf, other]

    cs.LG stat.ML

    Identifying Weight-Variant Latent Causal Models

    Authors: Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

    Abstract: The task of causal representation learning aims to uncover latent higher-level causal representations that affect lower-level observations. Identifying true latent causal representations from observed data, while allowing instantaneous causal relations among latent variables, remains a challenge, however. To this end, we start from the analysis of three intrinsic properties in identifying latent s…

    Submitted 20 February, 2023; v1 submitted 30 August, 2022; originally announced August 2022.

  26. arXiv:2208.13138  [pdf, other]

    cs.CV

    ClusTR: Exploring Efficient Self-attention via Clustering for Vision Transformers

    Authors: Yutong Xie, Jianpeng Zhang, Yong Xia, Anton van den Hengel, Qi Wu

    Abstract: Although Transformers have successfully transitioned from their language modelling origins to image-based applications, their quadratic computational complexity remains a challenge, particularly for dense prediction. In this paper we propose a content-based sparse attention method, as an alternative to dense self-attention, aiming to reduce the computation complexity while retaining the ability to…

    Submitted 28 August, 2022; originally announced August 2022.

    Comments: 14 pages

  27. arXiv:2206.14355  [pdf, other]

    cs.CV cs.CL cs.LG

    EBMs vs. CL: Exploring Self-Supervised Visual Pretraining for Visual Question Answering

    Authors: Violetta Shevchenko, Ehsan Abbasnejad, Anthony Dick, Anton van den Hengel, Damien Teney

    Abstract: The availability of clean and diverse labeled data is a major roadblock for training models on complex tasks such as visual question answering (VQA). The extensive work on large vision-and-language models has shown that self-supervised learning is effective for pretraining multimodal interactions. In this technical report, we focus on visual representations. We review and evaluate self-supervised…

    Submitted 28 June, 2022; originally announced June 2022.

  28. arXiv:2206.05880  [pdf, other]

    cs.LG

    Confident Sinkhorn Allocation for Pseudo-Labeling

    Authors: Vu Nguyen, Hisham Husain, Sachin Farfade, Anton van den Hengel

    Abstract: Semi-supervised learning is a critical tool in reducing machine learning's dependence on labeled data. It has been successfully applied to structured data, such as images and natural language, by exploiting the inherent spatial and semantic structure therein with pretrained models or data augmentation. These methods are not applicable, however, when the data does not have the appropriate structure…

    Submitted 5 March, 2024; v1 submitted 12 June, 2022; originally announced June 2022.

    Comments: Code https://github.com/amzn/confident-sinkhorn-allocation

  29. arXiv:2204.11402  [pdf, other]

    cs.CV

    PointInst3D: Segmenting 3D Instances by Points

    Authors: Tong He, Wei Yin, Chunhua Shen, Anton van den Hengel

    Abstract: The current state-of-the-art methods in 3D instance segmentation typically involve a clustering step, despite the tendency towards heuristics, greedy algorithms, and a lack of robustness to the changes in data statistics. In contrast, we propose a fully-convolutional 3D point cloud instance segmentation method that works in a per-point prediction fashion. In doing so it avoids the challenges that…

    Submitted 12 July, 2022; v1 submitted 24 April, 2022; originally announced April 2022.

    Comments: Accepted by ECCV22. Code and model will be released at https://github.com/tonghe90/PointInst3D

  30. CNN Attention Guidance for Improved Orthopedics Radiographic Fracture Classification

    Authors: Zhibin Liao, Kewen Liao, Haifeng Shen, Marouska F. van Boxel, Jasper Prijs, Ruurd L. Jaarsma, Job N. Doornberg, Anton van den Hengel, Johan W. Verjans

    Abstract: Convolutional neural networks (CNNs) have gained significant popularity in orthopedic imaging in recent years due to their ability to solve fracture classification problems. A common criticism of CNNs is their opaque learning and reasoning process, making it difficult to trust machine diagnosis and the subsequent adoption of such algorithms in clinical setting. This is especially true when the CNN…

    Submitted 20 March, 2022; originally announced March 2022.

    Comments: 12 pages, Published in IEEE Journal of Biomedical and Health Informatics

  31. arXiv:2203.07034  [pdf, other]

    cs.CV

    Active Learning by Feature Mixing

    Authors: Amin Parvaneh, Ehsan Abbasnejad, Damien Teney, Reza Haffari, Anton van den Hengel, Javen Qinfeng Shi

    Abstract: The promise of active learning (AL) is to reduce labelling costs by selecting the most valuable examples to annotate from a pool of unlabelled data. Identifying these examples is especially challenging with high-dimensional data (e.g. images, videos) and in low-data regimes. In this paper, we propose a novel method for batch AL called ALFA-Mix. We identify unlabelled instances with sufficiently-di…

    Submitted 14 March, 2022; originally announced March 2022.

    Comments: CVPR 2022

  32. arXiv:2203.02128  [pdf, other]

    cs.LG math.OC stat.ML

    Distributionally Robust Bayesian Optimization with $\varphi$-divergences

    Authors: Hisham Husain, Vu Nguyen, Anton van den Hengel

    Abstract: The study of robustness has received much attention due to its inevitability in data-driven settings where many systems face uncertainty. One such example of concern is Bayesian Optimization (BO), where uncertainty is multi-faceted, yet there only exists a limited number of works dedicated to this direction. In particular, there is the work of Kirschner et al. (2020), which bridges the existing li…

    Submitted 27 October, 2023; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: NeurIPS 2023 camera ready paper

  33. arXiv:2202.11233  [pdf, other]

    cs.CV

    Retrieval Augmented Classification for Long-Tail Visual Recognition

    Authors: Alexander Long, Wei Yin, Thalaiyasingam Ajanthan, Vu Nguyen, Pulak Purkait, Ravi Garg, Alan Blair, Chunhua Shen, Anton van den Hengel

    Abstract: We introduce Retrieval Augmented Classification (RAC), a generic approach to augmenting standard image classification pipelines with an explicit retrieval module. RAC consists of a standard base image encoder fused with a parallel retrieval branch that queries a non-parametric external memory of pre-encoded images and associated text snippets. We apply RAC to the problem of long-tail classificatio…

    Submitted 22 February, 2022; originally announced February 2022.

  34. arXiv:2202.10203  [pdf, other]

    cs.LG cs.CV

    Learning Bayesian Sparse Networks with Full Experience Replay for Continual Learning

    Authors: Dong Gong, Qingsen Yan, Yuhang Liu, Anton van den Hengel, Javen Qinfeng Shi

    Abstract: Continual Learning (CL) methods aim to enable machine learning models to learn new tasks without catastrophic forgetting of those that have been previously mastered. Existing CL approaches often keep a buffer of previously-seen samples, perform knowledge distillation, or use regularization techniques towards this goal. Despite their performance, they still suffer from interference across tasks whi…

    Submitted 21 February, 2022; originally announced February 2022.

  35. arXiv:2202.09517  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Deep Learning for Hate Speech Detection: A Comparative Study

    Authors: Jitendra Singh Malik, Hezhe Qiao, Guansong Pang, Anton van den Hengel

    Abstract: Automated hate speech detection is an important tool in combating the spread of hate speech, particularly in social media. Numerous methods have been developed for the task, including a recent proliferation of deep-learning based approaches. A variety of datasets have also been developed, exemplifying various manifestations of the hate-speech detection problem. We present here a large-scale empiri…

    Submitted 6 December, 2023; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: 18 pages, 4 figures, and 6 tables

  36. arXiv:2202.02002  [pdf, other]

    cs.CV

    Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings

    Authors: Wei Yin, Yifan Liu, Chunhua Shen, Baichuan Sun, Anton van den Hengel

    Abstract: We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting. It thus achieves results equivalent to those of the supervised methods, on each of the major semantic segmentation datasets, without training on those datasets. This is achieved by replacing each class label with a vector-valued embedding of a short paragraph t…

    Submitted 30 April, 2024; v1 submitted 4 February, 2022; originally announced February 2022.

    Comments: 14 pages. Accepted to Int. J. Comp. Vis. (IJCV)

  37. arXiv:2201.07412  [pdf, other]

    cs.CV

    Poseur: Direct Human Pose Regression with Transformers

    Authors: Weian Mao, Yongtao Ge, Chunhua Shen, Zhi Tian, Xinlong Wang, Zhibin Wang, Anton van den Hengel

    Abstract: We propose a direct, regression-based approach to 2D human pose estimation from single images. We formulate the problem as a sequence prediction task, which we solve using a Transformer network. This network directly learns a regression mapping from images to the keypoint coordinates, without resorting to intermediate representations such as heatmaps. This approach avoids much of the complexity as…

    Submitted 20 July, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: Accepted to Proc. Eur. Conf. Comp. Vision (ECCV) 2022

  38. arXiv:2112.10063  [pdf, other]

    cs.CV cs.AI cs.LG

    Deep Graph-level Anomaly Detection by Glocal Knowledge Distillation

    Authors: Rongrong Ma, Guansong Pang, Ling Chen, Anton van den Hengel

    Abstract: Graph-level anomaly detection (GAD) describes the problem of detecting graphs that are abnormal in their structure and/or the features of their nodes, as compared to other graphs. One of the challenges in GAD is to devise graph representations that enable the detection of both locally- and globally-anomalous graphs, i.e., graphs that are abnormal in their fine-grained (node-level) or holistic (gra…

    Submitted 19 December, 2021; originally announced December 2021.

    Comments: Accepted to WSDM 2022

  39. arXiv:2108.00462  [pdf, other]

    cs.CV cs.AI cs.LG

    Explainable Deep Few-shot Anomaly Detection with Deviation Networks

    Authors: Guansong Pang, Choubo Ding, Chunhua Shen, Anton van den Hengel

    Abstract: Existing anomaly detection paradigms overwhelmingly focus on training detection models using exclusively normal data or unlabeled data (mostly normal samples). One notorious issue with these approaches is that they are weak in discriminating anomalies from normal samples due to the lack of the knowledge about the anomalies. Here, we study the problem of few-shot anomaly detection, in which we aim…

    Submitted 1 August, 2021; originally announced August 2021.

    Comments: 16 pages, 8 figures, 5 tables

  40. arXiv:2107.08392  [pdf, other]

    cs.CV

    Dynamic Convolution for 3D Point Cloud Instance Segmentation

    Authors: Tong He, Chunhua Shen, Anton van den Hengel

    Abstract: We propose an approach to instance segmentation from 3D point clouds based on dynamic convolution. This enables it to adapt, at inference, to varying feature and object scales. Doing so avoids some pitfalls of bottom up approaches, including a dependence on hyper-parameter tuning and heuristic post-processing pipelines to compensate for the inevitable variability in object sizes, even within a sin…

    Submitted 14 October, 2022; v1 submitted 18 July, 2021; originally announced July 2021.

    Comments: Accepted to IEEE Trans. Pattern Analysis and Machine Intelligence. Extended version of arXiv:2011.13328

  41. arXiv:2105.05612  [pdf, other]

    cs.LG cs.CV

    Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization

    Authors: Damien Teney, Ehsan Abbasnejad, Simon Lucey, Anton van den Hengel

    Abstract: Neural networks trained with SGD were recently shown to rely preferentially on linearly-predictive features and can ignore complex, equally-predictive ones. This simplicity bias can explain their lack of robustness out of distribution (OOD). The more complex the task to learn, the more likely it is that statistical artifacts (i.e. selection biases, spurious correlations) are simpler than the mecha…

    Submitted 11 September, 2022; v1 submitted 12 May, 2021; originally announced May 2021.

    Comments: CVPR 2022

  42. arXiv:2104.04167  [pdf, other]

    cs.CL cs.CV

    The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation

    Authors: Yuankai Qi, Zizheng Pan, Yicong Hong, Ming-Hsuan Yang, Anton van den Hengel, Qi Wu

    Abstract: Vision-and-Language Navigation (VLN) requires an agent to find a path to a remote location on the basis of natural-language instructions and a set of photo-realistic panoramas. Most existing methods take the words in the instructions and the discrete views of each panorama as the minimal unit of encoding. However, this requires a model to match different nouns (e.g., TV, table) against the same in…

    Submitted 25 August, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: Original title: Know What and Know Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation

  43. arXiv:2103.00446

    cs.CV

    Learning for Visual Navigation by Imagining the Success

    Authors: Mahdi Kazemi Moghaddam, Ehsan Abbasnejad, Qi Wu, Javen Shi, Anton Van Den Hengel

    Abstract: Visual navigation is often cast as a reinforcement learning (RL) problem. Current methods typically result in a suboptimal policy that learns general obstacle avoidance and search behaviours. For example, in the target-object navigation setting, the policies learnt by traditional methods often fail to complete the task, even when the target is clearly within reach from a human perspective. In orde…

    Submitted 28 February, 2021; originally announced March 2021.

  44. arXiv:2101.06013

    cs.CV cs.LG

    Reasoning over Vision and Language: Exploring the Benefits of Supplemental Knowledge

    Authors: Violetta Shevchenko, Damien Teney, Anthony Dick, Anton van den Hengel

    Abstract: The limits of applicability of vision-and-language models are defined by the coverage of their training data. Tasks like vision question answering (VQA) often require commonsense and factual information beyond what can be learned from task-specific datasets. This paper investigates the injection of knowledge from general-purpose knowledge bases (KBs) into vision-and-language transformers. We use a…

    Submitted 15 January, 2021; originally announced January 2021.

  45. arXiv:2012.02950

    cs.LG cs.AI cs.CY

    Deep Depression Prediction on Longitudinal Data via Joint Anomaly Ranking and Classification

    Authors: Guansong Pang, Ngoc Thien Anh Pham, Emma Baker, Rebecca Bentley, Anton van den Hengel

    Abstract: A wide variety of methods have been developed for identifying depression, but they focus primarily on measuring the degree to which individuals are suffering from depression currently. In this work we explore the possibility of predicting future depression using machine learning applied to longitudinal socio-demographic data. In doing so we show that data such as housing status, and the details of…

    Submitted 20 March, 2022; v1 submitted 5 December, 2020; originally announced December 2020.

    Comments: Accepted to PAKDD 2022

  46. arXiv:2011.13328

    cs.CV

    DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution

    Authors: Tong He, Chunhua Shen, Anton van den Hengel

    Abstract: Previous top-performing approaches for point cloud instance segmentation involve a bottom-up strategy, which often includes inefficient operations or complex pipelines, such as grouping over-segmented components, introducing additional steps for refining, or designing complicated loss functions. The inevitable variation in the instance scales can lead bottom-up methods to become particularly sensi…

    Submitted 5 March, 2021; v1 submitted 26 November, 2020; originally announced November 2020.

    Comments: Appearing in IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2021

  47. arXiv:2009.06847

    cs.LG cs.AI cs.CR cs.CV stat.ML

    Toward Deep Supervised Anomaly Detection: Reinforcement Learning from Partially Labeled Anomaly Data

    Authors: Guansong Pang, Anton van den Hengel, Chunhua Shen, Longbing Cao

    Abstract: We consider the problem of anomaly detection with a small set of partially labeled anomaly examples and a large-scale unlabeled dataset. This is a common scenario in many important applications. Existing related methods either exclusively fit the limited anomaly examples that typically do not span the entire set of anomalies, or proceed with unsupervised learning from the unlabeled data. We propos…

    Submitted 10 June, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

    Comments: Accepted to KDD 2021

  48. arXiv:2007.14626

    cs.CL cs.CV

    Object-and-Action Aware Model for Visual Language Navigation

    Authors: Yuankai Qi, Zizheng Pan, Shengping Zhang, Anton van den Hengel, Qi Wu

    Abstract: Vision-and-Language Navigation (VLN) is unique in that it requires turning relatively general natural-language instructions into robot agent actions, on the basis of the visible environment. This requires to extract value from two very different types of natural-language information. The first is object description (e.g., 'table', 'door'), each presenting as a tip for the agent to determine the ne…

    Submitted 29 July, 2020; originally announced July 2020.

  49. arXiv:2007.02500

    cs.LG cs.CV stat.ML

    Deep Learning for Anomaly Detection: A Review

    Authors: Guansong Pang, Chunhua Shen, Longbing Cao, Anton van den Hengel

    Abstract: Anomaly detection, a.k.a. outlier detection or novelty detection, has been a lasting yet active research area in various research communities for several decades. There are still some unique problem complexities and challenges that require advanced approaches. In recent years, deep learning enabled anomaly detection, i.e., deep anomaly detection, has emerged as a critical direction. This paper sur…

    Submitted 4 December, 2020; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: Survey paper, 36 pages, 180 references, 2 figures, 4 tables

    Journal ref: ACM Computing Surveys, 2020

  50. arXiv:2006.00753

    cs.CV

    Structured Multimodal Attentions for TextVQA

    Authors: Chenyu Gao, Qi Zhu, Peng Wang, Hui Li, Yuliang Liu, Anton van den Hengel, Qi Wu

    Abstract: In this paper, we propose an end-to-end structured multimodal attention (SMA) neural network to mainly solve the first two issues above. SMA first uses a structural graph representation to encode the object-object, object-text and text-text relationships appearing in the image, and then designs a multimodal graph attention network to reason over it. Finally, the outputs from the above modules are…

    Submitted 25 November, 2021; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: Winner of TextVQA Challenge 2020; accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence