Showing 1–50 of 55 results for author: Savvides, M

Searching in archive cs.
  1. arXiv:2405.19854

    cs.CV

    RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection

    Authors: Fangyi Chen, Han Zhang, Zhantao Yang, Hao Chen, Kai Hu, Marios Savvides

    Abstract: Open-vocabulary object detection (OVD) requires solid modeling of the region-semantic relationship, which could be learned from massive region-text pairs. However, such data is limited in practice due to significant annotation costs. In this work, we propose RTGen to generate scalable open-vocabulary region-text pairs and demonstrate its capability to boost the performance of open-vocabulary objec…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Technical report

  2. arXiv:2304.07408

    cs.CV cs.LG

    Fairness in Visual Clustering: A Novel Transformer Clustering Approach

    Authors: Xuan-Bac Nguyen, Chi Nhan Duong, Marios Savvides, Kaushik Roy, Hugh Churchill, Khoa Luu

    Abstract: Promoting fairness for deep clustering models in unsupervised clustering settings to reduce demographic bias is a challenging goal. This is because of the limitation of large-scale balanced data with well-annotated labels for sensitive or protected attributes. In this paper, we first evaluate demographic bias in deep clustering models from the perspective of cluster purity, which is measured by th…

    Submitted 18 September, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

  3. arXiv:2301.10921

    cs.LG cs.AI cs.CV

    SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning

    Authors: Hao Chen, Ran Tao, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Bhiksha Raj, Marios Savvides

    Abstract: The critical challenge of Semi-Supervised Learning (SSL) is how to effectively leverage the limited labeled data and massive unlabeled data to improve the model's generalization performance. In this paper, we first revisit the popular pseudo-labeling methods via a unified sample weighting formulation and demonstrate the inherent quantity-quality trade-off problem of pseudo-labeling with thresholdi…

    Submitted 15 March, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted by ICLR 2023

  4. arXiv:2212.07593

    cs.CV

    Enhanced Training of Query-Based Object Detection via Selective Query Recollection

    Authors: Fangyi Chen, Han Zhang, Kai Hu, Yu-kai Huang, Chenchen Zhu, Marios Savvides

    Abstract: This paper investigates a phenomenon where query-based object detectors mispredict at the last decoding stage while predicting correctly at an intermediate stage. We review the training process and attribute the overlooked phenomenon to two limitations: lack of training emphasis and cascading errors from decoding sequence. We design and present Selective Query Recollection (SQR), a simple and effe…

    Submitted 21 March, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: CVPR2023

  5. arXiv:2211.11086

    cs.CV cs.AI cs.LG

    An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning

    Authors: Hao Chen, Yue Fan, Yidong Wang, Jindong Wang, Bernt Schiele, Xing Xie, Marios Savvides, Bhiksha Raj

    Abstract: Semi-supervised learning (SSL) has shown great promise in leveraging unlabeled data to improve model performance. While standard SSL assumes uniform data distribution, we consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data. Although there are existing endeavors to tackle this challenge, their perform…

    Submitted 18 January, 2024; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Issues in the paper, will re-open later

  6. arXiv:2209.04920

    cs.CV

    Vec2Face-v2: Unveil Human Faces from their Blackbox Features via Attention-based Network in Face Recognition

    Authors: Thanh-Dat Truong, Chi Nhan Duong, Ngan Le, Marios Savvides, Khoa Luu

    Abstract: In this work, we investigate the problem of face reconstruction given a facial feature representation extracted from a blackbox face recognition engine. Indeed, it is a very challenging problem in practice due to the limitations of abstracted information from the engine. We, therefore, introduce a new method named Attention-based Bijective Generative Adversarial Networks in a Distillation framewor…

    Submitted 1 September, 2023; v1 submitted 11 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2003.06958

  7. arXiv:2208.07463

    cs.CV cs.AI

    Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets

    Authors: Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Xiang Li, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides

    Abstract: While parameter efficient tuning (PET) methods have shown great potential with transformer architecture on Natural Language Processing (NLP) tasks, their effectiveness with large-scale ConvNets is still under-studied on Computer Vision (CV) tasks. This paper proposes Conv-Adapter, a PET module designed for ConvNets. Conv-Adapter is light-weight, domain-transferable, and architecture-agnostic with…

    Submitted 12 April, 2024; v1 submitted 15 August, 2022; originally announced August 2022.

  8. arXiv:2208.07204

    cs.LG cs.AI cs.CV

    USB: A Unified Semi-supervised Learning Benchmark for Classification

    Authors: Yidong Wang, Hao Chen, Yue Fan, Wang Sun, Ran Tao, Wenxin Hou, Renjie Wang, Linyi Yang, Zhi Zhou, Lan-Zhe Guo, Heli Qi, Zhen Wu, Yu-Feng Li, Satoshi Nakamura, Wei Ye, Marios Savvides, Bhiksha Raj, Takahiro Shinozaki, Bernt Schiele, Jindong Wang, Xing Xie, Yue Zhang

    Abstract: Semi-supervised learning (SSL) improves model generalization by leveraging massive unlabeled data to augment limited labeled samples. However, currently, popular SSL evaluation protocols are often constrained to computer vision (CV) tasks. In addition, previous work typically trains deep neural networks from scratch, which is time-consuming and environmentally unfriendly. To address the above issu…

    Submitted 13 October, 2022; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: Accepted by NeurIPS'22 dataset and benchmark track; code at https://github.com/microsoft/Semi-supervised-learning

  9. arXiv:2205.07246

    cs.LG cs.CV

    FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning

    Authors: Yidong Wang, Hao Chen, Qiang Heng, Wenxin Hou, Yue Fan, Zhen Wu, Jindong Wang, Marios Savvides, Takahiro Shinozaki, Bhiksha Raj, Bernt Schiele, Xing Xie

    Abstract: Semi-supervised Learning (SSL) has witnessed great success owing to the impressive performances brought by various methods based on pseudo labeling and consistency regularization. However, we argue that existing methods might fail to utilize the unlabeled data more effectively since they either use a pre-defined / fixed threshold or an ad-hoc threshold adjusting scheme, resulting in inferior perfo…

    Submitted 31 January, 2023; v1 submitted 15 May, 2022; originally announced May 2022.

    Comments: Accepted by ICLR 2023. Code: https://github.com/microsoft/Semi-supervised-learning

  10. arXiv:2204.03749

    cs.CV

    Powering Finetuning in Few-Shot Learning: Domain-Agnostic Bias Reduction with Selected Sampling

    Authors: Ran Tao, Han Zhang, Yutong Zheng, Marios Savvides

    Abstract: In recent works, utilizing a deep network trained on meta-training set serves as a strong baseline in few-shot learning. In this paper, we move forward to refine novel-class features by finetuning a trained deep network. Finetuning is designed to focus on reducing biases in novel-class feature distributions, which we define as two aspects: class-agnostic and class-specific biases. Class-agnostic b…

    Submitted 2 June, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: published in AAAI-22

  11. arXiv:2204.00298

    cs.CV

    Unitail: Detecting, Reading, and Matching in Retail Scene

    Authors: Fangyi Chen, Han Zhang, Zaiwang Li, Jiachen Dou, Shentong Mo, Hao Chen, Yongxin Zhang, Uzair Ahmed, Chenchen Zhu, Marios Savvides

    Abstract: To make full use of computer vision technology in stores, it is required to consider the actual needs that fit the characteristics of the retail scene. Pursuing this goal, we introduce the United Retail Datasets (Unitail), a large-scale benchmark of basic visual tasks on products that challenges algorithms for detecting, reading, and matching. With 1.8M quadrilateral-shaped instances annotated, th…

    Submitted 20 July, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: ECCV 2022

  12. arXiv:2108.11510

    cs.CV cs.AI

    Deep Reinforcement Learning in Computer Vision: A Comprehensive Survey

    Authors: Ngan Le, Vidhiwar Singh Rathour, Kashu Yamazaki, Khoa Luu, Marios Savvides

    Abstract: Deep reinforcement learning augments the reinforcement learning framework and utilizes the powerful representation of deep neural networks. Recent works have demonstrated the remarkable successes of deep reinforcement learning in various domains including finance, medicine, healthcare, video games, robotics, and computer vision. In this work, we provide a detailed review of recent and state-of-the…

    Submitted 25 August, 2021; originally announced August 2021.

  13. arXiv:2104.00676

    cs.LG cs.AI cs.CL cs.CV

    Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study

    Authors: Zhiqiang Shen, Zechun Liu, Dejia Xu, Zitian Chen, Kwang-Ting Cheng, Marios Savvides

    Abstract: This work aims to empirically clarify a recently discovered perspective that label smoothing is incompatible with knowledge distillation. We begin by introducing the motivation behind on how this incompatibility is raised, i.e., label smoothing erases relative information between teacher logits. We provide a novel connection on how label smoothing affects distributions of semantically similar and…

    Submitted 1 April, 2021; originally announced April 2021.

    Comments: ICLR 2021. Project page: http://zhiqiangshen.com/projects/LS_and_KD/index.html

  14. arXiv:2103.16605

    cs.CV

    Unsupervised Disentanglement of Linear-Encoded Facial Semantics

    Authors: Yutong Zheng, Yu-Kai Huang, Ran Tao, Zhiqiang Shen, Marios Savvides

    Abstract: We propose a method to disentangle linear-encoded facial semantics from StyleGAN without external supervision. The method derives from linear regression and sparse representation learning concepts to make the disentangled latent representations easily interpreted as well. We start by coupling StyleGAN with a stabilized 3D deformable facial reconstruction method to decompose single-view GAN generat…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: Accepted in IEEE Conference on Computer Vision and Pattern Recognition 2021 (CVPR2021)

  15. arXiv:2103.01903

    cs.CV

    Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

    Authors: Chenchen Zhu, Fangyi Chen, Uzair Ahmed, Zhiqiang Shen, Marios Savvides

    Abstract: Few-shot object detection is an imperative and long-lasting problem due to the inherent long-tail distribution of real-world data. Its performance is largely affected by the data scarcity of novel classes. But the semantic relation between the novel classes and the base classes is constant regardless of the data availability. In this work, we investigate utilizing this semantic relation together w…

    Submitted 19 March, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

    Comments: CVPR 2021

  16. arXiv:2102.08946

    cs.CV cs.AI cs.LG

    S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration

    Authors: Zhiqiang Shen, Zechun Liu, Jie Qin, Lei Huang, Kwang-Ting Cheng, Marios Savvides

    Abstract: Previous studies dominantly target at self-supervised learning on real-valued networks and have achieved many promising results. However, on the more challenging binary neural networks (BNNs), this task has not yet been fully explored in the community. In this paper, we focus on this more difficult scenario: learning networks where both weights and activations are binary, meanwhile, without any hu…

    Submitted 21 June, 2021; v1 submitted 17 February, 2021; originally announced February 2021.

    Comments: CVPR 2021 camera-ready version. Self-supervised binary neural networks using distillation loss (5.5–15% improvement over contrastive baseline). Code is available at https://github.com/szq0214/S2-BNN

  17. arXiv:2102.03983

    cs.CV cs.AI cs.LG

    Partial Is Better Than All: Revisiting Fine-tuning Strategy for Few-shot Learning

    Authors: Zhiqiang Shen, Zechun Liu, Jie Qin, Marios Savvides, Kwang-Ting Cheng

    Abstract: The goal of few-shot learning is to learn a classifier that can recognize unseen classes from limited support data with labels. A common practice for this task is to train a model on the base set first and then transfer to novel classes through fine-tuning (Here fine-tuning procedure is defined as transferring knowledge from base to novel data, i.e. learning to transfer in few-shot scenario.) or m…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: AAAI 2021. A search based fine-tuning strategy for few-shot learning

  18. arXiv:2012.02073

    cs.CV

    A Multi-task Contextual Atrous Residual Network for Brain Tumor Detection & Segmentation

    Authors: Ngan Le, Kashu Yamazaki, Dat Truong, Kha Gia Quach, Marios Savvides

    Abstract: In recent years, deep neural networks have achieved state-of-the-art performance in a variety of recognition and segmentation tasks in medical imaging including brain tumor segmentation. We investigate that segmenting a brain tumor is facing to the imbalanced data problem where the number of pixels belonging to the background class (non tumor pixel) is much larger than the number of pixels belongi…

    Submitted 3 December, 2020; originally announced December 2020.

    Comments: Accepted in ICPR 2020

  19. Online Ensemble Model Compression using Knowledge Distillation

    Authors: Devesh Walawalkar, Zhiqiang Shen, Marios Savvides

    Abstract: This paper presents a novel knowledge distillation based model compression framework consisting of a student ensemble. It enables distillation of simultaneously learnt ensemble knowledge onto each of the compressed student models. Each model learns unique representations from the data distribution due to its distinct architecture. This helps the ensemble generalize better by combining every model'…

    Submitted 14 November, 2020; originally announced November 2020.

  20. arXiv:2009.08453

    cs.CV cs.AI cs.LG

    MEAL V2: Boosting Vanilla ResNet-50 to 80%+ Top-1 Accuracy on ImageNet without Tricks

    Authors: Zhiqiang Shen, Marios Savvides

    Abstract: We introduce a simple yet effective distillation framework that is able to boost the vanilla ResNet-50 to 80%+ Top-1 accuracy on ImageNet without tricks. We construct such a framework through analyzing the problems in the existing classification system and simplify the base method ensemble knowledge distillation via discriminators by: (1) adopting the similarity loss and discriminator only on the…

    Submitted 19 March, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

    Comments: 12 pages. Code and trained models are available at: https://github.com/szq0214/MEAL-V2

  21. arXiv:2005.06305

    cs.CV cs.LG cs.NE

    Binarizing MobileNet via Evolution-based Searching

    Authors: Hai Phan, Zechun Liu, Dang Huynh, Marios Savvides, Kwang-Ting Cheng, Zhiqiang Shen

    Abstract: Binary Neural Networks (BNNs), known to be one among the effectively compact network architectures, have achieved great outcomes in the visual tasks. Designing efficient binary architectures is not trivial due to the binary nature of the network. In this paper, we propose a use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet, a compact network wi…

    Submitted 15 May, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

    Comments: Accepted by CVPR2020

  22. arXiv:2003.13048

    cs.CV

    Attentive CutMix: An Enhanced Data Augmentation Approach for Deep Learning Based Image Classification

    Authors: Devesh Walawalkar, Zhiqiang Shen, Zechun Liu, Marios Savvides

    Abstract: Convolutional neural networks (CNN) are capable of learning robust representation with different regularization methods and activations as convolutional layers are spatially correlated. Based on this property, a large variety of regional dropout strategies have been proposed, such as Cutout, DropBlock, CutMix, etc. These methods aim to promote the network to generalize better by partially occludin…

    Submitted 5 April, 2020; v1 submitted 29 March, 2020; originally announced March 2020.

  23. arXiv:2003.05438

    cs.CV cs.LG eess.IV

    Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning

    Authors: Zhiqiang Shen, Zechun Liu, Zhuang Liu, Marios Savvides, Trevor Darrell, Eric Xing

    Abstract: The recently advanced unsupervised learning approaches use the siamese-like framework to compare two "views" from the same image for learning representations. Making the two views distinctive is a core to guarantee that unsupervised methods can learn meaningful information. However, such frameworks are sometimes fragile on overfitting if the augmentations used for generating two views are not stro…

    Submitted 17 February, 2022; v1 submitted 11 March, 2020; originally announced March 2020.

    Comments: AAAI 2022 camera ready version with Appendix (add a formula example with InfoNCE). Code is available at: https://github.com/szq0214/Un-Mix

  24. arXiv:2003.03488

    cs.CV cs.LG eess.IV

    ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions

    Authors: Zechun Liu, Zhiqiang Shen, Marios Savvides, Kwang-Ting Cheng

    Abstract: In this paper, we propose several ideas for enhancing a binary network to close its accuracy gap from real-valued networks without incurring any additional computational cost. We first construct a baseline network by modifying and binarizing a compact real-valued network with parameter-free shortcuts, bypassing all the intermediate convolutional layers including the downsampling layers. This basel…

    Submitted 12 July, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: Accepted to ECCV 2020. Code is available at: https://github.com/liuzechun/ReActNet

  25. arXiv:2002.05274

    cs.CV cs.LG

    Solving Missing-Annotation Object Detection with Background Recalibration Loss

    Authors: Han Zhang, Fangyi Chen, Zhiqiang Shen, Qiqi Hao, Chenchen Zhu, Marios Savvides

    Abstract: This paper focuses on a novel and challenging detection scenario: A majority of true objects/instances is unlabeled in the datasets, so these missing-labeled areas will be regarded as the background during training. Previous art on this problem has proposed to use soft sampling to re-weight the gradients of RoIs based on the overlaps with positive instances, while their method is mainly based on t…

    Submitted 3 August, 2020; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: 5 pages. Paper has been accepted by ICASSP 2020 for presentation in a lecture (oral) session

    MSC Class: 68T45 (Primary)

  26. arXiv:1911.12448

    cs.CV

    Soft Anchor-Point Object Detection

    Authors: Chenchen Zhu, Fangyi Chen, Zhiqiang Shen, Marios Savvides

    Abstract: Recently, anchor-free detection methods have been through great progress. The major two families, anchor-point detection and key-point detection, are at opposite edges of the speed-accuracy trade-off, with anchor-point detectors having the speed advantage. In this work, we boost the performance of the anchor-point detector over the key-point counterparts while maintaining the speed advantage. To a…

    Submitted 6 July, 2020; v1 submitted 27 November, 2019; originally announced November 2019.

    Comments: ECCV 2020

  27. arXiv:1911.10594

    cs.LG cs.CV stat.ML

    Towards a Hypothesis on Visual Transformation based Self-Supervision

    Authors: Dipan K. Pal, Sreena Nallamothu, Marios Savvides

    Abstract: We propose the first qualitative hypothesis characterizing the behavior of visual transformation based self-supervision, called the VTSS hypothesis. Given a dataset upon which a self-supervised task is performed while predicting instantiations of a transformation, the hypothesis states that if the predicted instantiations of the transformations are already present in the dataset, then the represen…

    Submitted 13 February, 2020; v1 submitted 24 November, 2019; originally announced November 2019.

    Comments: Draft

  28. arXiv:1911.05266

    cs.LG cs.CV cs.NE stat.ML

    Learning Non-Parametric Invariances from Data with Permanent Random Connectomes

    Authors: Dipan K. Pal, Akshay Chawla, Marios Savvides

    Abstract: One of the fundamental problems in supervised classification and in machine learning in general, is the modelling of non-parametric invariances that exist in data. Most prior art has focused on enforcing priors in the form of invariances to parametric nuisance transformations that are expected to be present in data. Learning non-parametric invariances directly from data remains an important open p…

    Submitted 13 August, 2020; v1 submitted 12 November, 2019; originally announced November 2019.

    Comments: Preprint (accepted at NeurIPS SVRHM 2019 Workshop)

  29. arXiv:1911.02559

    cs.CV cs.LG

    SCL: Towards Accurate Domain Adaptive Object Detection via Gradient Detach Based Stacked Complementary Losses

    Authors: Zhiqiang Shen, Harsh Maheshwari, Weichen Yao, Marios Savvides

    Abstract: Unsupervised domain adaptive object detection aims to learn a robust detector in the domain shift circumstance, where the training (source) domain is label-rich with bounding box annotations, while the testing (target) domain is label-agnostic and the feature distributions between training and testing domains are dissimilar or even totally different. In this paper, we propose a gradient detach bas…

    Submitted 21 November, 2019; v1 submitted 6 November, 2019; originally announced November 2019.

  30. arXiv:1908.08520

    cs.CV cs.LG

    Adversarial-Based Knowledge Distillation for Multi-Model Ensemble and Noisy Data Refinement

    Authors: Zhiqiang Shen, Zhankui He, Wanyun Cui, Jiahui Yu, Yutong Zheng, Chenchen Zhu, Marios Savvides

    Abstract: Generic Image recognition is a fundamental and fairly important visual problem in computer vision. One of the major challenges of this task lies in the fact that single image usually has multiple objects inside while the labels are still one-hot, another one is noisy and sometimes missing labels when annotated by humans. In this paper, we focus on tackling these challenges accompanying with two di…

    Submitted 22 August, 2019; originally announced August 2019.

    Comments: This is an extended version of our previous conference paper arXiv:1812.02425

  31. arXiv:1907.12629

    cs.CV cs.LG

    MoBiNet: A Mobile Binary Network for Image Classification

    Authors: Hai Phan, Dang Huynh, Yihui He, Marios Savvides, Zhiqiang Shen

    Abstract: MobileNet and Binary Neural Networks are two among the most widely used techniques to construct deep learning models for performing a variety of tasks on mobile and embedded platforms. In this paper, we present a simple yet efficient scheme to exploit MobileNet binarization at activation function and model weights. However, training a binary network from scratch with separable depth-wise and point-…

    Submitted 30 July, 2019; v1 submitted 29 July, 2019; originally announced July 2019.

  32. arXiv:1903.07154

    cs.CV cs.LG

    Proximal Splitting Networks for Image Restoration

    Authors: Raied Aljadaany, Dipan K. Pal, Marios Savvides

    Abstract: Image restoration problems are typically ill-posed requiring the design of suitable priors. These priors are typically hand-designed and are fully instantiated throughout the process. In this paper, we introduce a novel framework for handling inverse problems related to image restoration based on elements from the half quadratic splitting method and proximal operators. Modeling the proximal operat…

    Submitted 17 March, 2019; originally announced March 2019.

  33. arXiv:1903.00621

    cs.CV

    Feature Selective Anchor-Free Module for Single-Shot Object Detection

    Authors: Chenchen Zhu, Yihui He, Marios Savvides

    Abstract: We motivate and present feature selective anchor-free (FSAF) module, a simple and effective building block for single-shot object detectors. It can be plugged into single-shot detectors with feature pyramid structure. The FSAF module addresses two limitations brought up by the conventional anchor-based detection: 1) heuristic-guided feature selection; 2) overlap-based anchor sampling. The general…

    Submitted 1 March, 2019; originally announced March 2019.

    Comments: CVPR 2019

  34. arXiv:1812.08196

    cs.CV

    RankGAN: A Maximum Margin Ranking GAN for Generating Faces

    Authors: Rahul Dey, Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides

    Abstract: We present a new stage-wise learning paradigm for training generative adversarial networks (GANs). The goal of our work is to progressively strengthen the discriminator and thus, the generators, with each subsequent stage without changing the network architecture. We call this proposed method the RankGAN. We first propose a margin-based loss for the GAN discriminator. We then extend it to a margin…

    Submitted 19 December, 2018; originally announced December 2018.

    Comments: Best Student Paper Award at Asian Conference on Computer Vision (ACCV), 2018 at Perth, Australia. Includes main paper and supplementary material. Total 32 pages including references

  35. arXiv:1810.04752

    cs.CV

    Deep Recurrent Level Set for Segmenting Brain Tumors

    Authors: T. Hoang Ngan Le, Raajitha Gummadi, Marios Savvides

    Abstract: Variational Level Set (VLS) has been a widely used method in medical segmentation. However, segmentation accuracy in the VLS method dramatically decreases when dealing with intervening factors such as lighting, shadows, colors, etc. Additionally, results are quite sensitive to initial settings and are highly dependent on the number of iterations. In order to address these limitations, the proposed…

    Submitted 10 October, 2018; originally announced October 2018.

    Journal ref: Medical Image Computing and Computer Assisted Intervention (MICCAI 2018), Springer International Publishing, 2018

  36. arXiv:1809.08545

    cs.CV

    Bounding Box Regression with Uncertainty for Accurate Object Detection

    Authors: Yihui He, Chenchen Zhu, Jianren Wang, Marios Savvides, Xiangyu Zhang

    Abstract: Large-scale object detection datasets (e.g., MS-COCO) try to define the ground truth bounding boxes as clear as possible. However, we observe that ambiguities are still introduced when labeling the bounding boxes. In this paper, we propose a novel bounding box regression loss for learning bounding box transformation and localization variance together. Our loss greatly improves the localization acc…

    Submitted 16 April, 2019; v1 submitted 23 September, 2018; originally announced September 2018.

    Comments: CVPR 2019

  37. arXiv:1806.01817

    cs.CV cs.LG

    Perturbative Neural Networks

    Authors: Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides

    Abstract: Convolutional neural networks are witnessing wide adoption in computer vision systems with numerous applications across a range of visual recognition tasks. Much of this progress is fueled through advances in convolutional neural network architectures and learning algorithms even as the basic premise of a convolutional layer has remained unchanged. In this paper, we seek to revisit the convolution…

    Submitted 5 June, 2018; originally announced June 2018.

    Comments: To appear in CVPR 2018. http://xujuefei.com/pnn.html

  38. arXiv:1803.00130

    cs.CV

    Ring loss: Convex Feature Normalization for Face Recognition

    Authors: Yutong Zheng, Dipan K. Pal, Marios Savvides

    Abstract: We motivate and present Ring loss, a simple and elegant feature normalization approach for deep networks designed to augment standard loss functions such as Softmax. We argue that deep feature normalization is an important aspect of supervised classification problems where we require the model to represent each class in a multi-class problem equally well. The direct approach to feature normalizati…

    Submitted 28 February, 2018; originally announced March 2018.

    Comments: Accepted at CVPR 2018

  39. arXiv:1802.09058

    cs.CV

    Seeing Small Faces from Robust Anchor's Perspective

    Authors: Chenchen Zhu, Ran Tao, Khoa Luu, Marios Savvides

    Abstract: This paper introduces a novel anchor design to support anchor-based face detection for superior scale-invariant performance, especially on tiny faces. To achieve this, we explicitly address the problem that anchor-based detectors drop performance drastically on faces with tiny sizes, e.g. less than 16x16 pixels. In this paper, we investigate why this is the case. We discover that current anchor de…

    Submitted 25 February, 2018; originally announced February 2018.

    Comments: CVPR 2018

  40. arXiv:1801.04520

    cs.CV cs.AI cs.LG

    Non-Parametric Transformation Networks

    Authors: Dipan K. Pal, Marios Savvides

    Abstract: ConvNets, through their architecture, only enforce invariance to translation. In this paper, we introduce a new class of deep convolutional architectures called Non-Parametric Transformation Networks (NPTNs) which can learn general invariances and symmetries directly from data. NPTNs are a natural generalization of ConvNets and can be optimized directly using gradient descent. Unlike almo…

    Submitted 8 September, 2018; v1 submitted 14 January, 2018; originally announced January 2018.

    Comments: Preprint only

  41. arXiv:1712.00886

    cs.CV

    Improving Object Detection from Scratch via Gated Feature Reuse

    Authors: Zhiqiang Shen, Honghui Shi, Jiahui Yu, Hai Phan, Rogerio Feris, Liangliang Cao, Ding Liu, Xinchao Wang, Thomas Huang, Marios Savvides

    Abstract: In this paper, we present a simple and parameter-efficient drop-in module for one-stage object detectors like SSD when learning from scratch (i.e., without pre-trained models). We call our module GFR (Gated Feature Reuse), which exhibits two main advantages. First, we introduce a novel gate-controlled prediction strategy enabled by Squeeze-and-Excitation to adaptively enhance or attenuate supervis…

    Submitted 7 July, 2019; v1 submitted 3 December, 2017; originally announced December 2017.

    Comments: Accepted in BMVC 2019. Code: https://github.com/szq0214/GFR-DSOD

  42. arXiv:1711.10520  [pdf, other]

    cs.CV

    Learning from Longitudinal Face Demonstration - Where Tractable Deep Modeling Meets Inverse Reinforcement Learning

    Authors: Chi Nhan Duong, Kha Gia Quach, Khoa Luu, T. Hoang Ngan Le, Marios Savvides, Tien D. Bui

    Abstract: This paper presents a novel Subject-dependent Deep Aging Path (SDAP), which inherits the merits of both Generative Probabilistic Modeling and Inverse Reinforcement Learning to model the facial structures and the longitudinal face aging process of a given subject. The proposed SDAP is optimized using tractable log-likelihood objective functions with Convolutional Neural Networks (CNNs) based deep f…

    Submitted 2 February, 2019; v1 submitted 28 November, 2017; originally announced November 2017.

  43. arXiv:1710.09685  [pdf, other]

    cs.CV

    Class Correlation affects Single Object Localization using Pre-trained ConvNets

    Authors: Pokkalla Harsha Vardhan, Kunal Sekhri, Dipan K. Pal, Marios Savvides

    Abstract: The problem of object localization has become one of the mainstream problems of vision. Most of the algorithms proposed involve the design for the model to be specifically for localizing objects. In this paper, we explore whether a pre-trained canonical ConvNet (without fine-tuning) trained purely for object classification on one dataset with global image level labels can be used to localize objec…

    Submitted 27 October, 2017; v1 submitted 26 October, 2017; originally announced October 2017.

  44. arXiv:1710.08585  [pdf, other]

    cs.LG cs.AI cs.CV

    Max-Margin Invariant Features from Transformed Unlabeled Data

    Authors: Dipan K. Pal, Ashwin A. Kannan, Gautam Arakalgud, Marios Savvides

    Abstract: The study of representations invariant to common transformations of the data is important to learning. Most techniques have focused on local approximate invariance implemented within expensive optimization frameworks lacking explicit theoretical guarantees. In this paper, we study kernels that are invariant to a unitary group while having theoretical guarantees in addressing the important practica…

    Submitted 23 October, 2017; originally announced October 2017.

    Comments: Accepted at NIPS 2017

  45. arXiv:1707.05653  [pdf, other]

    cs.CV

    Faster Than Real-time Facial Alignment: A 3D Spatial Transformer Network Approach in Unconstrained Poses

    Authors: Chandrasekhar Bhagavatula, Chenchen Zhu, Khoa Luu, Marios Savvides

    Abstract: Facial alignment involves finding a set of landmark points on an image with a known semantic meaning. However, this semantic meaning of landmark points is often lost in 2D approaches where landmarks are either moved to visible boundaries or ignored as the pose of the face changes. In order to extract consistent alignment points across large poses, the 3D structure of the face must be considered in…

    Submitted 8 September, 2017; v1 submitted 18 July, 2017; originally announced July 2017.

    Comments: International Conference on Computer Vision (ICCV) 2017

  46. arXiv:1704.04865  [pdf, other]

    cs.CV cs.LG

    Gang of GANs: Generative Adversarial Networks with Maximum Margin Ranking

    Authors: Felix Juefei-Xu, Vishnu Naresh Boddeti, Marios Savvides

    Abstract: Traditional generative adversarial networks (GAN) and many of its variants are trained by minimizing the KL or JS-divergence loss that measures how close the generated data distribution is from the true data distribution. A recent advance called the WGAN based on Wasserstein distance can improve on the KL and JS-divergence based GANs, and alleviate the gradient vanishing, instability, and mode col…

    Submitted 17 April, 2017; originally announced April 2017.

    Comments: 16 pages. 11 figures

  47. arXiv:1704.03594  [pdf, other]

    cs.CV

    Deep Contextual Recurrent Residual Networks for Scene Labeling

    Authors: T. Hoang Ngan Le, Chi Nhan Duong, Ligong Han, Khoa Luu, Marios Savvides, Dipan Pal

    Abstract: Designed as extremely deep architectures, deep residual networks which provide a rich visual representation and offer robust convergence behaviors have recently achieved exceptional performance in numerous computer vision problems. Being directly applied to a scene labeling problem, however, they were limited to capture long-range contextual dependence, which is a critical aspect. To address this…

    Submitted 11 April, 2017; originally announced April 2017.

  48. arXiv:1704.03593  [pdf, other]

    cs.CV

    Reformulating Level Sets as Deep Recurrent Neural Network Approach to Semantic Segmentation

    Authors: Ngan Le, Kha Gia Quach, Khoa Luu, Marios Savvides, Chenchen Zhu

    Abstract: Variational Level Set (LS) has been a widely used method in medical segmentation. However, it is limited when dealing with multi-instance objects in the real world. In addition, its segmentation results are quite sensitive to initial settings and highly depend on the number of iterations. To address these issues and boost the classic variational LS methods to a new level of the learnable deep lear…

    Submitted 11 April, 2017; originally announced April 2017.

    Comments: 10 pages, 6 figures

  49. arXiv:1703.08617  [pdf, other]

    cs.CV

    Temporal Non-Volume Preserving Approach to Facial Age-Progression and Age-Invariant Face Recognition

    Authors: Chi Nhan Duong, Kha Gia Quach, Khoa Luu, T. Hoang Ngan Le, Marios Savvides

    Abstract: Modeling the long-term facial aging process is extremely challenging due to the presence of large and non-linear variations during the face development stages. In order to efficiently address the problem, this work first decomposes the aging process into multiple short-term stages. Then, a novel generative probabilistic model, named Temporal Non-Volume Preserving (TNVP) transformation, is presente…

    Submitted 24 March, 2017; originally announced March 2017.

  50. arXiv:1702.07664  [pdf, other]

    cs.CV cs.LG

    How ConvNets model Non-linear Transformations

    Authors: Dipan K. Pal, Marios Savvides

    Abstract: In this paper, we theoretically address three fundamental problems involving deep convolutional networks regarding invariance, depth and hierarchy. We introduce the paradigm of Transformation Networks (TN) which are a direct generalization of Convolutional Networks (ConvNets). Theoretically, we show that TNs (and thereby ConvNets) can be invariant to non-linear transformations of the input des…

    Submitted 24 February, 2017; originally announced February 2017.