Showing 1–50 of 91 results for author: Kot, A

  1. arXiv:2406.17349  [pdf, other]

    cs.CR cs.CV

    Semantic Deep Hiding for Robust Unlearnable Examples

    Authors: Ruohan Meng, Chenyu Yi, Yi Yu, Siyuan Yang, Bingquan Shen, Alex C. Kot

    Abstract: Ensuring data privacy and protection has become paramount in the era of deep learning. Unlearnable examples are proposed to mislead the deep learning models and prevent data from unauthorized exploration by adding small perturbations to data. However, such perturbations (e.g., noise, texture, color change) predominantly impact low-level features, making them vulnerable to common countermeasures. I…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Accepted by TIFS 2024

  2. arXiv:2406.13227  [pdf, other]

    cs.CV

    Controllable and Gradual Facial Blemishes Retouching via Physics-Based Modelling

    Authors: Chenhao Shuai, Rizhao Cai, Bandara Dissanayake, Amanda Newman, Dayan Guan, Dennis Sng, Ling Li, Alex Kot

    Abstract: Face retouching aims to remove facial blemishes, such as pigmentation and acne, and still retain fine-grain texture details. Nevertheless, existing methods just remove the blemishes but focus little on realism of the intermediate process, limiting their use more to beautifying facial images on social media rather than being effective tools for simulating changes in facial pigmentation and acne. Mo…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 7 pages, 6 figures. The paper has been accepted by the IEEE Conference on Multimedia and Expo 2024

  3. arXiv:2406.09121  [pdf, other]

    cs.CV

    MMRel: A Relation Understanding Dataset and Benchmark in the MLLM Era

    Authors: Jiahao Nie, Gongjie Zhang, Wenbin An, Yap-Peng Tan, Alex C. Kot, Shijian Lu

    Abstract: Despite the recent advancements in Multi-modal Large Language Models (MLLMs), understanding inter-object relations, i.e., interactions or associations between distinct objects, remains a major challenge for such models. This issue significantly hinders their advanced reasoning capabilities and is primarily due to the lack of large-scale, high-quality, and diverse multi-modal data essential for tra…

    Submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2406.08300  [pdf, other]

    eess.IV cs.CV

    From Chaos to Clarity: 3DGS in the Dark

    Authors: Zhihao Li, Yufei Wang, Alex Kot, Bihan Wen

    Abstract: Novel view synthesis from raw images provides superior high dynamic range (HDR) information compared to reconstructions from low dynamic range RGB images. However, the inherent noise in unprocessed raw images compromises the accuracy of 3D scene representation. Our study reveals that 3D Gaussian Splatting (3DGS) is particularly susceptible to this noise, leading to numerous elongated Gaussian shap…

    Submitted 12 June, 2024; originally announced June 2024.

  5. arXiv:2405.20721  [pdf, other]

    cs.CV cs.AI

    ContextGS: Compact 3D Gaussian Splatting with Anchor Level Context Model

    Authors: Yufei Wang, Zhihao Li, Lanqing Guo, Wenhan Yang, Alex C. Kot, Bihan Wen

    Abstract: Recently, 3D Gaussian Splatting (3DGS) has become a promising framework for novel view synthesis, offering fast rendering speeds and high fidelity. However, the large number of Gaussians and their associated attributes require effective compression techniques. Existing methods primarily compress neural Gaussians individually and independently, i.e., coding all the neural Gaussians at the same time…

    Submitted 31 May, 2024; originally announced May 2024.

  6. arXiv:2405.11852  [pdf, other]

    cs.CV

    Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models

    Authors: Xiyu Wang, Yufei Wang, Satoshi Tsutsui, Weisi Lin, Bihan Wen, Alex C. Kot

    Abstract: Diffusion-based models for story visualization have shown promise in generating content-coherent images for storytelling tasks. However, how to effectively integrate new characters into existing narratives while maintaining character consistency remains an open problem, particularly with limited data. Two major limitations hinder the progress: (1) the absence of a suitable benchmark due to potenti…

    Submitted 20 May, 2024; originally announced May 2024.

  7. arXiv:2405.09487  [pdf, other]

    cs.CV

    Color Space Learning for Cross-Color Person Re-Identification

    Authors: Jiahao Nie, Shan Lin, Alex C. Kot

    Abstract: The primary color profile of the same identity is assumed to remain consistent in typical Person Re-identification (Person ReID) tasks. However, this assumption may be invalid in real-world situations and images hold variant color profiles, because of cross-modality cameras or identity with different clothing. To address this issue, we propose Color Space Learning (CSL) for those Cross-Color Perso…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: Accepted by ICME 2024 (Oral)

  8. arXiv:2405.06995  [pdf, other]

    cs.SD cs.CV cs.MM eess.AS

    Benchmarking Cross-Domain Audio-Visual Deception Detection

    Authors: Xiaobao Guo, Zitong Yu, Nithish Muthuchamy Selvaraj, Bingquan Shen, Adams Wai-Kin Kong, Alex C. Kot

    Abstract: Automated deception detection is crucial for assisting humans in accurately assessing truthfulness and identifying deceptive behavior. Conventional contact-based techniques, like polygraph devices, rely on physiological signals to determine the authenticity of an individual's statements. Nevertheless, recent developments in automated deception detection have demonstrated that multimodal features d…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 10 pages

  9. arXiv:2405.01825  [pdf, other]

    cs.CV

    Improving Concept Alignment in Vision-Language Concept Bottleneck Models

    Authors: Nithish Muthuchamy Selvaraj, Xiaobao Guo, Bingquan Shen, Adams Wai-Kin Kong, Alex Kot

    Abstract: Concept Bottleneck Models (CBM) map the input image to a high-level human-understandable concept space and then make class predictions based on these concepts. Recent approaches automate the construction of CBM by prompting Large Language Models (LLM) to generate text concepts and then use Vision Language Models (VLM) to obtain concept scores to train a CBM. However, it is desired to build CBMs wi…

    Submitted 2 May, 2024; originally announced May 2024.

  10. arXiv:2405.01460  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders

    Authors: Yi Yu, Yufei Wang, Song Xia, Wenhan Yang, Shijian Lu, Yap-Peng Tan, Alex C. Kot

    Abstract: Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled. Defenses against these poisoning attacks can be categorized based on whether specific interventions are adopted during training. The first approach is training-time defense, such as adversarial training, which can mitigate poisoning effects but is computationall…

    Submitted 6 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  11. arXiv:2404.13576  [pdf, other]

    cs.CV cs.LG

    I2CANSAY: Inter-Class Analogical Augmentation and Intra-Class Significance Analysis for Non-Exemplar Online Task-Free Continual Learning

    Authors: Songlin Dong, Yingjie Chen, Yuhang He, Yuhan Jin, Alex C. Kot, Yihong Gong

    Abstract: Online task-free continual learning (OTFCL) is a more challenging variant of continual learning which emphasizes the gradual shift of task boundaries and learns in an online mode. Existing methods rely on a memory buffer composed of old samples to prevent forgetting. However, the use of memory buffers not only raises privacy concerns but also hinders the efficient learning of new samples. To addres…

    Submitted 21 April, 2024; originally announced April 2024.

  12. arXiv:2404.08452  [pdf, other]

    cs.CV

    MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection

    Authors: Chenqi Kong, Anwei Luo, Peijun Bao, Yi Yu, Haoliang Li, Zengwei Zheng, Shiqi Wang, Alex C. Kot

    Abstract: Deepfakes have recently raised significant trust issues and security concerns among the public. Compared to CNN face forgery detectors, ViT-based methods take advantage of the expressivity of transformers, achieving superior detection performance. However, these approaches still exhibit the following limitations: (1) Fully fine-tuning ViT-based models from ImageNet weights demands substantial comp…

    Submitted 7 June, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

  13. arXiv:2403.14250  [pdf, other]

    eess.IV cs.CR cs.CV

    Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations

    Authors: Xun Lin, Yi Yu, Song Xia, Jue Jiang, Haoran Wang, Zitong Yu, Yizhong Liu, Ying Fu, Shuai Wang, Wenzhong Tang, Alex Kot

    Abstract: The widespread availability of publicly accessible medical images has significantly propelled advancements in various research and clinical fields. Nonetheless, concerns regarding unauthorized training of AI systems for commercial purposes and the duties of patient privacy protection have led numerous institutions to hesitate to share their images. This is particularly true for medical image segme…

    Submitted 21 March, 2024; originally announced March 2024.

  14. arXiv:2402.19298  [pdf, other]

    cs.CV

    Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing

    Authors: Xun Lin, Shuai Wang, Rizhao Cai, Yizhong Liu, Ying Fu, Zitong Yu, Wenzhong Tang, Alex Kot

    Abstract: Face Anti-Spoofing (FAS) is crucial for securing face recognition systems against presentation attacks. With advancements in sensor manufacture and multi-modal learning techniques, many multi-modal FAS approaches have emerged. However, they face challenges in generalizing to unseen attacks and deployment conditions. These challenges arise from (1) modality unreliability, where some modality sensor…

    Submitted 5 March, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted by CVPR 2024

  15. arXiv:2401.08407  [pdf, other]

    cs.CV

    Cross-Domain Few-Shot Segmentation via Iterative Support-Query Correspondence Mining

    Authors: Jiahao Nie, Yun Xing, Gongjie Zhang, Pei Yan, Aoran Xiao, Yap-Peng Tan, Alex C. Kot, Shijian Lu

    Abstract: Cross-Domain Few-Shot Segmentation (CD-FSS) poses the challenge of segmenting novel categories from a distinct domain using only limited exemplars. In this paper, we undertake a comprehensive study of CD-FSS and uncover two crucial insights: (i) the necessity of a fine-tuning stage to effectively transfer the learned meta-knowledge across domains, and (ii) the overfitting risk during the naïve fin…

    Submitted 13 March, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted by CVPR 2024

  16. arXiv:2401.07245  [pdf, other]

    cs.CV

    MIMIC: Mask Image Pre-training with Mix Contrastive Fine-tuning for Facial Expression Recognition

    Authors: Fan Zhang, Xiaobao Guo, Xiaojiang Peng, Alex Kot

    Abstract: Cutting-edge research in facial expression recognition (FER) currently favors the utilization of convolutional neural networks (CNNs) backbone which is supervisedly pre-trained on face recognition datasets for feature extraction. However, due to the vast scale of face recognition datasets and the high cost associated with collecting facial labels, this pre-training paradigm incurs significant expe…

    Submitted 14 January, 2024; originally announced January 2024.

  17. arXiv:2312.15490   

    cs.IR cs.AI

    Diffusion-EXR: Controllable Review Generation for Explainable Recommendation via Diffusion Models

    Authors: Ling Li, Shaohua Li, Winda Marantika, Alex C. Kot, Huijing Zhan

    Abstract: Denoising Diffusion Probabilistic Model (DDPM) has shown great competence in image and audio generation tasks. However, there exist few attempts to employ DDPM in the text generation, especially review generation under recommendation systems. Fueled by the predicted reviews explainability that justifies recommendations could assist users better understand the recommended items and increase the tra…

    Submitted 18 June, 2024; v1 submitted 24 December, 2023; originally announced December 2023.

    Comments: I request to withdraw my paper due to the discovery of significant errors in terms of experimental results in the manuscript that affect the validity of the paper. These errors are necessary to correct, and the current version should not be used or cited in its present form

  18. arXiv:2312.02896  [pdf, other]

    cs.CV

    BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models

    Authors: Rizhao Cai, Zirui Song, Dayan Guan, Zhenhao Chen, Xing Luo, Chenyu Yi, Alex Kot

    Abstract: Large Multimodal Models (LMMs) such as GPT-4V and LLaVA have shown remarkable capabilities in visual reasoning with common image styles. However, their robustness against diverse style shifts, crucial for practical applications, remains largely unexplored. In this paper, we propose a new benchmark, BenchLMM, to assess the robustness of LMMs against three different styles: artistic image style, ima…

    Submitted 5 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: Code is available at https://github.com/AIFEG/BenchLMM

  19. arXiv:2311.14760  [pdf, other]

    cs.CV

    SinSR: Diffusion-Based Image Super-Resolution in a Single Step

    Authors: Yufei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C. Kot, Bihan Wen

    Abstract: While super-resolution (SR) methods based on diffusion models exhibit promising results, their practical application is hindered by the substantial number of required inference steps. Recent methods utilize degraded images in the initial state, thereby shortening the Markov chain. Nevertheless, these solutions either rely on a precise formulation of the degradation process or still necessitate a r…

    Submitted 23 November, 2023; originally announced November 2023.

  20. arXiv:2310.00234  [pdf, other]

    cs.CR cs.CV eess.IV

    Pixel-Inconsistency Modeling for Image Manipulation Localization

    Authors: Chenqi Kong, Anwei Luo, Shiqi Wang, Haoliang Li, Anderson Rocha, Alex C. Kot

    Abstract: Digital image forensics plays a crucial role in image authentication and manipulation localization. Despite the progress powered by deep neural networks, existing forgery localization methodologies exhibit limitations when deployed to unseen datasets and perturbed images (i.e., lack of generalization and robustness to real-world applications). To circumvent these problems and aid image integrity,…

    Submitted 29 September, 2023; originally announced October 2023.

  21. arXiv:2309.11092  [pdf, other]

    cs.CV cs.MM

    Forgery-aware Adaptive Vision Transformer for Face Forgery Detection

    Authors: Anwei Luo, Rizhao Cai, Chenqi Kong, Xiangui Kang, Jiwu Huang, Alex C. Kot

    Abstract: With the advancement in face manipulation technologies, the importance of face forgery detection in protecting authentication integrity becomes increasingly evident. Previous Vision Transformer (ViT)-based detectors have demonstrated subpar performance in cross-database evaluations, primarily because fully fine-tuning with limited Deepfake data often leads to forgetting pre-trained knowledge and o…

    Submitted 20 September, 2023; originally announced September 2023.

  22. arXiv:2309.04038  [pdf, other]

    cs.CV

    S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens

    Authors: Rizhao Cai, Zitong Yu, Chenqi Kong, Haoliang Li, Changsheng Chen, Yongjian Hu, Alex Kot

    Abstract: Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face recognition system by presenting spoofed faces. State-of-the-art FAS techniques predominantly rely on deep learning models but their cross-domain generalization capabilities are often hindered by the domain shift problem, which arises due to different distributions between training and testing data. In this study, we devel…

    Submitted 19 June, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted by IEEE Transactions on Information Forensics and Security (June 2024)

  23. arXiv:2308.09107  [pdf, other]

    cs.CV

    Hyperbolic Face Anti-Spoofing

    Authors: Shuangpeng Han, Rizhao Cai, Yawen Cui, Zitong Yu, Yongjian Hu, Alex Kot

    Abstract: Learning generalized face anti-spoofing (FAS) models against presentation attacks is essential for the security of face recognition systems. Previous FAS methods usually encourage models to extract discriminative features, of which the distances within the same class (bonafide or attack) are pushed close while those between bonafide and attack are pulled away. However, these methods are designed b…

    Submitted 17 August, 2023; originally announced August 2023.

  24. arXiv:2307.07710  [pdf, other]

    cs.CV eess.IV

    ExposureDiffusion: Learning to Expose for Low-light Image Enhancement

    Authors: Yufei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex C. Kot, Bihan Wen

    Abstract: Previous raw image-based low-light image enhancement methods predominantly relied on feed-forward neural networks to learn deterministic mappings from low-light to normally-exposed images. However, they failed to capture critical distribution information, leading to visually undesirable results. This work addresses the issue by seamlessly integrating a diffusion model with a physics-based exposure…

    Submitted 15 August, 2023; v1 submitted 15 July, 2023; originally announced July 2023.

    Comments: Accepted by ICCV 2023

  25. arXiv:2307.07286  [pdf, other]

    cs.CV cs.AI

    One-Shot Action Recognition via Multi-Scale Spatial-Temporal Skeleton Matching

    Authors: Siyuan Yang, Jun Liu, Shijian Lu, Er Meng Hwa, Alex C. Kot

    Abstract: One-shot skeleton action recognition, which aims to learn a skeleton action recognition model with a single training sample, has attracted increasing interest due to the challenge of collecting and annotating large-scale skeleton action data. However, most existing studies match skeleton sequences by comparing their feature vectors directly which neglects spatial structures and temporal orders of…

    Submitted 6 February, 2024; v1 submitted 14 July, 2023; originally announced July 2023.

    Comments: 8 pages, 4 figures, 6 tables. Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence

  26. arXiv:2307.04122  [pdf, other]

    cs.CV eess.IV

    Enhancing Low-Light Images Using Infrared-Encoded Images

    Authors: Shulin Tian, Yufei Wang, Renjie Wan, Wenhan Yang, Alex C. Kot, Bihan Wen

    Abstract: Low-light image enhancement task is essential yet challenging as it is ill-posed intrinsically. Previous arts mainly focus on the low-light images captured in the visible spectrum using pixel-wise loss, which limits the capacity of recovering the brightness, contrast, and texture details due to the small number of income photons. In this work, we propose a novel approach to increase the visibility…

    Submitted 9 July, 2023; originally announced July 2023.

    Comments: The first two authors contribute equally. The work is accepted by ICIP 2023

  27. arXiv:2306.12058  [pdf, other]

    cs.CV eess.IV

    Beyond Learned Metadata-based Raw Image Reconstruction

    Authors: Yufei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex C. Kot, Bihan Wen

    Abstract: While raw images have distinct advantages over sRGB images, e.g., linearity and fine-grained quantization levels, they are not widely adopted by general users due to their substantial storage requirements. Very recent studies propose to compress raw images by designing sampling masks within the pixel space of the raw image. However, these approaches often leave space for pursuing more effective im…

    Submitted 21 June, 2023; originally announced June 2023.

  28. arXiv:2304.12489  [pdf, other]

    cs.CV cs.CR

    Beyond the Prior Forgery Knowledge: Mining Critical Clues for General Face Forgery Detection

    Authors: Anwei Luo, Chenqi Kong, Jiwu Huang, Yongjian Hu, Xiangui Kang, Alex C. Kot

    Abstract: Face forgery detection is essential in combating malicious digital face attacks. Previous methods mainly rely on prior expert knowledge to capture specific forgery clues, such as noise patterns, blending boundaries, and frequency artifacts. However, these methods tend to get trapped in local optima, resulting in limited robustness and generalization capability. To address these issues, we propose…

    Submitted 24 April, 2023; originally announced April 2023.

  29. arXiv:2304.08799  [pdf, other]

    cs.CV cs.AI

    Self-Supervised 3D Action Representation Learning with Skeleton Cloud Colorization

    Authors: Siyuan Yang, Jun Liu, Shijian Lu, Er Meng Hwa, Yongjian Hu, Alex C. Kot

    Abstract: 3D Skeleton-based human action recognition has attracted increasing attention in recent years. Most of the existing work focuses on supervised learning which requires a large number of labeled action sequences that are often expensive and time-consuming to annotate. In this paper, we address self-supervised 3D action representation learning for skeleton-based action recognition. We investigate sel…

    Submitted 16 October, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: Accepted by TPAMI. This work is an extension of our ICCV 2021 paper [arXiv:2108.01959] https://openaccess.thecvf.com/content/ICCV2021/html/Yang_Skeleton_Cloud_Colorization_for_Unsupervised_3D_Action_Representation_Learning_ICCV_2021_paper.html

  30. arXiv:2303.12745  [pdf, other]

    cs.CV cs.AI

    Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning

    Authors: Xiaobao Guo, Nithish Muthuchamy Selvaraj, Zitong Yu, Adams Wai-Kin Kong, Bingquan Shen, Alex Kot

    Abstract: Deception detection in conversations is a challenging yet important task, having pivotal applications in many fields such as credibility assessment in business, multimedia anti-frauds, and custom security. Despite this, deception detection research is hindered by the lack of high-quality deception datasets, as well as the difficulties of learning multimodal features effectively. To address this is…

    Submitted 3 August, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

    Comments: 11 pages, 6 figures

  31. arXiv:2303.10452  [pdf, other]

    cs.CV

    Confidence Attention and Generalization Enhanced Distillation for Continuous Video Domain Adaptation

    Authors: Xiyu Wang, Yuecong Xu, Jianfei Yang, Bihan Wen, Alex C. Kot

    Abstract: Continuous Video Domain Adaptation (CVDA) is a scenario where a source model is required to adapt to a series of individually available changing target domains continuously without source data or target supervision. It has wide applications, such as robotic vision and autonomous driving. The main underlying challenge of CVDA is to learn helpful information only from the unsupervised target data wh…

    Submitted 29 August, 2023; v1 submitted 18 March, 2023; originally announced March 2023.

    Comments: 16 pages, 9 tables, 10 figures

  32. arXiv:2303.09914  [pdf, other]

    cs.CV

    Rehearsal-Free Domain Continual Face Anti-Spoofing: Generalize More and Forget Less

    Authors: Rizhao Cai, Yawen Cui, Zhi Li, Zitong Yu, Haoliang Li, Yongjian Hu, Alex Kot

    Abstract: Face Anti-Spoofing (FAS) is recently studied under the continual learning setting, where the FAS models are expected to evolve after encountering the data from new domains. However, existing methods need extra replay buffers to store previous data for rehearsal, which becomes infeasible when previous data is unavailable because of privacy issues. In this paper, we propose the first rehearsal-free…

    Submitted 16 March, 2023; originally announced March 2023.

  33. arXiv:2303.02057  [pdf, other]

    eess.IV cs.CV

    Unsupervised Deep Digital Staining For Microscopic Cell Images Via Knowledge Distillation

    Authors: Ziwang Xu, Lanqing Guo, Shuyan Zhang, Alex C. Kot, Bihan Wen

    Abstract: Staining is critical to cell imaging and medical diagnosis, which is expensive, time-consuming, labor-intensive, and causes irreversible changes to cell tissues. Recent advances in deep learning enabled digital staining via supervised model training. However, it is difficult to obtain large-scale stained/unstained cell image pairs in practice, which need to be perfectly aligned with the supervisio…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  34. arXiv:2302.14677  [pdf, other]

    cs.CV cs.CR eess.IV

    Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger

    Authors: Yi Yu, Yufei Wang, Wenhan Yang, Shijian Lu, Yap-Peng Tan, Alex C. Kot

    Abstract: Recent deep-learning-based compression methods have achieved superior performance compared with traditional approaches. However, deep learning models have proven to be vulnerable to backdoor attacks, where some specific trigger patterns added to the input can lead to malicious behavior of the models. In this paper, we present a novel backdoor attack with multiple triggers against learned image com…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: Accepted by CVPR 2023

    ACM Class: I.4

  35. arXiv:2302.14314  [pdf, other]

    cs.SD eess.AS

    Adapter Incremental Continual Learning of Efficient Audio Spectrogram Transformers

    Authors: Nithish Muthuchamy Selvaraj, Xiaobao Guo, Adams Kong, Bingquan Shen, Alex Kot

    Abstract: Continual learning involves training neural networks incrementally for new tasks while retaining the knowledge of previous tasks. However, efficiently fine-tuning the model for sequential tasks with minimal computational resources remains a challenge. In this paper, we propose Task Incremental Continual Learning (TI-CL) of audio classifiers with both parameter-efficient and compute-efficient Audio…

    Submitted 2 January, 2024; v1 submitted 28 February, 2023; originally announced February 2023.

  36. arXiv:2302.14309  [pdf, other]

    cs.CV

    Temporal Coherent Test-Time Optimization for Robust Video Classification

    Authors: Chenyu Yi, Siyuan Yang, Yufei Wang, Haoliang Li, Yap-Peng Tan, Alex C. Kot

    Abstract: Deep neural networks are likely to fail when the test data is corrupted in real-world deployment (e.g., blur, weather, etc.). Test-time optimization is an effective way that adapts models to generalize to corrupted data during testing, which has been shown in the image domain. However, the techniques for improving video classification corruption robustness remain few. In this work, we propose a Te…

    Submitted 27 February, 2023; originally announced February 2023.

  37. arXiv:2302.12995  [pdf, other]

    cs.CV eess.IV

    Raw Image Reconstruction with Learned Compact Metadata

    Authors: Yufei Wang, Yi Yu, Wenhan Yang, Lanqing Guo, Lap-Pui Chau, Alex Kot, Bihan Wen

    Abstract: While raw images exhibit advantages over sRGB images (e.g., linearity and fine-grained quantization level), they are not widely used by common users due to the large storage requirements. Very recent works propose to compress raw images by designing the sampling masks in the raw image pixel space, leading to suboptimal image representations and redundant metadata. In this paper, we propose a novel…

    Submitted 27 February, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: Accepted by CVPR 2023

  38. Evaluating the Efficacy of Skincare Product: A Realistic Short-Term Facial Pore Simulation

    Authors: Ling Li, Bandara Dissanayake, Tatsuya Omotezako, Yunjie Zhong, Qing Zhang, Rizhao Cai, Qian Zheng, Dennis Sng, Weisi Lin, Yufei Wang, Alex C Kot

    Abstract: Simulating the effects of skincare products on face is a potential new way to communicate the efficacy of skincare products in skin diagnostics and product recommendations. Furthermore, such simulations enable one to anticipate his/her skin conditions and better manage skin health. However, there is a lack of effective simulations today. In this paper, we propose the first simulation model to reve…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: 6 pages, 7 figures

  39. arXiv:2302.05936  [pdf, other]

    cs.CV

    Generalized Few-Shot Continual Learning with Contrastive Mixture of Adapters

    Authors: Yawen Cui, Zitong Yu, Rizhao Cai, Xun Wang, Alex C. Kot, Li Liu

    Abstract: The goal of Few-Shot Continual Learning (FSCL) is to incrementally learn novel tasks with limited labeled samples and preserve previous capabilities simultaneously, while current FSCL methods are all for the class-incremental purpose. Moreover, the evaluation of FSCL solutions is only the cumulative performance of all encountered tasks, but there is no work on exploring the domain generalization a…

    Submitted 12 February, 2023; originally announced February 2023.

    Comments: Submitted to International Journal of Computer Vision (IJCV)

  40. arXiv:2302.05746  [pdf, other]

    cs.CV eess.IV

    Removing Image Artifacts From Scratched Lens Protectors

    Authors: Yufei Wang, Renjie Wan, Wenhan Yang, Bihan Wen, Lap-Pui Chau, Alex C. Kot

    Abstract: A protector is placed in front of the camera lens for mobile devices to avoid damage, while the protector itself can be easily scratched accidentally, especially for plastic ones. The artifacts appear in a wide variety of patterns, making it difficult to see through them clearly. Removing image artifacts from the scratched lens protector is inherently challenging due to the occasional flare artifa…

    Submitted 14 February, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

    Comments: Accepted by ISCAS 2023

  41. arXiv:2302.05744  [pdf, other]

    cs.CV

    Rethinking Vision Transformer and Masked Autoencoder in Multimodal Face Anti-Spoofing

    Authors: Zitong Yu, Rizhao Cai, Yawen Cui, Xin Liu, Yongjian Hu, Alex Kot

    Abstract: Recently, vision transformer (ViT) based multimodal learning methods have been proposed to improve the robustness of face anti-spoofing (FAS) systems. However, there are still no works to explore the fundamental natures (e.g., modality-aware inputs, suitable multimodal pre-training, and efficient finetuning) in vanilla ViT for multimodal FAS. In this paper, we investigate three key factor…

    Submitted 11 February, 2023; originally announced February 2023.

  42. arXiv:2302.05727  [pdf, other]

    cs.CV

    Flexible-modal Deception Detection with Audio-Visual Adapter

    Authors: Zhaoxu Li, Zitong Yu, Nithish Muthuchamy Selvaraj, Xiaobao Guo, Bingquan Shen, Adams Wai-Kin Kong, Alex Kot

    Abstract: Detecting deception by human behaviors is vital in many fields such as custom security and multimedia anti-fraud. Recently, audio-visual deception detection attracts more attention due to its better performance than using only a single modality. However, in real-world multi-modal settings, the integrity of data can be an issue (e.g., sometimes only partial modalities are available). The missing mo…

    Submitted 11 February, 2023; originally announced February 2023.

  43. arXiv:2211.10923  [pdf, other

    cs.CV

    Traceable and Authenticable Image Tagging for Fake News Detection

    Authors: Ruohan Meng, Zhili Zhou, Qi Cui, Kwok-Yan Lam, Alex Kot

    Abstract: To prevent fake news images from misleading the public, it is desirable not only to verify the authenticity of news images but also to trace the source of fake news, so as to provide a complete forensic chain for reliable fake news detection. To simultaneously achieve the goals of authenticity verification and source tracing, we propose a traceable and authenticable image tagging approach that is…

    Submitted 20 November, 2022; originally announced November 2022.

  44. arXiv:2210.13810  [pdf, ps, other

    cs.LG

    Toward domain generalized pruning by scoring out-of-distribution importance

    Authors: Rizhao Cai, Haoliang Li, Alex Kot

    Abstract: Filter pruning has been widely used for compressing convolutional neural networks to reduce computation costs during the deployment stage. Recent studies have shown that filter pruning techniques can achieve lossless compression of deep neural networks, reducing redundant filters (kernels) without sacrificing accuracy performance. However, the evaluation is done when the training and testing data…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: Accepted in Workshop on Distribution Shifts, 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  45. arXiv:2209.01935  [pdf, other

    cs.CV cs.MM

    Forensicability Assessment of Questioned Images in Recapturing Detection

    Authors: Changsheng Chen, Lin Zhao, Rizhao Cai, Zitong Yu, Jiwu Huang, Alex C. Kot

    Abstract: Recapture detection of face and document images is an important forensic task. With deep learning, the performances of face anti-spoofing (FAS) and recaptured document detection have been improved significantly. However, the performances are not yet satisfactory on samples with weak forensic cues. The amount of forensic cues can be quantified to allow a reliable forensic result. In this work, we p…

    Submitted 5 September, 2022; originally announced September 2022.

    Comments: 12 pages, 10 figures, 2 tables (Submitted to TIFS July-2022)

  46. arXiv:2208.05401  [pdf, other

    cs.CV

    Benchmarking Joint Face Spoofing and Forgery Detection with Visual and Physiological Cues

    Authors: Zitong Yu, Rizhao Cai, Zhi Li, Wenhan Yang, Jingang Shi, Alex C. Kot

    Abstract: Face anti-spoofing (FAS) and face forgery detection play vital roles in securing face biometric systems from presentation attacks (PAs) and vicious digital manipulation (e.g., deepfakes). Despite promising performance upon large-scale data and powerful deep models, the generalization problem of existing approaches is still an open issue. Most of recent approaches focus on 1) unimodal visual appear…

    Submitted 8 January, 2024; v1 submitted 10 August, 2022; originally announced August 2022.

    Comments: Accepted by IEEE Transactions on Dependable and Secure Computing (TDSC). Corresponding authors: Zitong Yu and Wenhan Yang

  47. arXiv:2207.01204  [pdf, other

    cs.CV

    Adversarial Pairwise Reverse Attention for Camera Performance Imbalance in Person Re-identification: New Dataset and Metrics

    Authors: Eugene P. W. Ang, Shan Lin, Rahul Ahuja, Nemath Ahmed, Alex C. Kot

    Abstract: Existing evaluation metrics for Person Re-Identification (Person ReID) models focus on system-wide performance. However, our studies reveal weaknesses due to the uneven data distributions among cameras and different camera properties that expose the ReID system to exploitation. In this work, we raise the long-ignored ReID problem of camera performance imbalance and collect a real-world privacy-awa…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted into the IEEE International Conference on Image Processing (ICIP) 2022

  48. arXiv:2205.03792  [pdf, other

    cs.CV

    One-Class Knowledge Distillation for Face Presentation Attack Detection

    Authors: Zhi Li, Rizhao Cai, Haoliang Li, Kwok-Yan Lam, Yongjian Hu, Alex C. Kot

    Abstract: Face presentation attack detection (PAD) has been extensively studied by research communities to enhance the security of face recognition systems. Although existing methods have achieved good performance on testing data with similar distribution as the training data, their performance degrades severely in application scenarios with data of unseen distributions. In situations where the training and…

    Submitted 8 May, 2022; originally announced May 2022.

  49. arXiv:2204.04816  [pdf, other

    cs.CR

    Distributed Hardware Accelerated Secure Joint Computation on the COPA Framework

    Authors: Rushi Patel, Pouya Haghi, Shweta Jain, Andriy Kot, Venkata Krishnan, Mayank Varia, Martin Herbordt

    Abstract: Performance of distributed data center applications can be improved through use of FPGA-based SmartNICs, which provide additional functionality and enable higher bandwidth communication. Until lately, however, the lack of a simple approach for customizing SmartNICs to application requirements has limited the potential benefits. Intel's Configurable Network Protocol Accelerator (COPA) provides a cu…

    Submitted 10 April, 2022; originally announced April 2022.

  50. arXiv:2203.16931  [pdf, other

    cs.CV cs.CR

    Towards Robust Rain Removal Against Adversarial Attacks: A Comprehensive Benchmark Analysis and Beyond

    Authors: Yi Yu, Wenhan Yang, Yap-Peng Tan, Alex C. Kot

    Abstract: Rain removal aims to remove rain streaks from images/videos and reduce the disruptive effects caused by rain. It not only enhances image/video visibility but also allows many computer vision algorithms to function properly. This paper makes the first attempt to conduct a comprehensive study on the robustness of deep learning-based rain removal methods against adversarial attacks. Our study shows t…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: 10 pages, 6 figures, to appear in CVPR 2022