Showing 1–50 of 84 results for author: Kwong, S

  1. arXiv:2406.06039 [pdf, other]

    cs.CV

    Diving into Underwater: Segment Anything Model Guided Underwater Salient Instance Segmentation and A Large-scale Dataset

    Authors: Shijie Lian, Ziyi Zhang, Hua Li, Wenjie Li, Laurence Tianruo Yang, Sam Kwong, Runmin Cong

    Abstract: With the breakthrough of large models, Segment Anything Model (SAM) and its extensions have been attempted to apply in diverse tasks of computer vision. Underwater salient instance segmentation is a foundational and vital step for various underwater vision tasks, which often suffer from low segmentation accuracy due to the complex underwater circumstances and the adaptive ability of models. Moreov…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024, Code released at: https://github.com/LiamLian0727/USIS10K

  2. arXiv:2404.09458 [pdf, other]

    cs.CV cs.GR

    CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting

    Authors: Xiangrui Liu, Xinju Wu, Pingping Zhang, Shiqi Wang, Zhu Li, Sam Kwong

    Abstract: Gaussian splatting, renowned for its exceptional rendering quality and efficiency, has emerged as a prominent technique in 3D scene representation. However, the substantial data volume of Gaussian splatting impedes its practical utility in real-world applications. Herein, we propose an efficient 3D scene representation, named Compressed Gaussian Splatting (CompGS), which harnesses compact Gaussian…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Submitted to a conference

  3. arXiv:2403.07290 [pdf, other]

    cs.CV

    Learning Hierarchical Color Guidance for Depth Map Super-Resolution

    Authors: Runmin Cong, Ronghui Sheng, Hao Wu, Yulan Guo, Yunchao Wei, Wangmeng Zuo, Yao Zhao, Sam Kwong

    Abstract: Color information is the most commonly used prior knowledge for depth map super-resolution (DSR), which can provide high-frequency boundary guidance for detail restoration. However, its role and functionality in DSR have not been fully developed. In this paper, we rethink the utilization of color information and propose a hierarchical color guidance network to achieve DSR. On the one hand, the low…

    Submitted 11 March, 2024; originally announced March 2024.

  4. arXiv:2401.01563 [pdf, other]

    cs.NE

    Towards Multi-Objective High-Dimensional Feature Selection via Evolutionary Multitasking

    Authors: Yinglan Feng, Liang Feng, Songbai Liu, Sam Kwong, Kay Chen Tan

    Abstract: Evolutionary Multitasking (EMT) paradigm, an emerging research topic in evolutionary computation, has been successfully applied in solving high-dimensional feature selection (FS) problems recently. However, existing EMT-based FS methods suffer from several limitations, such as a single mode of multitask generation, conducting the same generic evolutionary search for all tasks, relying on implicit…

    Submitted 3 January, 2024; originally announced January 2024.

  5. arXiv:2312.09095 [pdf, other]

    cs.CV

    ColNeRF: Collaboration for Generalizable Sparse Input Neural Radiance Field

    Authors: Zhangkai Ni, Peiqi Yang, Wenhan Yang, Hanli Wang, Lin Ma, Sam Kwong

    Abstract: Neural Radiance Fields (NeRF) have demonstrated impressive potential in synthesizing novel views from dense input, however, their effectiveness is challenged when dealing with sparse input. Existing approaches that incorporate additional depth or semantic supervision can alleviate this issue to an extent. However, the process of supervision collection is not only costly but also potentially inaccu…

    Submitted 14 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

  6. arXiv:2311.16754 [pdf, other]

    cs.CV cs.AI

    Towards Full-scene Domain Generalization in Multi-agent Collaborative Bird's Eye View Segmentation for Connected and Autonomous Driving

    Authors: Senkang Hu, Zhengru Fang, Xianhao Chen, Yuguang Fang, Sam Kwong

    Abstract: Collaborative perception has recently gained significant attention in autonomous driving, improving perception quality by enabling the exchange of additional information among vehicles. However, deploying collaborative perception systems can lead to domain shifts due to diverse environmental conditions and data heterogeneity among connected and autonomous vehicles (CAVs). To address these challeng…

    Submitted 1 January, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  7. arXiv:2308.11627 [pdf, other]

    eess.SP cs.AI cs.CV eess.IV eess.SY

    Non-Intrusive Electric Load Monitoring Approach Based on Current Feature Visualization for Smart Energy Management

    Authors: Yiwen Xu, Dengfeng Liu, Liangtao Huang, Zhiquan Lin, Tiesong Zhao, Sam Kwong

    Abstract: The state-of-the-art smart city has been calling for an economic but efficient energy management over large-scale network, especially for the electric power system. It is a critical issue to monitor, analyze and control electric loads of all users in system. In this paper, we employ the popular computer vision techniques of AI to design a non-invasive load monitoring method for smart electric ener…

    Submitted 8 August, 2023; originally announced August 2023.

  8. arXiv:2308.08935 [pdf, other]

    cs.CV

    SDDNet: Style-guided Dual-layer Disentanglement Network for Shadow Detection

    Authors: Runmin Cong, Yuchen Guan, Jinpeng Chen, Wei Zhang, Yao Zhao, Sam Kwong

    Abstract: Despite significant progress in shadow detection, current methods still struggle with the adverse impact of background color, which may lead to errors when shadows are present on complex backgrounds. Drawing inspiration from the human visual system, we treat the input shadow image as a composition of a background layer and a shadow layer, and design a Style-guided Dual-layer Disentanglement Networ…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  9. arXiv:2308.08930 [pdf, other]

    cs.CV

    Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection

    Authors: Runmin Cong, Hongyu Liu, Chen Zhang, Wei Zhang, Feng Zheng, Ran Song, Sam Kwong

    Abstract: By integrating complementary information from RGB image and depth map, the ability of salient object detection (SOD) for complex and challenging scenes can be improved. In recent years, the important role of Convolutional Neural Networks (CNNs) in feature extraction and cross-modality interaction has been fully explored, but it is still insufficient in modeling global long-range dependencies of se…

    Submitted 17 August, 2023; originally announced August 2023.

    Comments: Accepted by ACM MM 2023

  10. arXiv:2306.12298 [pdf, other]

    cs.CV cs.LG eess.IV

    StarVQA+: Co-training Space-Time Attention for Video Quality Assessment

    Authors: Fengchuang Xing, Yuan-Gen Wang, Weixuan Tang, Guopu Zhu, Sam Kwong

    Abstract: Self-attention based Transformer has achieved great success in many computer vision tasks. However, its application to video quality assessment (VQA) has not been satisfactory so far. Evaluating the quality of in-the-wild videos is challenging due to the unknown of pristine reference and shooting distortion. This paper presents a co-trained Space-Time Attention network for the VQA problem, termed…

    Submitted 21 June, 2023; originally announced June 2023.

  11. arXiv:2306.08918 [pdf, other]

    eess.IV cs.CV

    PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators

    Authors: Runmin Cong, Wenyu Yang, Wei Zhang, Chongyi Li, Chun-Le Guo, Qingming Huang, Sam Kwong

    Abstract: Due to the light absorption and scattering induced by the water medium, underwater images usually suffer from some degradation problems, such as low contrast, color distortion, and blurring details, which aggravate the difficulty of downstream underwater understanding tasks. Therefore, how to obtain clear and visually pleasant images has become a common concern of people, and the task of underwate…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 8 pages, 4 figures, Accepted by IEEE Transactions on Image Processing 2023

  12. Geometric Prior Based Deep Human Point Cloud Geometry Compression

    Authors: Xinju Wu, Pingping Zhang, Meng Wang, Peilin Chen, Shiqi Wang, Sam Kwong

    Abstract: The emergence of digital avatars has raised an exponential increase in the demand for human point clouds with realistic and intricate details. The compression of such data becomes challenging with overwhelming data amounts comprising millions of points. Herein, we leverage the human geometric prior in geometry redundancy removal of point clouds, greatly promoting the compression performance. More…

    Submitted 25 March, 2024; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: Accepted by TCSVT 2024

  13. arXiv:2212.12378 [pdf, other]

    cs.CV

    Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image

    Authors: Runmin Cong, Ke Huang, Jianjun Lei, Yao Zhao, Qingming Huang, Sam Kwong

    Abstract: Salient object detection (SOD) aims to determine the most visually attractive objects in an image. With the development of virtual reality technology, 360° omnidirectional image has been widely used, but the SOD task in 360° omnidirectional image is seldom studied due to its severe distortions and complex scenes. In this paper, we propose a Multi-Projection Fusion and Refinement Network (MPFR-Net)…

    Submitted 23 December, 2022; originally announced December 2022.

    Comments: Accepted by IEEE Transactions on Neural Networks and Learning Systems 2022

  14. arXiv:2211.07891 [pdf, other]

    cs.CV

    Feedback Chain Network For Hippocampus Segmentation

    Authors: Heyu Huang, Runmin Cong, Lianhe Yang, Ling Du, Cong Wang, Sam Kwong

    Abstract: The hippocampus plays a vital role in the diagnosis and treatment of many neurological disorders. Recent years, deep learning technology has made great progress in the field of medical image segmentation, and the performance of related tasks has been constantly refreshed. In this paper, we focus on the hippocampus segmentation task and propose a novel hierarchical feedback chain network. The feedb…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Accepted by ACM TOMM 2022

  15. arXiv:2210.05912 [pdf, other]

    cs.CV

    PSNet: Parallel Symmetric Network for Video Salient Object Detection

    Authors: Runmin Cong, Weiyu Song, Jianjun Lei, Guanghui Yue, Yao Zhao, Sam Kwong

    Abstract: For the video salient object detection (VSOD) task, how to excavate the information from the appearance modality and the motion modality has always been a topic of great concern. The two-stream structure, including an RGB appearance stream and an optical flow motion stream, has been widely used as a typical pipeline for VSOD tasks, but the existing methods usually only use motion features to unidi…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted by IEEE Transactions on Emerging Topics in Computational Intelligence 2022, 13 pages, 8 figures

  16. arXiv:2210.04266 [pdf, other]

    cs.CV

    Does Thermal Really Always Matter for RGB-T Salient Object Detection?

    Authors: Runmin Cong, Kepu Zhang, Chen Zhang, Feng Zheng, Yao Zhao, Qingming Huang, Sam Kwong

    Abstract: In recent years, RGB-T salient object detection (SOD) has attracted continuous attention, which makes it possible to identify salient objects in environments such as low light by introducing thermal image. However, most of the existing RGB-T SOD models focus on how to perform cross-modality feature fusion, ignoring whether thermal image is really always matter in SOD task. Starting from the defini…

    Submitted 9 October, 2022; originally announced October 2022.

    Comments: Accepted by IEEE Trans. Multimedia 2022, 13 pages, 9 figures

  17. arXiv:2210.04158 [pdf, other]

    eess.IV cs.CV

    HVS Revisited: A Comprehensive Video Quality Assessment Framework

    Authors: Ao-Xiang Zhang, Yuan-Gen Wang, Weixuan Tang, Leida Li, Sam Kwong

    Abstract: Video quality is a primary concern for video service providers. In recent years, the techniques of video quality assessment (VQA) based on deep convolutional neural networks (CNNs) have been developed rapidly. Although existing works attempt to introduce the knowledge of the human visual system (HVS) into VQA, there still exhibit limitations that prevent the full exploitation of HVS, including an…

    Submitted 8 October, 2022; originally announced October 2022.

    Comments: 13 pages, 5 figures, Journal paper

  18. arXiv:2209.05856   

    cs.CV cs.LG

    Just Noticeable Difference Modeling for Face Recognition System

    Authors: Yu Tian, Zhangkai Ni, Baoliang Chen, Shurun Wang, Shiqi Wang, Hanli Wang, Sam Kwong

    Abstract: High-quality face images are required to guarantee the stability and reliability of automatic face recognition (FR) systems in surveillance and security scenarios. However, a massive amount of face data is usually compressed before being analyzed due to limitations on transmission or storage. The compressed images may lose the powerful identity information, resulting in the performance degradation…

    Submitted 28 September, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

    Comments: MegaFace dataset we used in the manuscript are no longer publicly available

  19. arXiv:2209.05321 [pdf, other]

    cs.CV

    Deep Feature Statistics Mapping for Generalized Screen Content Image Quality Assessment

    Authors: Baoliang Chen, Hanwei Zhu, Lingyu Zhu, Shiqi Wang, Sam Kwong

    Abstract: The statistical regularities of natural images, referred to as natural scene statistics, play an important role in no-reference image quality assessment. However, it has been widely acknowledged that screen content images (SCIs), which are typically computer generated, do not hold such statistics. Here we make the first attempt to learn the statistics of SCIs, based upon which the quality of SCIs…

    Submitted 21 April, 2024; v1 submitted 12 September, 2022; originally announced September 2022.

  20. arXiv:2209.02957 [pdf, other]

    cs.CV

    A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels

    Authors: Runmin Cong, Qi Qin, Chen Zhang, Qiuping Jiang, Shiqi Wang, Yao Zhao, Sam Kwong

    Abstract: Fully-supervised salient object detection (SOD) methods have made great progress, but such methods often rely on a large number of pixel-level annotations, which are time-consuming and labour-intensive. In this paper, we focus on a new weakly-supervised SOD task under hybrid labels, where the supervision labels include a large number of coarse labels generated by the traditional unsupervised metho…

    Submitted 7 September, 2022; originally announced September 2022.

    Comments: Accepted by IEEE Transactions on Circuits and Systems for Video Technology 2022

  21. arXiv:2209.02934 [pdf, other]

    eess.IV cs.CV

    Boundary Guided Semantic Learning for Real-time COVID-19 Lung Infection Segmentation System

    Authors: Runmin Cong, Yumo Zhang, Ning Yang, Haisheng Li, Xueqi Zhang, Ruochen Li, Zewen Chen, Yao Zhao, Sam Kwong

    Abstract: The coronavirus disease 2019 (COVID-19) continues to have a negative impact on healthcare systems around the world, though the vaccines have been developed and national vaccination coverage rate is steadily increasing. At the current stage, automatically segmenting the lung infection area from CT images is essential for the diagnosis and treatment of COVID-19. Thanks to the development of deep lea…

    Submitted 7 September, 2022; originally announced September 2022.

    Comments: Accepted by IEEE Transactions on Consumer Electronics 2022

  22. arXiv:2209.02285 [pdf, other]

    cs.CV eess.IV

    High Dynamic Range Image Quality Assessment Based on Frequency Disparity

    Authors: Yue Liu, Zhangkai Ni, Shiqi Wang, Hanli Wang, Sam Kwong

    Abstract: In this paper, a novel and effective image quality assessment (IQA) algorithm based on frequency disparity for high dynamic range (HDR) images is proposed, termed as local-global frequency feature-based model (LGFM). Motivated by the assumption that the human visual system is highly adapted for extracting structural information and partial frequencies when perceiving the visual scene, the Gabor an…

    Submitted 6 September, 2022; originally announced September 2022.

  23. arXiv:2208.10077 [pdf, other]

    cs.CV cs.AI

    Identifying Auxiliary or Adversarial Tasks Using Necessary Condition Analysis for Adversarial Multi-task Video Understanding

    Authors: Stephen Su, Samuel Kwong, Qingyu Zhao, De-An Huang, Juan Carlos Niebles, Ehsan Adeli

    Abstract: There has been an increasing interest in multi-task learning for video understanding in recent years. In this work, we propose a generalized notion of multi-task learning by incorporating both auxiliary tasks that the model should perform well on and adversarial tasks that the model should not perform well on. We employ Necessary Condition Analysis (NCA) as a data-driven approach for deciding what…

    Submitted 22 August, 2022; originally announced August 2022.

  24. arXiv:2208.08145 [pdf, other]

    cs.CV

    Stereo Superpixel Segmentation Via Decoupled Dynamic Spatial-Embedding Fusion Network

    Authors: Hua Li, Junyan Liang, Ruiqi Wu, Runmin Cong, Junhui Wu, Sam Tak Wu Kwong

    Abstract: Stereo superpixel segmentation aims at grouping the discretizing pixels into perceptual regions through left and right views more collaboratively and efficiently. Existing superpixel segmentation algorithms mostly utilize color and spatial features as input, which may impose strong constraints on spatial information while utilizing the disparity information in terms of stereo image pairs. To allev…

    Submitted 17 August, 2022; originally announced August 2022.

    Comments: 11 pages, 13 figures

  25. DeepWSD: Projecting Degradations in Perceptual Space to Wasserstein Distance in Deep Feature Space

    Authors: Xingran Liao, Baoliang Chen, Hanwei Zhu, Shiqi Wang, Mingliang Zhou, Sam Kwong

    Abstract: Existing deep learning-based full-reference IQA (FR-IQA) models usually predict the image quality in a deterministic way by explicitly comparing the features, gauging how severely distorted an image is by how far the corresponding feature lies from the space of the reference images. Herein, we look at this problem from a different viewpoint and propose to model the quality degradation in perceptua…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: ACM Multimedia 2022 accepted thesis

  26. Consistent Quality Oriented Rate Control in HEVC via Balancing Intra and Inter Frame Coding

    Authors: Wei Gao, Qiuping Jiang, Ronggang Wang, Siwei Ma, Ge Li, Sam Kwong

    Abstract: Consistent quality oriented rate control in video coding has attracted much more attention. However, the existing efforts only focus on decreasing variations between every two adjacent frames, but neglect coding trade-off problem between intra and inter frames. In this paper, we deal with it from a new perspective, where intra frame quantization parameter (IQP) and rate control are optimized for b…

    Submitted 29 July, 2022; originally announced August 2022.

    Comments: 10 pages

    Journal ref: in IEEE Transactions on Industrial Informatics, vol. 18, no. 3, pp. 1594-1604, March 2022

  27. arXiv:2207.08114 [pdf, other]

    eess.IV cs.CV

    BCS-Net: Boundary, Context and Semantic for Automatic COVID-19 Lung Infection Segmentation from CT Images

    Authors: Runmin Cong, Haowei Yang, Qiuping Jiang, Wei Gao, Haisheng Li, Cong Wang, Yao Zhao, Sam Kwong

    Abstract: The spread of COVID-19 has brought a huge disaster to the world, and the automatic segmentation of infection regions can help doctors to make diagnosis quickly and reduce workload. However, there are several challenges for the accurate and complete segmentation, such as the scattered infection area distribution, complex background noises, and blurred segmentation boundaries. To this end, in this p…

    Submitted 17 July, 2022; originally announced July 2022.

    Comments: Accepted by IEEE Transactions on Instrumentation and Measurement 2022, Code: https://github.com/rmcong/BCS-Net-TIM22

  28. arXiv:2207.00965 [pdf, other]

    cs.CV eess.IV

    Cycle-Interactive Generative Adversarial Network for Robust Unsupervised Low-Light Enhancement

    Authors: Zhangkai Ni, Wenhan Yang, Hanli Wang, Shiqi Wang, Lin Ma, Sam Kwong

    Abstract: Getting rid of the fundamental limitations in fitting to the paired training data, recent unsupervised low-light enhancement methods excel in adjusting illumination and contrast of images. However, for unsupervised low light enhancement, the remaining noise suppression issue due to the lacking of supervision of detailed signal largely impedes the wide deployment of these methods in real-world appl…

    Submitted 3 July, 2022; originally announced July 2022.

    Comments: 9 pages, 7 figures, accepted to ACM MM 2022

  29. arXiv:2206.03105 [pdf, other]

    cs.CV

    Dual Swin-Transformer based Mutual Interactive Network for RGB-D Salient Object Detection

    Authors: Chao Zeng, Sam Kwong

    Abstract: Salient Object Detection is the task of predicting the human attended region in a given scene. Fusing depth information has been proven effective in this task. The main challenge of this problem is how to aggregate the complementary information from RGB modality and depth modality. However, conventional deep models heavily rely on CNN feature extractors, and the long-range contextual dependencies…

    Submitted 7 June, 2022; originally announced June 2022.

  30. Recent Advances in Rate Control: From Optimisation to Implementation and Beyond

    Authors: Xuekai Wei, Mingliang Zhou, Heqiang Wang, Haoyan Yang, Lei Chen, Sam Kwong

    Abstract: Video coding is a video compression technique that compresses the original video sequence to produce a smaller archive file or reduce the transmission bandwidth under constraints on the visual quality loss. Rate control (RC) plays a critical role in video coding. It can achieve stable stream output in practical applications, especially real-time video applications such as video conferencing or gam…

    Submitted 18 June, 2023; v1 submitted 22 May, 2022; originally announced May 2022.

    Comments: Copyright © 20xx IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to [email protected]

  31. arXiv:2205.03587 [pdf, other]

    eess.IV cs.CV

    Efficient VVC Intra Prediction Based on Deep Feature Fusion and Probability Estimation

    Authors: Tiesong Zhao, Yuhang Huang, Weize Feng, Yiwen Xu, Sam Kwong

    Abstract: The ever-growing multimedia traffic has underscored the importance of effective multimedia codecs. Among them, the up-to-date lossy video coding standard, Versatile Video Coding (VVC), has been attracting attentions of video coding community. However, the gain of VVC is achieved at the cost of significant encoding complexity, which brings the need to realize fast encoder with comparable Rate Disto…

    Submitted 7 May, 2022; originally announced May 2022.

    Comments: 10 pages, 10 figures

  32. arXiv:2204.08917 [pdf, other]

    cs.CV

    Global-and-Local Collaborative Learning for Co-Salient Object Detection

    Authors: Runmin Cong, Ning Yang, Chongyi Li, Huazhu Fu, Yao Zhao, Qingming Huang, Sam Kwong

    Abstract: The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images. Therefore, how to effectively extract inter-image correspondence is crucial for the CoSOD task. In this paper, we propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) and a local…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: Accepted by IEEE Transactions on Cybernetics 2022, project page: https://rmcong.github.io/proj_GLNet.html

  33. arXiv:2204.04059 [pdf, other]

    eess.IV cs.CV cs.MM

    Deep Learning-Based Intra Mode Derivation for Versatile Video Coding

    Authors: Linwei Zhu, Yun Zhang, Na Li, Gangyi Jiang, Sam Kwong

    Abstract: In intra coding, Rate Distortion Optimization (RDO) is performed to achieve the optimal intra mode from a pre-defined candidate list. The optimal intra mode is also required to be encoded and transmitted to the decoder side besides the residual signal, where lots of coding bits are consumed. To further improve the performance of intra coding in Versatile Video Coding (VVC), an intelligent intra mo…

    Submitted 8 April, 2022; originally announced April 2022.

    Comments: 19 pages, 7 figures, submitted to ACM TOMM

  34. arXiv:2203.05349 [pdf, other]

    cs.MM cs.CV

    Two-stream Hierarchical Similarity Reasoning for Image-text Matching

    Authors: Ran Chen, Hanli Wang, Lei Wang, Sam Kwong

    Abstract: Reasoning-based approaches have demonstrated their powerful ability for the task of image-text matching. In this work, two issues are addressed for image-text matching. First, for reasoning processing, conventional approaches have no ability to find and use multi-level hierarchical similarity information. To solve this problem, a hierarchical similarity reasoning module is proposed to automaticall…

    Submitted 10 March, 2022; originally announced March 2022.

  35. arXiv:2202.09802 [pdf, other]

    cs.CV eess.IV

    Distortion-Aware Loop Filtering of Intra 360° Video Coding with Equirectangular Projection

    Authors: Pingping Zhang, Xu Wang, Linwei Zhu, Yun Zhang, Shiqi Wang, Sam Kwong

    Abstract: In this paper, we propose a distortion-aware loop filtering model to improve the performance of intra coding for 360° videos projected via equirectangular projection (ERP) format. To enable the awareness of distortion, our proposed module analyzes content characteristics based on a coding unit (CU) partition mask and processes them through partial convolution to activate the specified area. The…

    Submitted 20 February, 2022; originally announced February 2022.

  36. arXiv:2201.11975 [pdf, other]

    cs.CV eess.IV

    Generalized Visual Quality Assessment of GAN-Generated Face Images

    Authors: Yu Tian, Zhangkai Ni, Baoliang Chen, Shiqi Wang, Hanli Wang, Sam Kwong

    Abstract: Recent years have witnessed the dramatically increased interest in face generation with generative adversarial networks (GANs). A number of successful GAN algorithms have been developed to produce vivid face images towards different application scenarios. However, little work has been dedicated to automatic quality assessment of such GAN-generated face images (GFIs), even less have been devoted to…

    Submitted 28 January, 2022; originally announced January 2022.

    Comments: 12 pages, 8 figures, journal paper

  37. arXiv:2112.15299 [pdf, other]

    eess.IV cs.CV

    CSformer: Bridging Convolution and Transformer for Compressive Sensing

    Authors: Dongjie Ye, Zhangkai Ni, Hanli Wang, Jian Zhang, Shiqi Wang, Sam Kwong

    Abstract: Convolution neural networks (CNNs) have succeeded in compressive image sensing. However, due to the inductive bias of locality and weight sharing, the convolution operations demonstrate the intrinsic limitations in modeling the long-range dependency. Transformer, designed initially as a sequence-to-sequence model, excels at capturing global contexts due to the self-attention-based architectures ev…

    Submitted 30 December, 2021; originally announced December 2021.

  38. arXiv:2112.12284 [pdf, other]

    cs.MM eess.IV

    A Survey on Perceptually Optimized Video Coding

    Authors: Yun Zhang, Linwei Zhu, Gangyi Jiang, Sam Kwong, C. -C. Jay Kuo

    Abstract: To provide users with more realistic visual experiences, videos are developing in the trends of Ultra High Definition (UHD), High Frame Rate (HFR), High Dynamic Range (HDR), Wide Color Gamut (WCG) and high clarity. However, the data amount of videos increases exponentially, which requires high efficiency video compression for storage and network transmission. Perceptually optimized video coding a…

    Submitted 15 November, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

    Comments: 36 pages, 12 figures, 6 tables, accepted by ACM Computing Surveys

  39. arXiv:2112.00485 [pdf, other]

    cs.CV eess.IV

    Learning Transformer Features for Image Quality Assessment

    Authors: Chao Zeng, Sam Kwong

    Abstract: Objective image quality evaluation is a challenging task, which aims to measure the quality of a given image automatically. According to the availability of the reference images, there are Full-Reference and No-Reference IQA tasks, respectively. Most deep learning approaches use regression from deep features extracted by Convolutional Neural Networks. For the FR task, another option is conducting…

    Submitted 23 March, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

  40. RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images

    Authors: Runmin Cong, Yumo Zhang, Leyuan Fang, Jun Li, Yao Zhao, Sam Kwong

    Abstract: Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs. Despite some saliency models were proposed to solve the intrinsic problem of optical RSIs (such as complex background and scale-variant objects), the accuracy and completeness are still unsatisfactory. To this end, we propose a relation…

    Submitted 20 February, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: 11 pages, 9 figures, Accepted by IEEE Transactions on Geoscience and Remote Sensing 2021, project: https://rmcong.github.io/proj_RRNet.html

  41. arXiv:2108.01971 [pdf, other]

    cs.CV

    Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection

    Authors: Chen Zhang, Runmin Cong, Qinwei Lin, Lin Ma, Feng Li, Yao Zhao, Sam Kwong

    Abstract: The popularity and promotion of depth maps have brought new vigor and vitality into salient object detection (SOD), and a mass of RGB-D SOD algorithms have been proposed, mainly concentrating on how to better integrate cross-modality features from RGB image and depth map. For the cross-modality interaction in feature encoder, existing methods either indiscriminately treat RGB and depth modalities,…

    Submitted 4 August, 2021; originally announced August 2021.

    Comments: 13 pages, 6 figures, Accepted by ACM MM 2021

  42. arXiv:2107.12541 [pdf, other]

    cs.CV

    BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation

    Authors: Qi Tang, Runmin Cong, Ronghui Sheng, Lingzhi He, Dan Zhang, Yao Zhao, Sam Kwong

    Abstract: Depth map super-resolution is a task with high practical application requirements in the industry. Existing color-guided depth map super-resolution methods usually necessitate an extra branch to extract high-frequency detail information from RGB image to guide the low-resolution depth map reconstruction. However, because there are still some differences between the two modalities, direct informati…

    Submitted 26 July, 2021; originally announced July 2021.

    Comments: 10 pages, 7 figures, Accepted by ACM MM 2021

  43. arXiv:2107.05821  [pdf, other]

    cs.CV

    Detect and Locate: Exposing Face Manipulation by Semantic- and Noise-level Telltales

    Authors: Chenqi Kong, Baoliang Chen, Haoliang Li, Shiqi Wang, Anderson Rocha, Sam Kwong

    Abstract: The technological advancements of deep learning have enabled sophisticated face manipulation schemes, raising severe trust issues and security concerns in modern society. Generally speaking, detecting manipulated faces and locating the potentially altered regions are challenging tasks. Herein, we propose a conceptually simple but effective method to efficiently detect forged faces in an image whil…

    Submitted 6 April, 2022; v1 submitted 12 July, 2021; originally announced July 2021.

    Comments: 12 pages, 10 figures

  44. arXiv:2106.15312  [pdf, other]

    cs.CV

    Contrastive Semantic Similarity Learning for Image Captioning Evaluation with Intrinsic Auto-encoder

    Authors: Chao Zeng, Tiesong Zhao, Sam Kwong

    Abstract: Automatically evaluating the quality of image captions can be very challenging since human language is quite flexible that there can be various expressions for the same meaning. Most of the current captioning metrics rely on token level matching between candidate caption and the ground truth label sentences. It usually neglects the sentence-level information. Motivated by the auto-encoder mechanis…

    Submitted 29 June, 2021; originally announced June 2021.

  45. arXiv:2103.01689  [pdf, ps, other]

    cs.LG

    Self-supervised Symmetric Nonnegative Matrix Factorization

    Authors: Yuheng Jia, Hui Liu, Junhui Hou, Sam Kwong, Qingfu Zhang

    Abstract: Symmetric nonnegative matrix factorization (SNMF) has demonstrated to be a powerful method for data clustering. However, SNMF is mathematically formulated as a non-convex optimization problem, making it sensitive to the initialization of variables. Inspired by ensemble clustering that aims to seek a better clustering result from a set of clustering results, we propose self-supervised SNMF (S$^3$NM…

    Submitted 2 March, 2021; originally announced March 2021.

  46. On the Philosophical, Cognitive and Mathematical Foundations of Symbiotic Autonomous Systems (SAS)

    Authors: Yingxu Wang, Fakhri Karray, Sam Kwong, Konstantinos N. Plataniotis, Henry Leung, Ming Hou, Edward Tunstel, Imre J. Rudas, Ljiljana Trajkovic, Okyay Kaynak, Janusz Kacprzyk, Mengchu Zhou, Michael H. Smith, Philip Chen, Shushma Patel

    Abstract: Symbiotic Autonomous Systems (SAS) are advanced intelligent and cognitive systems exhibiting autonomous collective intelligence enabled by coherent symbiosis of human-machine interactions in hybrid societies. Basic research in the emerging field of SAS has triggered advanced general AI technologies functioning without human intervention or hybrid symbiotic systems synergizing humans and intelligen…

    Submitted 11 February, 2021; originally announced February 2021.

    Comments: Accepted by Phil. Trans. Royal Society (A): Math, Phys & Engg Sci., 379(219x), 2021, Oxford, UK

    Journal ref: Phil. Trans. Royal Society (A): Math, Phys & Engg Sci., 379(219x), 2021, Oxford, UK

  47. arXiv:2101.10075  [pdf, other]

    cs.CV cs.AI

    Camera Invariant Feature Learning for Generalized Face Anti-spoofing

    Authors: Baoliang Chen, Wenhan Yang, Haoliang Li, Shiqi Wang, Sam Kwong

    Abstract: There has been an increasing consensus in learning based face anti-spoofing that the divergence in terms of camera models is causing a large domain gap in real application scenarios. We describe a framework that eliminates the influence of inherent variance from acquisition cameras at the feature level, leading to the generalized face spoofing detection model that could be highly adaptive to diffe…

    Submitted 25 January, 2021; originally announced January 2021.

  48. arXiv:2012.15052  [pdf, other]

    eess.IV cs.CV

    Unpaired Image Enhancement with Quality-Attention Generative Adversarial Network

    Authors: Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, Sam Kwong

    Abstract: In this work, we aim to learn an unpaired image enhancement model, which can enrich low-quality images with the characteristics of high-quality images provided by users. We propose a quality attention generative adversarial network (QAGAN) trained on unpaired data based on the bidirectional Generative Adversarial Network (GAN) embedded with a quality attention module (QAM). The key novelty of the…

    Submitted 30 December, 2020; originally announced December 2020.

  49. Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network

    Authors: Zhangkai Ni, Wenhan Yang, Shiqi Wang, Lin Ma, Sam Kwong

    Abstract: Improving the aesthetic quality of images is challenging and eager for the public. To address this problem, most existing algorithms are based on supervised learning methods to learn an automatic photo enhancer for paired data, which consists of low-quality photos and corresponding expert-retouched versions. However, the style and characteristics of photos retouched by experts may not meet the nee…

    Submitted 29 December, 2020; originally announced December 2020.

  50. arXiv:2012.07333  [pdf, other]

    cs.CV

    Intrinsic Image Captioning Evaluation

    Authors: Chao Zeng, Sam Kwong

    Abstract: The image captioning task is about to generate suitable descriptions from images. For this task there can be several challenges such as accuracy, fluency and diversity. However there are few metrics that can cover all these properties while evaluating results of captioning models. In this paper we first conduct a comprehensive investigation on contemporary metrics. Motivated by the auto-encoder mec…

    Submitted 14 December, 2020; originally announced December 2020.