Showing 1–50 of 125 results for author: Xiang, W

Searching in archive cs.
  1. arXiv:2406.19859  [pdf, other]

    cs.AI cs.HC cs.MM

    MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis

    Authors: Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Qi He, Wangmeng Xiang, Hanyuan Chen, Jin-Peng Lan, Xianhui Lin, Kang Zhu, Bin Luo, Yifeng Geng, Xuansong Xie, Alexander G. Hauptmann

    Abstract: MetaDesigner revolutionizes artistic typography synthesis by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement. At the core of this framework lies a multi-agent system comprising the Pipeline, Glyph, and Texture agents, which collectively enable the creation of customized WordArt, ranging from semantic enhancements to the imposition…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 18 pages, 16 figures, Project: https://modelscope.cn/studios/WordArt/WordArt

  2. arXiv:2405.20608  [pdf, other]

    cs.CL

    Identifying while Learning for Document Event Causality Identification

    Authors: Cheng Liu, Wei Xiang, Bang Wang

    Abstract: Event Causality Identification (ECI) aims to detect whether there exists a causal relation between two events in a document. Existing studies adopt a kind of identifying-after-learning paradigm, where events' representations are first learned and then used for the identification. Furthermore, they mainly focus on the existence of causality but ignore causal direction. In this paper, we take care o…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted at ACL 2024

  3. arXiv:2405.18974  [pdf, other]

    cs.CL

    Encoding Hierarchical Schema via Concept Flow for Multifaceted Ideology Detection

    Authors: Songtao Liu, Bang Wang, Wei Xiang, Han Xu, Minghua Xu

    Abstract: Multifaceted ideology detection (MID) aims to detect the ideological leanings of texts towards multiple facets. Previous studies on ideology detection mainly focus on one generic facet and ignore label semantics and explanatory descriptions of ideologies, which are a kind of instructive information and reveal the specific concepts of ideologies. In this paper, we develop a novel concept semantics-…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 13 pages, 4 figures (Accepted to Findings of ACL 2024)

  4. arXiv:2405.10512  [pdf, other]

    cs.IR cs.LG

    In-context Contrastive Learning for Event Causality Identification

    Authors: Chao Liang, Wei Xiang, Bang Wang

    Abstract: Event Causality Identification (ECI) aims at determining the existence of a causal relation between two events. Although recent prompt learning-based approaches have shown promising improvements on the ECI task, their performance is often subject to the delicate design of multiple prompts and the positive correlations between the main task and derivative tasks. The in-context learning paradigm prov…

    Submitted 16 May, 2024; originally announced May 2024.

  5. arXiv:2405.09985  [pdf, other]

    cs.CV

    VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing

    Authors: Binghui Chen, Chongyang Zhong, Wangmeng Xiang, Yifeng Geng, Xuansong Xie

    Abstract: Due to the significant advances in large-scale text-to-image generation by diffusion models (DMs), controllable human image generation has been attracting much attention recently. Although existing works such as ControlNet [36], T2I-Adapter [20], and HumanSD [10] have demonstrated good abilities in generating human images based on pose conditions, they still fail to meet the requirements of real e-commerce s…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: Project page: https://aigcdesigngroup.github.io/replace-anything

  6. arXiv:2405.09543  [pdf, other]

    cs.CY cs.AI cs.IR cs.LG

    Algorithmic Fairness: A Tolerance Perspective

    Authors: Renqiang Luo, Tao Tang, Feng Xia, Jiaying Liu, Chengpei Xu, Leo Yu Zhang, Wei Xiang, Chengqi Zhang

    Abstract: Recent advancements in machine learning and deep learning have brought algorithmic fairness into sharp focus, illuminating concerns over discriminatory decision making that negatively impacts certain individuals or groups. These concerns have manifested in legal, ethical, and societal challenges, including the erosion of trust in intelligent systems. In response, this survey delves into the existi…

    Submitted 26 April, 2024; originally announced May 2024.

    Comments: 33 pages, 4 figures

    MSC Class: 68T01; 68W40 ACM Class: I.2.6; K.4.2; H.1.2

  7. arXiv:2405.05164  [pdf, other]

    cs.CV

    ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature Fusion

    Authors: Bing Zhu, Zixin He, Weiyi Xiong, Guanhua Ding, Jianan Liu, Tao Huang, Wei Chen, Wei Xiang

    Abstract: Millimeter wave (mmWave) radar is a non-intrusive, privacy-preserving, relatively convenient, and inexpensive device, which has been demonstrated to be applicable in place of RGB cameras in human indoor pose estimation tasks. However, mmWave radar relies on the collection of reflected signals from the target, and the information contained in the radar signals is difficult to exploit fully. This has been a long…

    Submitted 28 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

  8. arXiv:2404.16223  [pdf, other]

    cs.CV eess.IV

    Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey

    Authors: Marcos V. Conde, Florin-Alexandru Vasluianu, Radu Timofte, Jianxing Zhang, Jia Li, Fan Wang, Xiaopeng Li, Zikun Liu, Hyunhee Park, Sejun Song, Changho Kim, Zhijuan Huang, Hongyuan Yu, Cheng Wan, Wending Xiang, Jiamin Lin, Hang Zhong, Qiaosong Zhang, Yue Sun, Xuanwu Yin, Kunlong Zuo, Senyan Xu, Siyuan Jiang, Zhijing Sun, Jiaying Zhu, et al. (10 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 RAW Image Super-Resolution Challenge, highlighting the proposed solutions and results. New methods for RAW Super-Resolution could be essential in modern Image Signal Processing (ISP) pipelines; however, this problem is not as explored as in the RGB domain. The goal of this challenge is to upscale RAW Bayer images by 2x, considering unknown degradations such as nois…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: CVPR 2024 - NTIRE Workshop

  9. arXiv:2403.05050  [pdf, other]

    cs.CV cs.AI cs.MM

    DyRoNet: Dynamic Routing and Low-Rank Adapters for Autonomous Driving Streaming Perception

    Authors: Xiang Huang, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Wangmeng Xiang, Baigui Sun, Xiao Wu

    Abstract: The advancement of autonomous driving systems hinges on the ability to achieve low-latency and high-accuracy perception. To address this critical need, this paper introduces Dynamic Routering Network (DyRoNet), a low-rank enhanced dynamic routing framework designed for streaming perception in autonomous driving systems. DyRoNet integrates a suite of pre-trained branch networks, each meticulously f…

    Submitted 18 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

    Comments: Project: https://tastevision.github.io/DyRoNet/

  10. arXiv:2403.04435  [pdf, other]

    cs.IT

    Pilot Spoofing Attack on the Downlink of Cell-Free Massive MIMO: From the Perspective of Adversaries

    Authors: Weiyang Xu, Ruiguang Wang, Yuan Zhang, Hien Quoc Ngo, Wei Xiang

    Abstract: The channel hardening effect is less pronounced in the cell-free massive multiple-input multiple-output (mMIMO) system compared to its cellular counterpart, making it necessary to estimate the downlink effective channel gains to ensure decent performance. However, the downlink training inadvertently creates an opportunity for adversarial nodes to launch pilot spoofing attacks (PSAs). First, we dem…

    Submitted 11 April, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  11. arXiv:2402.17339  [pdf, other]

    cs.CV cs.AI

    SocialCVAE: Predicting Pedestrian Trajectory via Interaction Conditioned Latents

    Authors: Wei Xiang, Haoteng Yin, He Wang, Xiaogang Jin

    Abstract: Pedestrian trajectory prediction is the key technology in many applications for providing insights into human behavior and anticipating human future motions. Most existing empirical models are explicitly formulated by observed human behaviors using explicable mathematical terms with a deterministic nature, while recent work has focused on developing hybrid models combined with learning-based techn…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: Accepted by AAAI'24

  12. arXiv:2402.11739  [pdf, other]

    eess.SY cs.LG

    A Transition System Abstraction Framework for Neural Network Dynamical System Models

    Authors: Yejiang Yang, Zihao Mo, Hoang-Dung Tran, Weiming Xiang

    Abstract: This paper proposes a transition system abstraction framework for neural network dynamical system models to enhance the model interpretability, with applications to complex dynamical systems such as human behavior learning and verification. To begin with, the localized working zone will be segmented into multiple localized partitions under the data-driven Maximum Entropy (ME) partitioning method.…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: ACC 2024

  13. arXiv:2402.11737  [pdf, other]

    cs.LG cs.AI

    Compression Repair for Feedforward Neural Networks Based on Model Equivalence Evaluation

    Authors: Zihao Mo, Yejiang Yang, Shuaizheng Lu, Weiming Xiang

    Abstract: In this paper, we propose a method of repairing compressed Feedforward Neural Networks (FNNs) based on equivalence evaluation of two neural networks. In the repairing framework, a novel neural network equivalence evaluation method is developed to compute the output discrepancy between two neural networks. The output discrepancy can quantitatively characterize the output difference produced by comp…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: ACC 2024

  14. arXiv:2402.04584  [pdf, other]

    eess.IV cs.CV

    Troublemaker Learning for Low-Light Image Enhancement

    Authors: Yinghao Song, Zhiyuan Cao, Wanhong Xiang, Sifan Long, Bo Yang, Hongwei Ge, Yanchun Liang, Chunguo Wu

    Abstract: Low-light image enhancement (LLIE) restores the color and brightness of underexposed images. Supervised methods suffer from high costs in collecting low/normal-light image pairs. Unsupervised methods invest substantial effort in crafting complex loss functions. We address these two challenges through the proposed TroubleMaker Learning (TML) strategy, which employs normal-light images as inputs for…

    Submitted 2 March, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  15. arXiv:2401.11633  [pdf, other]

    cs.CV cs.AI

    Zoom-shot: Fast and Efficient Unsupervised Zero-Shot Transfer of CLIP to Vision Encoders with Multimodal Loss

    Authors: Jordan Shipard, Arnold Wiliem, Kien Nguyen Thanh, Wei Xiang, Clinton Fookes

    Abstract: The fusion of vision and language has brought about a transformative shift in computer vision through the emergence of Vision-Language Models (VLMs). However, the resource-intensive nature of existing VLMs poses a significant challenge. We need an accessible method for developing the next generation of VLMs. To address this issue, we propose Zoom-shot, a novel method for transferring the zero-shot…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: 15 pages

  16. arXiv:2401.05800  [pdf, other]

    cs.LG cs.AI

    Graph Spatiotemporal Process for Multivariate Time Series Anomaly Detection with Missing Values

    Authors: Yu Zheng, Huan Yee Koh, Ming Jin, Lianhua Chi, Haishuai Wang, Khoa T. Phan, Yi-Ping Phoebe Chen, Shirui Pan, Wei Xiang

    Abstract: The detection of anomalies in multivariate time series data is crucial for various practical applications, including smart power grids, traffic flow forecasting, and industrial process control. However, real-world time series data is usually not well-structured, posing significant challenges to existing approaches: (1) The existence of missing values in multivariate time series data along variabl…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: Accepted by Information Fusion

  17. arXiv:2401.01699  [pdf, other]

    cs.CV cs.CL cs.MM

    WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope

    Authors: Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Yusen Hu, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou

    Abstract: This paper introduces the WordArt Designer API, a novel framework for user-driven artistic typography synthesis utilizing Large Language Models (LLMs) on ModelScope. We address the challenge of simplifying artistic typography for non-professionals by offering a dynamic, adaptive, and computationally efficient alternative to traditional rigid templates. Our approach leverages the power of LLMs to u…

    Submitted 12 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: Spotlight Paper at the Workshop on Machine Learning for Creativity and Design, 37th Conference on Neural Information Processing Systems (NeurIPS 2023). 5 pages, 5 figures

  18. arXiv:2312.10504  [pdf, other]

    cs.SI

    SubAnom: Efficient Subgraph Anomaly Detection Framework over Dynamic Graphs

    Authors: Chi Zhang, Wenkai Xiang, Xingzhi Guo, Baojian Zhou, Deqing Yang

    Abstract: Given a dynamic graph, the efficient tracking of anomalous subgraphs via their node embeddings poses a significant challenge. Addressing this issue necessitates an effective scoring mechanism and an innovative anomalous subgraph strategy. Existing methods predominantly focus on designing scoring strategies or employing graph structures that consider nodes in isolation, resulting in ineffective cap…

    Submitted 16 December, 2023; originally announced December 2023.

  19. arXiv:2312.01555  [pdf, other]

    cs.AI cs.CY cs.LG

    Explainable AI is Responsible AI: How Explainability Creates Trustworthy and Socially Responsible Artificial Intelligence

    Authors: Stephanie Baker, Wei Xiang

    Abstract: Artificial intelligence (AI) has been clearly established as a technology with the potential to revolutionize fields from healthcare to finance - if developed and deployed responsibly. This is the topic of responsible AI, which emphasizes the need to develop trustworthy AI systems that minimize bias, protect privacy, support security, and enhance transparency and accountability. Explainable AI (XA…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: 35 pages, 7 figures (figures 3-6 include subfigures)

  20. arXiv:2312.00082  [pdf, other]

    eess.IV cs.CV

    A Compact Implicit Neural Representation for Efficient Storage of Massive 4D Functional Magnetic Resonance Imaging

    Authors: Ruoran Li, Runzhao Yang, Wenxin Xiang, Yuxiao Cheng, Tingxiong Xiao, Jinli Suo

    Abstract: Functional Magnetic Resonance Imaging (fMRI) data is a widely used kind of four-dimensional biomedical data, which requires effective compression. However, fMRI compression poses unique challenges due to its intricate temporal dynamics, low signal-to-noise ratio, and complicated underlying redundancies. This paper reports a novel compression paradigm specifically tailored for fMRI data based on Im…

    Submitted 29 February, 2024; v1 submitted 30 November, 2023; originally announced December 2023.

  21. arXiv:2311.18214  [pdf, other]

    astro-ph.IM astro-ph.GA astro-ph.SR cs.CV physics.optics

    Perception of Misalignment States for Sky Survey Telescopes with the Digital Twin and the Deep Neural Networks

    Authors: Miao Zhang, Peng Jia, Zhengyang Li, Wennan Xiang, Jiameng Lv, Rui Sun

    Abstract: Sky survey telescopes play a critical role in modern astronomy, but misalignment of their optical elements can introduce significant variations in point spread functions, leading to reduced data quality. To address this, we need a method to obtain misalignment states, aiding in the reconstruction of accurate point spread functions for data processing methods or facilitating adjustments of optical…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: The aforementioned submission has been accepted by Optics Express. We kindly request any feedback or comments to be directed to the corresponding author, Peng Jia, or the second corresponding author, Zhengyang Li. Please note that Zhengyang is currently stationed in the South Antarctica and will not be available until after February 1st, 2024.

  22. arXiv:2311.03054  [pdf, other]

    cs.CV

    AnyText: Multilingual Visual Text Generation And Editing

    Authors: Yuxiang Tuo, Wangmeng Xiang, Jun-Yan He, Yifeng Geng, Xuansong Xie

    Abstract: Diffusion-model-based text-to-image generation has made impressive progress recently. Although current technology for synthesizing images is highly advanced and capable of generating images with high fidelity, it is still possible to give the show away when focusing on the text area in the generated image. To address this issue, we introduce AnyText, a diffusion-based multilingual visual text generat…

    Submitted 21 February, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

  23. arXiv:2310.18332  [pdf, other]

    cs.CL cs.AI cs.CV cs.GR

    WordArt Designer: User-Driven Artistic Typography Synthesis using Large Language Models

    Authors: Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Jingdong Sun, Wangmeng Xiang, Xianhui Lin, Xiaoyang Kang, Zengke Jin, Yusen Hu, Bin Luo, Yifeng Geng, Xuansong Xie, Jingren Zhou

    Abstract: This paper introduces WordArt Designer, a user-driven framework for artistic typography synthesis, relying on the Large Language Model (LLM). The system incorporates four key modules: the LLM Engine, SemTypo, StyTypo, and TexTypo modules. 1) The LLM Engine, empowered by the LLM (e.g., GPT-3.5), interprets user inputs and generates actionable prompts for the other modules, thereby transforming abst…

    Submitted 26 November, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP 2023, 10 pages, 11 figures, 1 table, the system is at https://www.modelscope.cn/studios/WordArt/WordArt

  24. arXiv:2310.17401  [pdf, other]

    cs.IT eess.SP

    Energy Efficient Robust Beamforming for Vehicular ISAC with Imperfect Channel Estimation

    Authors: Hanwen Zhang, Haijian Sun, Tianyi He, Weiming Xiang, Rose Qingyang Hu

    Abstract: This paper investigates robust beamforming for system-centric energy efficiency (EE) optimization in the vehicular integrated sensing and communication (ISAC) system, where the mobility of vehicles poses significant challenges to channel estimation. To obtain the optimal beamforming under channel uncertainty, we first formulate an optimization problem for maximizing the system EE under bounded cha…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Submitted to IEEE for future publication

  25. arXiv:2310.07182  [pdf, other]

    cs.GR

    Generate Coherent Rays Directly

    Authors: Fengqi Liu, Zaonan Tan, Weilai Xiang, Chenhao Lu, Dan Li, Xu Gong, Yulong Shi, Songnan Shi, Qilong Kou, Bo Hu

    Abstract: The path tracing method generates incoherent rays by randomly sampling directions. This randomness makes it unsuitable for modern processor architectures that rely on coherence to achieve optimal performance. Many efforts have been made to address this issue by reordering rays based on their origin, end, or direction to enhance coherence. However, a drawback of reordering methods is the need to en…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 8 pages

  26. arXiv:2310.04180  [pdf, other]

    cs.CV

    Degradation-Aware Self-Attention Based Transformer for Blind Image Super-Resolution

    Authors: Qingguo Liu, Pan Gao, Kang Han, Ningzhong Liu, Wei Xiang

    Abstract: Compared to CNN-based methods, Transformer-based methods achieve impressive image restoration outcomes due to their abilities to model remote dependencies. However, how to apply Transformer-based methods to the field of blind super-resolution (SR) and further make an SR network adaptive to degradation information is still an open problem. In this paper, we propose a new degradation-aware self-atte…

    Submitted 6 October, 2023; originally announced October 2023.

    Comments: 12 pages

  27. arXiv:2309.07561  [pdf, other]

    cs.CL

    Adaptive Prompt Learning with Distilled Connective Knowledge for Implicit Discourse Relation Recognition

    Authors: Bang Wang, Zhenglin Wang, Wei Xiang, Yijun Mo

    Abstract: Implicit discourse relation recognition (IDRR) aims at recognizing the discourse relation between two text segments without an explicit connective. Recently, prompt learning has been applied to the IDRR task with great performance improvements over various neural network-based approaches. However, the discrete nature of the state-of-the-art prompting approach requires manual design of tem…

    Submitted 14 September, 2023; originally announced September 2023.

  28. arXiv:2309.01365  [pdf, other]

    cs.CV cs.AI

    Refined Temporal Pyramidal Compression-and-Amplification Transformer for 3D Human Pose Estimation

    Authors: Hanbing Liu, Wangmeng Xiang, Jun-Yan He, Zhi-Qi Cheng, Bin Luo, Yifeng Geng, Xuansong Xie

    Abstract: Accurately estimating the 3D pose of humans in video sequences requires both accuracy and a well-structured architecture. With the success of transformers, we introduce the Refined Temporal Pyramidal Compression-and-Amplification (RTPCA) transformer. Exploiting the temporal dimension, RTPCA extends intra-block temporal modeling via its Temporal Pyramidal Compression-and-Amplification (TPCA) struct…

    Submitted 4 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: 11 pages, 5 figures

  29. arXiv:2308.09678  [pdf, other]

    cs.CV cs.AI cs.MM cs.RO

    PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation

    Authors: Hanbing Liu, Jun-Yan He, Zhi-Qi Cheng, Wangmeng Xiang, Qize Yang, Wenhao Chai, Gaoang Wang, Xu Bao, Bin Luo, Yifeng Geng, Xuansong Xie

    Abstract: Existing 3D human pose estimators face challenges in adapting to new datasets due to the lack of 2D-3D pose pairs in training sets. To overcome this issue, we propose the Multi-Hypothesis Pose Synthesis Domain Adaptation (PoSynDA) framework to bridge this data disparity gap in the target domain. Typically, PoSynDA uses a diffusion-inspired structure to…

    Submitted 16 October, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: Accepted to ACM Multimedia 2023; 10 pages, 4 figures, 8 tables; the code is at https://github.com/hbing-l/PoSynDA

  30. arXiv:2308.03262  [pdf, other]

    cs.CV

    A Benchmark for Chinese-English Scene Text Image Super-resolution

    Authors: Jianqi Ma, Zhetong Liang, Wangmeng Xiang, Xi Yang, Lei Zhang

    Abstract: Scene Text Image Super-resolution (STISR) aims to recover high-resolution (HR) scene text images with visually pleasant and readable text content from the given low-resolution (LR) input. Most existing works focus on recovering English texts, which have relatively simple character structures, while little work has been done on the more challenging Chinese texts with diverse and complex character s…

    Submitted 6 August, 2023; originally announced August 2023.

    Comments: Accepted by ICCV 2023

  31. arXiv:2308.01251  [pdf, other]

    cs.CV

    Hyper-pixel-wise Contrastive Learning Augmented Segmentation Network for Old Landslide Detection through Fusing High-Resolution Remote Sensing Images and Digital Elevation Model Data

    Authors: Yiming Zhou, Yuexing Peng, Wei Li, Junchuan Yu, Daqing Ge, Wei Xiang

    Abstract: As natural disasters, landslides often cause tremendous loss of human life, so reliable detection of landslide risks is urgently needed. When detecting old landslides that present important information for landslide risk warning, problems such as visual blur and small-sized datasets cause great challenges when using remote sensing data. To extract accurate semantic features, a hyper-pixel-wise…

    Submitted 6 October, 2023; v1 submitted 2 August, 2023; originally announced August 2023.

  32. arXiv:2307.12327  [pdf, other]

    eess.IV cs.CV

    End-to-end Hyperspectral Image Change Detection Network Based on Band Selection

    Authors: Qingren Yao, Yuan Zhou, Chang Tang, Wei Xiang

    Abstract: For hyperspectral image change detection (HSI-CD), one key challenge is to reduce band redundancy, as only a few bands are crucial for change detection while other bands may be adverse to it. However, most existing HSI-CD methods directly extract change features from full-dimensional HSIs, suffering from a degradation of feature discrimination. To address this issue, we propose an end-to-end hypers…

    Submitted 16 November, 2023; v1 submitted 23 July, 2023; originally announced July 2023.

  33. arXiv:2307.09977  [pdf, ps, other]

    cs.LG cs.NI

    Learner Referral for Cost-Effective Federated Learning Over Hierarchical IoT Networks

    Authors: Yulan Gao, Ziqiang Ye, Yue Xiao, Wei Xiang

    Abstract: The paradigm of federated learning (FL) to address data privacy concerns by locally training parameters on resource-constrained clients in a distributed manner has garnered significant attention. Nonetheless, FL is not applicable when not all clients within the coverage of the FL server are registered with the FL network. To bridge this gap, this paper proposes joint learner referral aided federat…

    Submitted 19 July, 2023; originally announced July 2023.

  34. arXiv:2307.09813  [pdf, other]

    cs.CL

    DAPrompt: Deterministic Assumption Prompt Learning for Event Causality Identification

    Authors: Wei Xiang, Chuanhong Zhan, Bang Wang

    Abstract: Event Causality Identification (ECI) aims at determining whether there is a causal relation between two event mentions. Conventional prompt learning designs a prompt template to first predict an answer word and then maps it to the final decision. Unlike conventional prompts, we argue that predicting an answer word may not be a necessary prerequisite for the ECI task. Instead, we can first make a d…

    Submitted 19 July, 2023; originally announced July 2023.

  35. Correlation-aware Spatial-Temporal Graph Learning for Multivariate Time-series Anomaly Detection

    Authors: Yu Zheng, Huan Yee Koh, Ming Jin, Lianhua Chi, Khoa T. Phan, Shirui Pan, Yi-Ping Phoebe Chen, Wei Xiang

    Abstract: Multivariate time-series anomaly detection is critically important in many applications, including retail, transportation, power grid, and water treatment plants. Existing approaches for this problem mostly employ either statistical models which cannot capture the non-linear relations well or conventional deep learning models (e.g., CNN and LSTM) that do not explicitly learn the pairwise correlati…

    Submitted 16 November, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: 17 pages, double columns, 10 tables, 3 figures. Accepted to IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

  36. arXiv:2306.09551  [pdf, other]

    cs.CV

    Edit-DiffNeRF: Editing 3D Neural Radiance Fields using 2D Diffusion Model

    Authors: Lu Yu, Wei Xiang, Kang Han

    Abstract: Recent research has demonstrated that the combination of pretrained diffusion models with neural radiance fields (NeRFs) has emerged as a promising approach for text-to-3D generation. Simply coupling NeRF with diffusion models will result in cross-view inconsistency and degradation of stylized view syntheses. To address this challenge, we propose the Edit-DiffNeRF framework, which is composed of a…

    Submitted 15 June, 2023; originally announced June 2023.

  37. arXiv:2305.17916  [pdf, other]

    cs.CV

    Volume Feature Rendering for Fast Neural Radiance Field Reconstruction

    Authors: Kang Han, Wei Xiang, Lu Yu

    Abstract: Neural radiance fields (NeRFs) are able to synthesize realistic novel views from multi-view images captured from distinct positions and perspectives. In NeRF's rendering pipeline, neural networks are used to represent a scene independently or transform the queried learnable feature vector of a point to the expected color or density. With the aid of geometry guides either in occupancy grids or proposal…

    Submitted 31 May, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

  38. arXiv:2305.16652  [pdf, other]

    cs.GR

    Faster Ray Tracing through Hierarchy Cut Code

    Authors: WeiLai Xiang, FengQi Liu, Dan Li, ZhaoNan Tan, PengZhan Xu, MeiZhi Liu, QiLong Kou

    Abstract: We propose a novel ray reordering technique to accelerate the ray tracing process by encoding and sorting rays prior to traversal. Instead of spatial coordinates, our method encodes rays according to the cuts of the hierarchical acceleration structure, which is called the hierarchy cut code. This approach can better adapt to the acceleration structure and obtain a more reliable encoding result. We…

    Submitted 19 July, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

  39. arXiv:2305.16437  [pdf, other]

    cs.CV cs.AI cs.MM

    KeyPosS: Plug-and-Play Facial Landmark Detection through GPS-Inspired True-Range Multilateration

    Authors: Xu Bao, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Wangmeng Xiang, Jingdong Sun, Hanbing Liu, Wei Liu, Bin Luo, Yifeng Geng, Xuansong Xie

    Abstract: Accurate facial landmark detection is critical for facial analysis tasks, yet prevailing heatmap and coordinate regression methods grapple with prohibitive computational costs and quantization errors. Through comprehensive theoretical analysis and experimentation, we identify and elucidate the limitations of existing techniques. To overcome these challenges, we pioneer the application of True-Rang…

    Submitted 23 September, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: Accepted to ACM Multimedia 2023; 10 pages, 7 figures, 6 tables; the code is at https://github.com/zhiqic/KeyPosS

  40. arXiv:2305.10866  [pdf, other]

    cs.CL

    TEPrompt: Task Enlightenment Prompt Learning for Implicit Discourse Relation Recognition

    Authors: Wei Xiang, Chao Liang, Bang Wang

    Abstract: Implicit Discourse Relation Recognition (IDRR) aims at classifying the relation sense between two arguments without an explicit connective. Recently, ConnPrompt \cite{Wei.X:et.al:2022:COLING} has leveraged powerful prompt learning for IDRR based on the fusion of multi-prompt decisions from three different yet very similar connective prediction templates. Instead of multi-prompt ensembling,…

    Submitted 18 May, 2023; originally announced May 2023.

  41. arXiv:2304.13812  [pdf, other]

    cs.LG cs.AI cs.NE

    Guaranteed Quantization Error Computation for Neural Network Model Compression

    Authors: Wesley Cooke, Zihao Mo, Weiming Xiang

    Abstract: Neural network model compression techniques can address the computation issue of deep neural networks on embedded devices in industrial systems. The guaranteed output error computation problem for neural network compression with quantization is addressed in this paper. A merged neural network is built from a feedforward neural network and its quantized version to produce the exact output differenc…

    Submitted 26 April, 2023; originally announced April 2023.

  42. arXiv:2304.13811  [pdf, other]

    eess.SY cs.LG math.DS

    A Data-Driven Hybrid Automaton Framework to Modeling Complex Dynamical Systems

    Authors: Yejiang Yang, Zihao Mo, Weiming Xiang

    Abstract: In this paper, a computationally efficient data-driven hybrid automaton model is proposed to capture unknown complex dynamical system behaviors using multiple neural networks. The sampled data of the system is divided by valid partitions into groups corresponding to their topologies and based on which, transition guards are defined. Then, a collection of small-scale neural networks that are comput…

    Submitted 26 April, 2023; originally announced April 2023.

  43. arXiv:2303.17144  [pdf, other]

    cs.CV cs.AI cs.MM cs.RO

    DAMO-StreamNet: Optimizing Streaming Perception in Autonomous Driving

    Authors: Jun-Yan He, Zhi-Qi Cheng, Chenyang Li, Wangmeng Xiang, Binghui Chen, Bin Luo, Yifeng Geng, Xuansong Xie

    Abstract: Real-time perception, or streaming perception, is a crucial aspect of autonomous driving that has yet to be thoroughly explored in existing research. To address this gap, we present DAMO-StreamNet, an optimized framework that combines recent advances from the YOLO series with a comprehensive analysis of spatial and temporal perception mechanisms, delivering a cutting-edge solution. The key innovat…

    Submitted 20 May, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: Accepted to IJCAI 2023; 9 pages, 4 figures, 6 tables; the code is at https://github.com/zhiqic/DAMO-StreamNet

    Journal ref: In the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023)

  44. arXiv:2303.14395  [pdf, other]

    cs.CV

    MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos

    Authors: Minghan Li, Shuai Li, Wangmeng Xiang, Lei Zhang

    Abstract: While impressive progress has been achieved, video instance segmentation (VIS) methods with per-clip input often fail on challenging videos with occluded objects and crowded scenes. This is mainly because instance queries in these methods cannot encode well the discriminative embeddings of instances, making the query-based segmenter difficult to distinguish those 'hard' instances. To address these…

    Submitted 25 March, 2023; originally announced March 2023.

    Journal ref: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023

  45. arXiv:2303.09769  [pdf, other]

    cs.CV cs.LG

    Denoising Diffusion Autoencoders are Unified Self-supervised Learners

    Authors: Weilai Xiang, Hongyu Yang, Di Huang, Yunhong Wang

    Abstract: Inspired by recent advances in diffusion models, which are reminiscent of denoising autoencoders, we investigate whether they can acquire discriminative representations for classification via generative pre-training. This paper shows that the networks in diffusion models, namely denoising diffusion autoencoders (DDAE), are unified self-supervised learners: by pre-training on unconditional image ge…

    Submitted 19 August, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: ICCV 2023 Oral

  46. arXiv:2303.08525  [pdf, other]

    cs.CV eess.IV

    MRGAN360: Multi-stage Recurrent Generative Adversarial Network for 360 Degree Image Saliency Prediction

    Authors: Pan Gao, Xinlang Chen, Rong Quan, Wei Xiang

    Abstract: Thanks to the ability of providing an immersive and interactive experience, the uptake of 360 degree image content has been rapidly growing in consumer and industrial applications. Compared to planar 2D images, saliency prediction for 360 degree images is more challenging due to their high resolutions and spherical viewing ranges. Currently, most high-performance saliency prediction models for omn…

    Submitted 15 March, 2023; originally announced March 2023.

  47. arXiv:2303.04935  [pdf, other]

    cs.CV

    X-Pruner: eXplainable Pruning for Vision Transformers

    Authors: Lu Yu, Wei Xiang

    Abstract: Recently, vision transformer models have become prominent models for a range of tasks. These models, however, usually suffer from intensive computational costs and heavy memory requirements, making them impractical for deployment on edge platforms. Recent studies have proposed to prune transformers in an unexplainable manner, which overlooks the relationship between internal units of the model and t…

    Submitted 5 June, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

  48. arXiv:2303.03808  [pdf, other]

    cs.CV

    Multiscale Tensor Decomposition and Rendering Equation Encoding for View Synthesis

    Authors: Kang Han, Wei Xiang

    Abstract: Rendering novel views from captured multi-view images has made considerable progress since the emergence of the neural radiance field. This paper aims to further advance the quality of view synthesis by proposing a novel approach dubbed the neural radiance feature field (NRFF). We first propose a multiscale tensor decomposition scheme to organize learnable features so as to represent scenes from c…

    Submitted 27 May, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

  49. arXiv:2302.14302  [pdf, other]

    cs.CV

    Improving Model Generalization by On-manifold Adversarial Augmentation in the Frequency Domain

    Authors: Chang Liu, Wenzhao Xiang, Yuan He, Hui Xue, Shibao Zheng, Hang Su

    Abstract: Deep neural networks (DNNs) may suffer from significantly degenerated performance when the training and test data are of different underlying distributions. Despite the importance of model generalization to out-of-distribution (OOD) data, the accuracy of state-of-the-art (SOTA) models on OOD data can plummet. Recent work has demonstrated that regular or off-manifold adversarial examples, as a spec…

    Submitted 8 June, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

  50. arXiv:2302.14301  [pdf, other]

    cs.CV

    A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking

    Authors: Chang Liu, Yinpeng Dong, Wenzhao Xiang, Xiao Yang, Hang Su, Jun Zhu, Yuefeng Chen, Yuan He, Hui Xue, Shibao Zheng

    Abstract: The robustness of deep neural networks is usually lacking under adversarial examples, common corruptions, and distribution shifts, which becomes an important research problem in the development of deep learning. Although new deep learning methods and robustness improvement techniques have been constantly proposed, the robustness evaluations of existing methods are often inadequate due to their rap…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: International Journal of Computer Vision (IJCV) [under review]