Showing 1–50 of 178 results for author: Xiao, P

  1. arXiv:2406.15764  [pdf, other]

    cs.CV

    TP-DRSeg: Improving Diabetic Retinopathy Lesion Segmentation with Explicit Text-Prompts Assisted SAM

    Authors: Wenxue Li, Xinyu Xiong, Peng Xia, Lie Ju, Zongyuan Ge

    Abstract: Recent advances in large foundation models, such as the Segment Anything Model (SAM), have demonstrated considerable promise across various tasks. Despite their progress, these models still encounter challenges in specialized medical image analysis, especially in recognizing subtle inter-class differences in Diabetic Retinopathy (DR) lesion segmentation. In this paper, we propose a novel framework…

    Submitted 22 June, 2024; originally announced June 2024.

  2. arXiv:2406.14739  [pdf, other]

    cs.CL

    Learning to Retrieve Iteratively for In-Context Learning

    Authors: Yunmo Chen, Tongfei Chen, Harsh Jhamtani, Patrick Xia, Richard Shin, Jason Eisner, Benjamin Van Durme

    Abstract: We introduce iterative retrieval, a novel framework that empowers retrievers to make iterative decisions through policy optimization. Finding an optimal portfolio of retrieved items is a combinatorial optimization problem, generally considered NP-hard. This approach provides a learned approximation to such a solution, meeting specific task requirements under a given family of large language models…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2406.12463  [pdf, other]

    cs.CV eess.IV

    LFMamba: Light Field Image Super-Resolution with State Space Model

    Authors: Wang Xia, Yao Lu, Shunzhou Wang, Ziqi Wang, Peiqi Xia, Tianfei Zhou

    Abstract: Recent years have witnessed significant advancements in light field image super-resolution (LFSR) owing to the progress of modern neural networks. However, these methods often face challenges in capturing long-range dependencies (CNN-based) or encounter quadratic computational complexities (Transformer-based), which limit their performance. Recently, the State Space Model (SSM) with selective scan…

    Submitted 18 June, 2024; originally announced June 2024.

  4. arXiv:2406.07471  [pdf, other]

    cs.CV

    OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding

    Authors: Ming Hu, Peng Xia, Lin Wang, Siyuan Yan, Feilong Tang, Zhongxing Xu, Yimin Luo, Kaimin Song, Jurgen Leitner, Xuelian Cheng, Jun Cheng, Chi Liu, Kai**g Zhou, Zongyuan Ge

    Abstract: Surgical scene perception via videos is critical for advancing robotic surgery, telesurgery, and AI-assisted surgery, particularly in ophthalmology. However, the scarcity of diverse and richly annotated video datasets has hindered the development of intelligent systems for surgical workflow analysis. Existing datasets for surgical workflow analysis, which typically face challenges such as small s…

    Submitted 13 June, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: Version 1

  5. arXiv:2406.06603  [pdf, other]

    cs.LG cs.AI

    FPN-fusion: Enhanced Linear Complexity Time Series Forecasting Model

    Authors: Chu Li, **jia Xiao, Qi** Yuan

    Abstract: This study presents a novel time series prediction model, FPN-fusion, designed with linear computational complexity, demonstrating superior predictive performance compared to DLiner without increasing parameter count or computational demands. Our model introduces two key innovations: first, a Feature Pyramid Network (FPN) is employed to effectively capture time series data characteristics, bypassi…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: FPN, time series, fusion. arXiv admin note: text overlap with arXiv:2401.03001 by other authors

  6. arXiv:2406.06384  [pdf, other]

    cs.CV

    Generalizing to Unseen Domains in Diabetic Retinopathy with Disentangled Representations

    Authors: Peng Xia, Ming Hu, Feilong Tang, Wenxue Li, Wenhao Zheng, Lie Ju, Peibo Duan, Huaxiu Yao, Zongyuan Ge

    Abstract: Diabetic Retinopathy (DR), induced by diabetes, poses a significant risk of visual impairment. Accurate and effective grading of DR aids in the treatment of this condition. Yet existing models experience notable performance degradation on unseen domains due to domain shifts. Previous methods address this issue by simulating domain style through simple visual transformation and mitigating domain no…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Early Accepted by MICCAI 2024

  7. arXiv:2406.06007  [pdf, other]

    cs.LG cs.CL cs.CV cs.CY

    CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

    Authors: Peng Xia, Ze Chen, Juanxi Tian, Yangrui Gong, Ruibo Hou, Yue Xu, Zhenbang Wu, Zhiyuan Fan, Yiyang Zhou, Kangyu Zhu, Wenhao Zheng, Zhaoyang Wang, Xiao Wang, Xuchao Zhang, Chetan Bansal, Marc Niethammer, Junzhou Huang, Hongtu Zhu, Yun Li, Jimeng Sun, Zongyuan Ge, Gang Li, James Zou, Huaxiu Yao

    Abstract: Artificial intelligence has significantly impacted medical applications, particularly with the advent of Medical Large Vision Language Models (Med-LVLMs), sparking optimism for the future of automated and personalized healthcare. However, the trustworthiness of Med-LVLMs remains unverified, posing significant risks for future model deployment. In this paper, we introduce CARES and aim to comprehen…

    Submitted 10 June, 2024; originally announced June 2024.

  8. arXiv:2405.19440  [pdf, other]

    cs.LG math.OC stat.ML

    On the Convergence of Multi-objective Optimization under Generalized Smoothness

    Authors: Qi Zhang, Peiyao Xiao, Kaiyi Ji, Shaofeng Zou

    Abstract: Multi-objective optimization (MOO) is receiving more attention in various fields such as multi-task learning. Recent works provide some effective algorithms with theoretical analysis but they are limited by the standard $L$-smooth or bounded-gradient assumptions, which are typically unsatisfactory for neural networks, such as recurrent neural networks (RNNs) and transformers. In this paper, we stu…

    Submitted 1 July, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  9. arXiv:2405.16077  [pdf, ps, other]

    cs.LG

    Finite-Time Analysis for Conflict-Avoidant Multi-Task Reinforcement Learning

    Authors: Yudan Wang, Peiyao Xiao, Hao Ban, Kaiyi Ji, Shaofeng Zou

    Abstract: Multi-task reinforcement learning (MTRL) has shown great promise in many real-world applications. Existing MTRL algorithms often aim to learn a policy that optimizes individual objective functions simultaneously with a given prior preference (or weights) on different tasks. However, these methods often suffer from the issue of \textit{gradient conflict} such that the tasks with larger gradients do…

    Submitted 10 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: Initial submission at the 41$^{st}$ International Conference on Machine Learning

  10. arXiv:2405.11289  [pdf, other]

    eess.IV cs.CV

    Diffusion Model Driven Test-Time Image Adaptation for Robust Skin Lesion Classification

    Authors: Ming Hu, Siyuan Yan, Peng Xia, Feilong Tang, Wenxue Li, Peibo Duan, Lin Zhang, Zongyuan Ge

    Abstract: Deep learning-based diagnostic systems have demonstrated potential in skin disease diagnosis. However, their performance can easily degrade on test domains due to distribution shifts caused by input-level corruptions, such as imaging equipment variability, brightness changes, and image blur. This will reduce the reliability of model deployment in real-world scenarios. Most existing solutions focus…

    Submitted 18 May, 2024; originally announced May 2024.

  11. arXiv:2405.04332  [pdf, other]

    cs.CR

    WALLETRADAR: Towards Automating the Detection of Vulnerabilities in Browser-based Cryptocurrency Wallets

    Authors: Pengcheng Xia, Yanhui Guo, Zhaowen Lin, Jun Wu, Pengbo Duan, Ningyu He, Kailong Wang, Tianming Liu, Yinliang Yue, Guoai Xu, Haoyu Wang

    Abstract: Cryptocurrency wallets, acting as fundamental infrastructure to the blockchain ecosystem, have seen significant user growth, particularly among browser-based wallets (i.e., browser extensions). However, this expansion accompanies security challenges, making these wallets prime targets for malicious activities. Despite a substantial user base, there is not only a significant gap in comprehensive se…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Just accepted by the Automated Software Engineering Journal

  12. arXiv:2404.16484  [pdf, other]

    cs.CV eess.IV

    Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey

    Authors: Marcos V. Conde, Zhijun Lei, Wen Li, Cosmin Stejerean, Ioannis Katsavounidis, Radu Timofte, Kihwan Yoon, Ganzorig Gankhuyag, Jiangtao Lv, Long Sun, **shan Pan, Jiangxin Dong, **hui Tang, Zhiyuan Li, Hao Wei, Chenyang Ge, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi **, Menghan Zhou, Yiqiang Yan, Si Gao, Biao Wu, Shaoli Liu, et al. (50 additional authors not shown)

    Abstract: This paper introduces a novel benchmark as part of the AIS 2024 Real-Time Image Super-Resolution (RTSR) Challenge, which aims to upscale compressed images from 540p to 4K resolution (4x factor) in real-time on commercial GPUs. For this, we use a diverse test set containing a variety of 4K images ranging from digital art to gaming and photography. The images are compressed using the modern AVIF cod…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: CVPR 2024, AI for Streaming (AIS) Workshop

  13. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  14. arXiv:2404.02668  [pdf, other]

    cs.CV

    RS-Mamba for Large Remote Sensing Image Dense Prediction

    Authors: Sijie Zhao, Hao Chen, Xueliang Zhang, Pengfeng Xiao, Lei Bai, Wanli Ouyang

    Abstract: Context modeling is critical for remote sensing image dense prediction tasks. Nowadays, the growing size of very-high-resolution (VHR) remote sensing images poses challenges in effectively modeling context. While transformer-based models possess global modeling capabilities, they encounter computational challenges when applied to large VHR images due to their quadratic complexity. The conventional…

    Submitted 10 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

    Comments: 15 pages, 8 figures

  15. arXiv:2404.01925  [pdf, other]

    cs.CV cs.AI

    Improving Bird's Eye View Semantic Segmentation by Task Decomposition

    Authors: Tianhao Zhao, Yongcan Chen, Yu Wu, Tianyang Liu, Bo Du, Peilun Xiao, Shi Qiu, Hongda Yang, Guozhen Li, Yi Yang, Yutian Lin

    Abstract: Semantic segmentation in bird's eye view (BEV) plays a crucial role in autonomous driving. Previous methods usually follow an end-to-end pipeline, directly predicting the BEV segmentation map from monocular RGB inputs. However, a challenge arises when the RGB inputs and BEV targets come from distinct perspectives, making the direct point-to-point prediction hard to optimize. In this paper, we decompo…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  16. arXiv:2403.17256  [pdf, other]

    cs.IT eess.SP

    Latency-Aware Generative Semantic Communications with Pre-Trained Diffusion Models

    Authors: Li Qiao, Mahdi Boloursaz Mashhadi, Zhen Gao, Chuan Heng Foh, Pei Xiao, Mehdi Bennis

    Abstract: Generative foundation AI models have recently shown great success in synthesizing natural signals with high perceptual quality using only textual prompts and conditioning signals to guide the generation process. This enables semantic communications at extremely low data rates in future wireless networks. In this paper, we develop a latency-aware semantic communications framework with pre-trained g…

    Submitted 25 March, 2024; originally announced March 2024.

  17. arXiv:2403.16826  [pdf, ps, other]

    cs.IT

    A Progressive Codebook Optimization Scheme for Sparse Code Multiple Access in Downlink Channels

    Authors: Tuofeng Lei, Qu Luo, Shuyan Ni, Shimiao Chen, Xin Song, Pei Xiao

    Abstract: Sparse code multiple access (SCMA) is a promising technique for enabling massive connectivity and high spectrum efficiency in future machine-type communication networks. However, its performance crucially depends on well-designed multi-dimensional codebooks. In this paper, we propose a novel progressive codebook optimization scheme that can achieve near-optimal performance over downlink fading cha…

    Submitted 4 April, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  18. arXiv:2402.02544  [pdf, other]

    cs.CV cs.AI cs.LG

    LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model

    Authors: Dilxat Muhtar, Zhenshi Li, Feng Gu, Xueliang Zhang, Pengfeng Xiao

    Abstract: The revolutionary capabilities of large language models (LLMs) have paved the way for multimodal large language models (MLLMs) and fostered diverse applications across various specialized domains. In the remote sensing (RS) field, however, the diverse geographical landscapes and varied objects in RS imagery are not adequately considered in recent MLLM endeavors. To bridge this gap, we construct a…

    Submitted 18 March, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: 36 pages, 10 figures. Github https://github.com/NJU-LHRS/LHRS-Bot

  19. arXiv:2401.09127  [pdf, other]

    cs.IT eess.SP

    AI Empowered Channel Semantic Acquisition for 6G Integrated Sensing and Communication Networks

    Authors: Yifei Zhang, Zhen Gao, **g**g Zhao, Ziming He, Yunsheng Zhang, Chen Lu, Pei Xiao

    Abstract: Motivated by the need for increased spectral efficiency and the proliferation of intelligent applications, the sixth-generation (6G) mobile network is anticipated to integrate the dual-functions of communication and sensing (C&S). Although the millimeter wave (mmWave) communication and mmWave radar share similar multiple-input multiple-output (MIMO) architecture for integration, the full potential…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: 9 pages, 5 figures, accepted by the IEEE journal

  20. arXiv:2401.04662  [pdf, other]

    cs.CR

    The Devil Behind the Mirror: Tracking the Campaigns of Cryptocurrency Abuses on the Dark Web

    Authors: Pengcheng Xia, Zhou Yu, Kailong Wang, Kai Ma, Shuo Chen, Xiapu Luo, Ya** Zhou, Lei Wu, Guangdong Bai

    Abstract: The dark web has emerged as the state-of-the-art solution for enhanced anonymity. Just like a double-edged sword, it also inadvertently becomes the safety net and breeding ground for illicit activities. Among them, cryptocurrencies have been prevalently abused to receive illicit income while evading regulations. Despite the continuing efforts to combat illicit activities, there is still a lack of…

    Submitted 7 April, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

  21. arXiv:2401.01140  [pdf, ps, other]

    cs.IT cs.DC

    Joint Offloading and Resource Allocation for Hybrid Cloud and Edge Computing in SAGINs: A Decision Assisted Hybrid Action Space Deep Reinforcement Learning Approach

    Authors: Chong Huang, Gaojie Chen, Pei Xiao, Yue Xiao, Zhu Han, Jonathon A. Chambers

    Abstract: In recent years, the amalgamation of satellite communications and aerial platforms into space-air-ground integrated networks (SAGINs) has emerged as an indispensable area of research for future communications due to the global coverage capacity of low Earth orbit (LEO) satellites and the flexible deployment of aerial platforms. This paper presents a deep reinforcement learning (DRL)-based approach…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: 15 pages, accepted for publication in IEEE Journal on Selected Areas in Communications

  22. arXiv:2312.15653  [pdf, other]

    cs.IT eess.SP

    Index Modulation for Fluid Antenna-Assisted MIMO Communications: System Design and Performance Analysis

    Authors: **g Zhu, Gaojie Chen, Pengyu Gao, Pei Xiao, Zihuai Lin, Atta Quddus

    Abstract: In this paper, we propose a transmission mechanism for fluid antennas (FAs) enabled multiple-input multiple-output (MIMO) communication systems based on index modulation (IM), named FA-IM, which incorporates the principle of IM into FAs-assisted MIMO system to improve the spectral efficiency (SE) without increasing the hardware complexity. In FA-IM, the information bits are mapped not only to the…

    Submitted 25 December, 2023; originally announced December 2023.

    Comments: 12 pages, 9 figures, publish to TWC

  23. arXiv:2312.11302  [pdf, other]

    cs.IT eess.SP

    AFDM-SCMA: A Promising Waveform for Massive Connectivity over High Mobility Channels

    Authors: Qu Luo, Pei Xiao, Zilong Liu, Ziwei Wan, Thomos Nikolaos, Zhen Gao, Ziming He

    Abstract: This paper studies the affine frequency division multiplexing (AFDM)-empowered sparse code multiple access (SCMA) system, referred to as AFDM-SCMA, for supporting massive connectivity in high-mobility environments. First, by placing the sparse codewords on the AFDM chirp subcarriers, the input-output (I/O) relation of AFDM-SCMA systems is presented. Next, we delve into the generalized receiver des…

    Submitted 11 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  24. arXiv:2312.03807  [pdf, other]

    math.OC cs.LG stat.ML

    Achieving ${O}(ε^{-1.5})$ Complexity in Hessian/Jacobian-free Stochastic Bilevel Optimization

    Authors: Yifan Yang, Peiyao Xiao, Kaiyi Ji

    Abstract: In this paper, we revisit the bilevel optimization problem, in which the upper-level objective function is generally nonconvex and the lower-level objective function is strongly convex. Although this type of problem has been studied extensively, it still remains an open question how to achieve an ${O}(ε^{-1.5})$ sample complexity in Hessian/Jacobian-free stochastic bilevel optimization without any…

    Submitted 20 December, 2023; v1 submitted 6 December, 2023; originally announced December 2023.

  25. arXiv:2312.01126  [pdf, other]

    cs.IT eess.SP

    BER Analysis of SCMA-OFDM Systems in the Presence of Carrier Frequency Offset

    Authors: Haibo Liu, Qu Luo, Zilong Liu, Shan Luo, Pei Xiao, Rong** Lin

    Abstract: Sparse code multiple access (SCMA) building upon orthogonal frequency division multiplexing (OFDM) is a promising wireless technology for supporting massive connectivity in future machine-type communication networks. However, the sensitivity of OFDM to carrier frequency offset (CFO) poses a major challenge because it leads to orthogonality loss and incurs intercarrier interference (ICI). In this p…

    Submitted 2 December, 2023; originally announced December 2023.

  26. arXiv:2312.01125  [pdf, other]

    cs.IT eess.SP

    Design and Performance Analysis of Index Modulation Empowered AFDM System

    Authors: **g Zhu, Qu Luo, Gaojie Chen, Pei Xiao, Lixia Xiao

    Abstract: In this letter, we incorporate index modulation (IM) into affine frequency division multiplexing (AFDM), called AFDM-IM, to enhance the bit error rate (BER) and energy efficiency (EE) performance. In this scheme, the information bits are conveyed not only by $M$-ary constellation symbols, but also by the activation of the chirp subcarriers (SCs) indices, which are determined based on the incoming…

    Submitted 2 December, 2023; originally announced December 2023.

  27. arXiv:2311.14064  [pdf, other]

    cs.CV

    HGCLIP: Exploring Vision-Language Models with Graph Representations for Hierarchical Understanding

    Authors: Peng Xia, Xingtong Yu, Ming Hu, Lie Ju, Zhiyong Wang, Peibo Duan, Zongyuan Ge

    Abstract: Object categories are typically organized into a multi-granularity taxonomic hierarchy. When classifying categories at different hierarchy levels, traditional uni-modal approaches focus primarily on image features, revealing limitations in complex scenarios. Recent studies integrating Vision-Language Models (VLMs) with class hierarchies have shown promise, yet they fall short of fully exploiting t…

    Submitted 14 March, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

  28. arXiv:2311.13957  [pdf, other]

    cs.CR cs.CL

    Efficient Trigger Word Insertion

    Authors: Yueqi Zeng, Ziqiang Li, Pengfei Xia, Lei Liu, Bin Li

    Abstract: With the boom in the natural language processing (NLP) field in recent years, backdoor attacks pose immense threats against deep neural network models. However, previous works hardly consider the effect of the poisoning rate. In this paper, our main objective is to reduce the number of poisoned samples while still achieving a satisfactory Attack Success Rate (ASR) in text backdoor attacks. To accompli…

    Submitted 23 November, 2023; originally announced November 2023.

  29. Exchanging Dual Encoder-Decoder: A New Strategy for Change Detection with Semantic Guidance and Spatial Localization

    Authors: Sijie Zhao, Xueliang Zhang, Pengfeng Xiao, Guangjun He

    Abstract: Change detection is a critical task in earth observation applications. Recently, deep learning-based methods have shown promising performance and are quickly adopted in change detection. However, the widely used multiple encoder and single decoder (MESD) as well as dual encoder-decoder (DED) architectures still struggle to handle change detection effectively. The former has problems of bitemp…

    Submitted 19 November, 2023; originally announced November 2023.

    Journal ref: IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1-16, 2023, Art no. 4508016

  30. arXiv:2311.09796  [pdf, other]

    cs.CL cs.AI

    Interpreting User Requests in the Context of Natural Language Standing Instructions

    Authors: Nikita Moghe, Patrick Xia, Jacob Andreas, Jason Eisner, Benjamin Van Durme, Harsh Jhamtani

    Abstract: Users of natural language interfaces, generally powered by Large Language Models (LLMs), often must repeat their preferences each time they make a similar request. We describe an approach to LLM-based dialogue modeling in which persistent user constraints and preferences -- collectively termed standing instructions -- serve as additional context for such interfaces. For example, when a user states "I'm h…

    Submitted 7 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Updated with results from LLaMA-2

  31. arXiv:2311.00048  [pdf, other]

    cs.CV cs.AI cs.LG

    SC-MIL: Sparsely Coded Multiple Instance Learning for Whole Slide Image Classification

    Authors: Peijie Qiu, Pan Xiao, Wenhui Zhu, Yalin Wang, Aristeidis Sotiras

    Abstract: Multiple Instance Learning (MIL) has been widely used in weakly supervised whole slide image (WSI) classification. Typical MIL methods include a feature embedding part that embeds the instances into features via a pre-trained feature extractor and the MIL aggregator that combines instance embeddings into predictions. The current focus has been directed toward improving these parts by refining the…

    Submitted 31 October, 2023; originally announced November 2023.

  32. arXiv:2310.13347  [pdf, other]

    cs.CV cs.AI

    NurViD: A Large Expert-Level Video Database for Nursing Procedure Activity Understanding

    Authors: Ming Hu, Lin Wang, Siyuan Yan, Don Ma, Qingli Ren, Peng Xia, Wei Feng, Peibo Duan, Lie Ju, Zongyuan Ge

    Abstract: The application of deep learning to nursing procedure activity understanding has the potential to greatly enhance the quality and safety of nurse-patient interactions. By utilizing the technique, we can facilitate training and education, improve quality control, and enable operational compliance monitoring. However, the development of automatic recognition systems in this field is currently hinder…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023 Datasets and Benchmarks Track

  33. arXiv:2310.09744  [pdf, other]

    cs.CR cs.CV cs.CY

    Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks

    Authors: Ziqiang Li, Pengfei Xia, Hong Sun, Yueqi Zeng, Wei Zhang, Bin Li

    Abstract: As the number of parameters in Deep Neural Networks (DNNs) scales, the thirst for training data also increases. To save costs, it has become common for users and enterprises to delegate time-consuming data collection to third parties. Unfortunately, recent research has shown that this practice raises the risk of DNNs being exposed to backdoor attacks. Specifically, an attacker can maliciously cont…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: Under Review

  34. arXiv:2309.13075  [pdf, other]

    cs.AI cs.CL cs.LG

    SCREWS: A Modular Framework for Reasoning with Revisions

    Authors: Kumar Shridhar, Harsh Jhamtani, Hao Fang, Benjamin Van Durme, Jason Eisner, Patrick Xia

    Abstract: Large language models (LLMs) can improve their accuracy on various tasks through iteratively refining and revising their output based on feedback. We observe that these revisions can introduce errors, in which case it is better to roll back to a previous result. Further, revisions are typically homogeneous: they use the same reasoning method that produced the initial answer, which may not correct…

    Submitted 20 September, 2023; originally announced September 2023.

  35. arXiv:2309.10168  [pdf, other]

    cs.CL

    Few-Shot Adaptation for Parsing Contextual Utterances with LLMs

    Authors: Kevin Lin, Patrick Xia, Hao Fang

    Abstract: We evaluate the ability of semantic parsers based on large language models (LLMs) to handle contextual utterances. In real-world settings, there typically exists only a limited number of annotated contextual utterances due to annotation cost, resulting in an imbalance compared to non-contextual utterances. Therefore, parsers must adapt to contextual utterances with a few training examples. We exam…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: Findings of IJCNLP-AACL 2023

  36. arXiv:2309.04566  [pdf, other]

    cs.IT cs.CR

    STAR-RIS-Assisted-Full-Duplex Jamming Design for Secure Wireless Communications System

    Authors: Yun Wen, Gaojie Chen, Sisai Fang, Zheng Chu, Pei Xiao, Rahim Tafazolli

    Abstract: Physical layer security (PLS) technologies are expected to play an important role in the next-generation wireless networks, by providing secure communication to protect critical and sensitive information from illegitimate devices. In this paper, we propose a novel secure communication scheme where the legitimate receiver uses full-duplex (FD) technology to transmit jamming signals with the assistan…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 12 pages, 7 figures

  37. arXiv:2309.03471  [pdf, other]

    cs.IT eess.SP

    Resource Management for IRS-assisted WP-MEC Networks with Practical Phase Shift Model

    Authors: Nana Li, Wanming Hao, Fuhui Zhou, Zheng Chu, Shouyi Yang, Pei Xiao

    Abstract: Wireless powered mobile edge computing (WP-MEC) has been recognized as a promising solution to enhance the computational capability and sustainable energy supply for low-power wireless devices (WDs). However, when the communication links between the hybrid access point (HAP) and WDs are hostile, the energy transfer efficiency and task offloading rate are compromised. To tackle this problem, we pro…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: 15 pages, 14 figures

  38. arXiv:2308.13330  [pdf, other]

    cs.IT eess.SP

    Enhancing Signal Space Diversity for SCMA Over Rayleigh Fading Channels

    Authors: Qu Luo, Zilong Liu, Gaojie Chen, Pei Xiao

    Abstract: Sparse code multiple access (SCMA) is a promising technique for enabling massive connectivity in future machine-type communication networks, but it suffers from a limited diversity order, which is a bottleneck for significant improvement of error performance. This paper aims to enhance the signal space diversity of SCMA by introducing quadrature component del…

    Submitted 25 August, 2023; originally announced August 2023.

  39. arXiv:2308.11312  [pdf, other]

    cs.AR cs.NI

    Octopus: A Heterogeneous In-network Computing Accelerator Enabling Deep Learning for network

    Authors: Dong Wen, Tao Li, Chenglong Li, Pengye Xia, Hui Yang, Zhigang Sun

    Abstract: Deep learning (DL) for network models have achieved excellent performance in the field and are becoming a promising component of future intelligent network systems. Programmable in-network computing devices have great potential to deploy DL for network models; however, existing devices cannot afford to run a DL model. The main challenges of the data plane supporting DL-based network models lie in computin…

    Submitted 22 August, 2023; originally announced August 2023.

  40. arXiv:2308.01655  [pdf, other]

    cs.CV

    DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models

    Authors: Jianxin Lin, Peng Xiao, Yijun Wang, Rongju Zhang, Xiangxiang Zeng

    Abstract: Recent data-driven image colorization methods have enabled automatic or reference-based colorization, while still suffering from unsatisfactory and inaccurate object-level color control. To address these issues, we propose a new method called DiffColor that leverages the power of pre-trained diffusion models to recover vivid colors conditioned on a prompt text, without any additional inputs. DiffC…

    Submitted 3 August, 2023; originally announced August 2023.

  41. arXiv:2307.10837  [pdf, other]

    cs.IT eess.SP

    Sensing User's Activity, Channel, and Location with Near-Field Extra-Large-Scale MIMO

    Authors: Li Qiao, Anwen Liao, Zhuoran Li, Hua Wang, Zhen Gao, Xiang Gao, Yu Su, Pei Xiao, Li You, Derrick Wing Kwan Ng

    Abstract: This paper proposes a grant-free massive access scheme based on the millimeter wave (mmWave) extra-large-scale multiple-input multiple-output (XL-MIMO) to support massive Internet-of-Things (IoT) devices with low latency, high data rate, and high localization accuracy in the upcoming sixth-generation (6G) networks. The XL-MIMO consists of multiple antenna subarrays that are widely spaced over the…

    Submitted 16 October, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: To appear in IEEE Transactions on Communications. Codes will be open to all on https://gaozhen16.github.io/ soon

  42. arXiv:2307.09729  [pdf, other]

    cs.CV cs.MM eess.IV

    NTIRE 2023 Quality Assessment of Video Enhancement Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Wei Sun, Yulun Zhang, Kai Zhang, Radu Timofte, Guangtao Zhai, Yixuan Gao, Yuqin Cao, Tengchuan Kou, Yunlong Dong, Ziheng Jia, Yilin Li, Wei Wu, Shuming Hu, Sibin Deng, Pengxiang Xiao, Ying Chen, Kai Li, Kai Zhao, Kun Yuan, Ming Sun, Heng Cong, Hao Wang, Lingzhi Fu, et al. (47 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2023. This challenge is to address a major challenge in the field of video processing, namely, video quality assessment (VQA) for enhanced videos. The challenge uses the VQA Dataset for Perceptual…

    Submitted 18 July, 2023; originally announced July 2023.

  43. arXiv:2306.08386  [pdf, other]

    cs.CR cs.CV

    Efficient Backdoor Attacks for Deep Neural Networks in Real-world Scenarios

    Authors: Ziqiang Li, Hong Sun, Pengfei Xia, Heng Li, Beihao Xia, Yi Wu, Bin Li

    Abstract: Recent deep neural networks (DNNs) have come to rely on vast amounts of training data, providing an opportunity for malicious attackers to exploit and contaminate the data to carry out backdoor attacks. However, existing backdoor attack methods make unrealistic assumptions, assuming that all training data comes from a single source and that attackers have full access to the training data. In this…

    Submitted 19 April, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: ICLR 2024

  44. arXiv:2306.08313  [pdf, other]

    cs.CR cs.CV

    A Proxy Attack-Free Strategy for Practically Improving the Poisoning Efficiency in Backdoor Attacks

    Authors: Ziqiang Li, Hong Sun, Pengfei Xia, Beihao Xia, Xue Rui, Wei Zhang, Qinglang Guo, Bin Li

    Abstract: Poisoning efficiency plays a critical role in poisoning-based backdoor attacks. To evade detection, attackers aim to use the fewest poisoning samples while achieving the desired attack strength. Although efficient triggers have significantly improved poisoning efficiency, there is still room for further enhancement. Recently, selecting efficient samples has shown promise, but it often requires a p…

    Submitted 25 April, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: Under review

  45. arXiv:2306.07669  [pdf, other]

    cs.IT

    Rate-Splitting with Hybrid Messages: DoF Analysis of the Two-User MIMO Broadcast Channel with Imperfect CSIT

    Authors: Tong Zhang, Yufan Zhuang, Gaojie Chen, Shuai Wang, Bojie Lv, Rui Wang, Pei Xiao

    Abstract: Most of the existing research on degrees-of-freedom (DoF) with imperfect channel state information at the transmitter (CSIT) assumes the messages are private, which may not reflect reality as the two receivers can request the same content. To overcome this limitation, we consider hybrid unicast and multicast messages. In particular, we characterize the optimal DoF region for the two-u…

    Submitted 1 March, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: 15 pages, double column

  46. arXiv:2305.19442  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    SimFBO: Towards Simple, Flexible and Communication-efficient Federated Bilevel Learning

    Authors: Yifan Yang, Peiyao Xiao, Kaiyi Ji

    Abstract: Federated bilevel optimization (FBO) has shown great potential recently in machine learning and edge computing due to the emerging nested optimization structure in meta-learning, fine-tuning, hyperparameter tuning, etc. However, existing FBO algorithms often involve complicated computations and require multiple sub-loops per iteration, each of which contains a number of communication rounds. In th…

    Submitted 27 December, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

  47. arXiv:2305.18617  [pdf]

    cs.CY cs.AI cs.LG

    Waiting, Banning, and Embracing: An Empirical Analysis of Adapting Policies for Generative AI in Higher Education

    Authors: ** Xiao, Yuanyuan Chen, Weining Bao

    Abstract: Generative AI tools such as ChatGPT have recently gained significant attention in higher education. This study aims to understand how universities establish policies regarding the use of AI tools and explore the factors that influence their decisions. Our study examines ChatGPT policies implemented at universities around the world, including their existence, content, and issuance dates. Specifical…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 33 pages with 2 figures

    ACM Class: H.m

  48. arXiv:2305.18409  [pdf, other]

    cs.LG math.OC stat.ML

    Direction-oriented Multi-objective Learning: Simple and Provable Stochastic Algorithms

    Authors: Peiyao Xiao, Hao Ban, Kaiyi Ji

    Abstract: Multi-objective optimization (MOO) has become an influential framework in many machine learning problems with multiple objectives such as learning with multiple criteria and multi-task learning (MTL). In this paper, we propose a new direction-oriented multi-objective problem by regularizing the common descent direction within a neighborhood of a direction that optimizes a linear combination of obj…

    Submitted 28 November, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

  49. arXiv:2305.08677  [pdf, other]

    cs.CL

    Natural Language Decomposition and Interpretation of Complex Utterances

    Authors: Harsh Jhamtani, Hao Fang, Patrick Xia, Eran Levy, Jacob Andreas, Ben Van Durme

    Abstract: Designing natural language interfaces has historically required collecting supervised data to translate user requests into carefully designed intent representations. This requires enumerating and labeling a long tail of user requests, which is challenging. At the same time, large language models (LLMs) encode knowledge about goals and plans that can help conversational assistants interpret user re…

    Submitted 8 January, 2024; v1 submitted 15 May, 2023; originally announced May 2023.

  50. arXiv:2305.08527  [pdf, other]

    cs.IT eess.SP

    Sum Secrecy Rate Maximization for IRS-aided Multi-Cluster MIMO-NOMA Terahertz Systems

    Authors: **lei Xu, Zhengyu Zhu, Zheng Chu, Hehao Niu, Pei Xiao, Inkyu Lee

    Abstract: Intelligent reflecting surface (IRS) is a promising technique to extend the network coverage and improve spectral efficiency. This paper investigates an IRS-assisted terahertz (THz) multiple-input multiple-output (MIMO)-nonorthogonal multiple access (NOMA) system based on hybrid precoding in the presence of an eavesdropper. Two types of sparse RF chain antenna structures are adopted, i.e., sub-conn…

    Submitted 11 June, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: 11 pages, 8 figures; references added