Showing 1–50 of 513 results for author: Shi, Z

Searching in archive cs.
  1. arXiv:2407.00987  [pdf, other]

    cs.NI eess.SY

    Exploiting Dependency-Aware Priority Adjustment for Mixed-Criticality TSN Flow Scheduling

    Authors: Miao Guo, Yifei Sun, Chaojie Gu, Shibo He, Zhiguo Shi

    Abstract: Time-Sensitive Networking (TSN) serves as a one-size-fits-all solution for mixed-criticality communication, in which flow scheduling is vital to guarantee real-time transmissions. Traditional approaches statically assign priorities to flows based on their associated applications, resulting in significant queuing delays. In this paper, we observe that assigning different priorities to a flow leads…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by IWQoS'24

  2. arXiv:2406.19043  [pdf]

    eess.IV cs.AI cs.CV cs.DB

    CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI

    Authors: Zi Wang, Fanwen Wang, Chen Qin, Jun Lyu, Ouyang Cheng, Shuo Wang, Yan Li, Mengyao Yu, Haoyu Zhang, Kunyuan Guo, Zhang Shi, Qirong Li, Ziqiang Xu, Yajing Zhang, Hao Li, Sha Hua, Binghua Chen, Longyu Sun, Mengting Sun, Qin Li, Ying-Hua Chu, Wenjia Bai, Jing Qin, Xiahai Zhuang, Claudia Prieto, et al. (7 additional authors not shown)

    Abstract: Cardiac magnetic resonance imaging (MRI) has emerged as a clinically gold-standard technique for diagnosing cardiac diseases, thanks to its ability to provide diverse information with multiple modalities and anatomical views. Accelerated cardiac MRI is highly expected to achieve time-efficient and patient-friendly imaging, and then advanced image reconstruction approaches are required to recover h…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 2 tables

  3. arXiv:2406.17803  [pdf, other]

    cs.CL cs.AI cs.IR

    Understanding the Role of User Profile in the Personalization of Large Language Models

    Authors: Bin Wu, Zhengyan Shi, Hossein A. Rahmani, Varsha Ramineni, Emine Yilmaz

    Abstract: Utilizing user profiles to personalize Large Language Models (LLMs) has been shown to enhance the performance on a wide range of tasks. However, the precise role of user profiles and their effect mechanism on LLMs remains unclear. This study first confirms that the effectiveness of user profiles is primarily due to personalization information rather than semantic information. Furthermore, we inves…

    Submitted 22 June, 2024; originally announced June 2024.

  4. arXiv:2406.16583  [pdf, other]

    cs.LG cs.CV

    Personalized federated learning based on feature fusion

    Authors: Wolong Xing, Zhenkui Shi, Hongyan Peng, Xiantao Hu, Xianxian Li

    Abstract: Federated learning enables distributed clients to collaborate on training while storing their data locally to protect client privacy. However, due to the heterogeneity of data, models, and devices, the final global model may need to perform better for tasks on each client. Communication bottlenecks, data heterogeneity, and model heterogeneity have been common challenges in federated learning. In t…

    Submitted 24 June, 2024; originally announced June 2024.

  5. arXiv:2406.14891  [pdf, other]

    cs.CL cs.IR

    Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering

    Authors: Zhengliang Shi, Shuo Zhang, Weiwei Sun, Shen Gao, Pengjie Ren, Zhumin Chen, Zhaochun Ren

    Abstract: Multi-Hop Question Answering (MHQA) tasks present a significant challenge for large language models (LLMs) due to the intensive knowledge required. Current solutions, like Retrieval-Augmented Generation, typically retrieve potential documents from an external corpus to read an answer. However, the performance of this retrieve-then-read paradigm is constrained by the retriever and the inevitable no…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: ACL 2024 (main conference)

  6. arXiv:2406.14852  [pdf, other]

    cs.CV cs.AI

    Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models

    Authors: Jiayu Wang, Yifei Ming, Zhenmei Shi, Vibhav Vineet, Xin Wang, Neel Joshi

    Abstract: Large language models (LLMs) and vision-language models (VLMs) have demonstrated remarkable performance across a wide range of tasks and domains. Despite this promise, spatial understanding and reasoning -- a fundamental component of human cognition -- remains under-explored. We develop novel benchmarks that cover diverse aspects of spatial reasoning such as relationship understanding, navigation,…

    Submitted 20 June, 2024; originally announced June 2024.

  7. arXiv:2406.14036  [pdf, other]

    cs.LG cs.AI cs.CL

    Toward Infinite-Long Prefix in Transformer

    Authors: Jiuxiang Gu, Yingyu Liang, Zhenmei Shi, Zhao Song, Chiwun Yang

    Abstract: Prompting and contextual-based fine-tuning methods, which we call Prefix Learning, have been proposed to enhance the performance of language models on various downstream tasks that can match full parameter fine-tuning. There remains a limited theoretical understanding of how these methods work. In this paper, we aim to relieve this limitation by studying the learning ability of Prefix Learning fro…

    Submitted 20 June, 2024; originally announced June 2024.

  8. arXiv:2406.13975  [pdf, other]

    cs.CL cs.AI

    MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models

    Authors: Zhongshen Zeng, Yinhong Liu, Yingjia Wan, Jingyao Li, Pengguang Chen, Jianbo Dai, Yuxuan Yao, Rongwu Xu, Zehan Qi, Wanru Zhao, Linling Shen, Jianqiao Lu, Haochen Tan, Yukang Chen, Hao Zhang, Zhan Shi, Bailin Wang, Zhijiang Guo, Jiaya Jia

    Abstract: Large language models (LLMs) have shown increasing capability in problem-solving and decision-making, largely based on the step-by-step chain-of-thought reasoning processes. However, it has been increasingly challenging to evaluate the reasoning capability of LLMs. Concretely, existing outcome-based benchmarks begin to saturate and become less sufficient to monitor the progress. To this end, we pr…

    Submitted 19 June, 2024; originally announced June 2024.

  9. arXiv:2406.13317  [pdf, other]

    cs.CV

    M4Fog: A Global Multi-Regional, Multi-Modal, and Multi-Stage Dataset for Marine Fog Detection and Forecasting to Bridge Ocean and Atmosphere

    Authors: Mengqiu Xu, Ming Wu, Kaixin Chen, Yixiang Huang, Mingrui Xu, Yujia Yang, Yiqing Feng, Yiying Guo, Bin Huang, Dongliang Chang, Zhenwei Shi, Chuang Zhang, Zhanyu Ma, Jun Guo

    Abstract: Marine fog poses a significant hazard to global shipping, necessitating effective detection and forecasting to reduce economic losses. In recent years, several machine learning (ML) methods have demonstrated superior detection accuracy compared to traditional meteorological methods. However, most of these works are developed on proprietary datasets, and the few publicly accessible datasets are oft…

    Submitted 19 June, 2024; originally announced June 2024.

  10. arXiv:2406.10678  [pdf, other]

    cs.CV

    A Late-Stage Bitemporal Feature Fusion Network for Semantic Change Detection

    Authors: Chenyao Zhou, Haotian Zhang, Han Guo, Zhengxia Zou, Zhenwei Shi

    Abstract: Semantic change detection is an important task in geoscience and earth observation. By producing a semantic change map for each temporal phase, both the land use land cover categories and change information can be interpreted. Recently some multi-task learning based semantic change detection methods have been proposed to decompose the task into semantic segmentation and binary change detection sub…

    Submitted 15 June, 2024; originally announced June 2024.

  11. arXiv:2406.05628  [pdf, other]

    cs.LG

    Domain Generalization Guided by Large-Scale Pre-Trained Priors

    Authors: Zongbin Wang, Bin Pan, Shiyu Shen, Tianyang Shi, Zhenwei Shi

    Abstract: Domain generalization (DG) aims to train a model from limited source domains, allowing it to generalize to unknown target domains. Typically, DG models only employ large-scale pre-trained models during the initialization of fine-tuning. However, large-scale pre-trained models already possess the ability to resist domain shift. If we reference pre-trained models continuously during fine-tuning to m…

    Submitted 8 June, 2024; originally announced June 2024.

  12. arXiv:2406.05616  [pdf, other]

    cs.LG

    Domain Agnostic Conditional Invariant Predictions for Domain Generalization

    Authors: Zongbin Wang, Bin Pan, Zhenwei Shi

    Abstract: Domain generalization aims to develop a model that can perform well on unseen target domains by learning from multiple source domains. However, recent-proposed domain generalization models usually rely on domain labels, which may not be available in many real-world scenarios. To address this challenge, we propose a Discriminant Risk Minimization (DRM) theory and the corresponding algorithm to capt…

    Submitted 8 June, 2024; originally announced June 2024.

  13. arXiv:2406.04207  [pdf, other]

    cs.CV

    CDMamba: Remote Sensing Image Change Detection with Mamba

    Authors: Haotian Zhang, Keyan Chen, Chenyang Liu, Hao Chen, Zhengxia Zou, Zhenwei Shi

    Abstract: Recently, the Mamba architecture based on state space models has demonstrated remarkable performance in a series of natural language processing tasks and has been rapidly applied to remote sensing change detection (CD) tasks. However, most methods enhance the global receptive field by directly modifying the scanning mode of Mamba, neglecting the crucial role that local information plays in dense p…

    Submitted 6 June, 2024; originally announced June 2024.

  14. arXiv:2406.02037  [pdf]

    cs.CV

    Multi-Scale Direction-Aware Network for Infrared Small Target Detection

    Authors: Jinmiao Zhao, Zelin Shi, Chuang Yu, Yunpeng Liu

    Abstract: Infrared small target detection faces the problem that it is difficult to effectively separate the background and the target. Existing deep learning-based methods focus on appearance features and ignore high-frequency directional features. Therefore, we propose a multi-scale direction-aware network (MSDA-Net), which is the first attempt to integrate the high-frequency directional features of infra…

    Submitted 4 June, 2024; originally announced June 2024.

  15. arXiv:2406.00738  [pdf, other]

    cs.LG cs.AI cs.CY

    Global Rewards in Restless Multi-Armed Bandits

    Authors: Naveen Raman, Zheyuan Ryan Shi, Fei Fang

    Abstract: Restless multi-armed bandits (RMAB) extend multi-armed bandits so pulling an arm impacts future states. Despite the success of RMABs, a key limiting assumption is the separability of rewards into a sum across arms. We address this deficiency by proposing restless-multi-armed bandit with global rewards (RMAB-G), a generalization of RMABs to global non-separable rewards. To solve RMAB-G, we develop…

    Submitted 7 June, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

    Comments: 27 pages

  16. arXiv:2405.21063  [pdf, other]

    cs.LG cs.AI

    Neural Network Verification with Branch-and-Bound for General Nonlinearities

    Authors: Zhouxing Shi, Qirui Jin, Zico Kolter, Suman Jana, Cho-Jui Hsieh, Huan Zhang

    Abstract: Branch-and-bound (BaB) is among the most effective methods for neural network (NN) verification. However, existing works on BaB have mostly focused on NNs with piecewise linear activations, especially ReLU networks. In this paper, we develop a general framework, named GenBaB, to conduct BaB for general nonlinearities in general computational graphs based on linear bound propagation. To decide whic…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Preprint

  17. arXiv:2405.19592  [pdf, other]

    cs.LG cs.AI cs.CL

    Why Larger Language Models Do In-context Learning Differently?

    Authors: Zhenmei Shi, Junyi Wei, Zhuoyan Xu, Yingyu Liang

    Abstract: Large language models (LLM) have emerged as a powerful tool for AI, with the key ability of in-context learning (ICL), where they can perform well on unseen tasks based on a brief series of task examples without necessitating any adjustments to the model parameters. One recent interesting mysterious observation is that models of different scales may have different ICL behaviors: larger models tend…

    Submitted 29 May, 2024; originally announced May 2024.

  18. arXiv:2405.17193  [pdf, other]

    cs.GR

    Anisotropic Gauss Reconstruction for Unoriented Point Clouds

    Authors: Yueji Ma, Dong Xiao, Zuoqiang Shi, Bin Wang

    Abstract: Unoriented surface reconstructions based on the Gauss formula have attracted much attention due to their elegant mathematical formulation and excellent performance. However, the isotropic characteristics of the formulation limit their capacity to leverage the anisotropic information within the point cloud. In this work, we propose a novel anisotropic formulation by introducing a convection term in…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 17 pages; 14 figures

  19. arXiv:2405.16634  [pdf, other]

    cs.GR

    Fast and Globally Consistent Normal Orientation based on the Winding Number Normal Consistency

    Authors: Siyou Lin, Zuoqiang Shi, Yebin Liu

    Abstract: Estimating a consistently oriented normal vector field for an unoriented point cloud enables a number of important downstream applications in computer graphics. While normal estimation for a small patch of points can be done with simple techniques like principal component analysis (PCA), orienting these normals to be globally consistent has been a notoriously difficult problem. Some recent methods…

    Submitted 26 May, 2024; originally announced May 2024.

  20. arXiv:2405.16533  [pdf, other]

    cs.CL

    Chain of Tools: Large Language Model is an Automatic Multi-tool Learner

    Authors: Zhengliang Shi, Shen Gao, Xiuyi Chen, Yue Feng, Lingyong Yan, Haibo Shi, Dawei Yin, Zhumin Chen, Suzan Verberne, Zhaochun Ren

    Abstract: Augmenting large language models (LLMs) with external tools has emerged as a promising approach to extend their utility, empowering them to solve practical tasks. Existing work typically empowers LLMs as tool users with a manually designed workflow, where the LLM plans a series of tools in a step-by-step manner, and sequentially executes each tool to obtain intermediate results until deriving the…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: Work in progress

  21. arXiv:2405.16418  [pdf, other]

    cs.LG cs.AI cs.CV

    Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective

    Authors: Jiuxiang Gu, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou

    Abstract: Diffusion models have made rapid progress in generating high-quality samples across various domains. However, a theoretical understanding of the Lipschitz continuity and second momentum properties of the diffusion process is still lacking. In this paper, we bridge this gap by providing a detailed examination of these smoothness properties for the case where the target data distribution is a mixtur…

    Submitted 25 May, 2024; originally announced May 2024.

  22. arXiv:2405.16411  [pdf, other]

    cs.LG cs.AI cs.CL

    Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers

    Authors: Jiuxiang Gu, Yingyu Liang, Zhenmei Shi, Zhao Song, Yufa Zhou

    Abstract: Tensor Attention, a multi-view attention that is able to capture high-order correlations among multiple modalities, can overcome the representational limitations of classical matrix attention. However, the $Ω(n^3)$ time complexity of tensor attention poses a significant obstacle to its practical implementation in transformers, where $n$ is the input sequence length. In this work, we prove that the…

    Submitted 25 May, 2024; originally announced May 2024.

  23. arXiv:2405.14602  [pdf, other]

    cs.LG

    Controllable Continual Test-Time Adaptation

    Authors: Ziqi Shi, Fan Lyu, Ye Liu, Fanhua Shang, Fuyuan Hu, Wei Feng, Zhang Zhang, Liang Wang

    Abstract: Continual Test-Time Adaptation (CTTA) is an emerging and challenging task where a model trained in a source domain must adapt to continuously changing conditions during testing, without access to the original source data. CTTA is prone to error accumulation due to uncontrollable domain shifts, leading to blurred decision boundaries between categories. Existing CTTA methods primarily focus on suppr…

    Submitted 28 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

  24. arXiv:2405.14394  [pdf, other]

    cs.CL cs.AI

    Instruction Tuning With Loss Over Instructions

    Authors: Zhengyan Shi, Adam X. Yang, Bin Wu, Laurence Aitchison, Emine Yilmaz, Aldo Lipani

    Abstract: Instruction tuning plays a crucial role in shaping the outputs of language models (LMs) to desired styles. In this work, we propose a simple yet effective method, Instruction Modelling (IM), which trains LMs by applying a loss function to the instruction and prompt part rather than solely to the output part. Through experiments across 21 diverse benchmarks, we show that, in many scenarios, IM can…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Code is available at https://github.com/ZhengxiangShi/InstructionModelling

  25. arXiv:2405.13570  [pdf, other]

    cs.CV

    MetaEarth: A Generative Foundation Model for Global-Scale Remote Sensing Image Generation

    Authors: Zhiping Yu, Chenyang Liu, Liqin Liu, Zhenwei Shi, Zhengxia Zou

    Abstract: The recent advancement of generative foundational models has ushered in a new era of image generation in the realm of natural images, revolutionizing art design, entertainment, environment simulation, and beyond. Despite producing high-quality samples, existing methods are constrained to generating images of scenes at a limited scale. In this paper, we present MetaEarth, a generative foundation mo…

    Submitted 28 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    Comments: Project page: https://jiupinjia.github.io/metaearth/

  26. arXiv:2405.09964  [pdf, other]

    cs.CV

    KPNDepth: Depth Estimation of Lane Images under Complex Rainy Environment

    Authors: Zhengxu Shi

    Abstract: With the development of deep neural network generative models in recent years, significant progress has been made in the research of depth estimation in lane scenes. However, current research achievements are mainly focused on clear daytime scenarios. In complex rainy environments, the influence of rain streaks and local fog effects often leads to erroneous increases in the overall depth estimatio…

    Submitted 16 May, 2024; originally announced May 2024.

  27. arXiv:2405.09820  [pdf, other]

    cs.LG cs.CV

    Densely Distilling Cumulative Knowledge for Continual Learning

    Authors: Zenglin Shi, Pei Liu, Tong Su, Yunpeng Wu, Kuien Liu, Yu Song, Meng Wang

    Abstract: Continual learning, involving sequential training on diverse tasks, often faces catastrophic forgetting. While knowledge distillation-based approaches exhibit notable success in preventing forgetting, we pinpoint a limitation in their ability to distill the cumulative knowledge of all the previous tasks. To remedy this, we propose Dense Knowledge Distillation (DKD). DKD uses a task pool to track t…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: 12 pages; Continual Learning; Class-incremental Learning; Knowledge Distillation; Forgetting

  28. Enhancing Function Name Prediction using Votes-Based Name Tokenization and Multi-Task Learning

    Authors: Xiaoling Zhang, Zhengzi Xu, Shouguo Yang, Zhi Li, Zhiqiang Shi, Limin Sun

    Abstract: Reverse engineers would acquire valuable insights from descriptive function names, which are absent in publicly released binaries. Recent advances in binary function name prediction using data-driven machine learning show promise. However, existing approaches encounter difficulties in capturing function semantics in diverse optimized binaries and fail to reserve the meaning of labels in function n…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: 24 pages, 10 figures, ACM ESEC/FSE 2024

    Journal ref: Proc. ACM Softw. Eng. 1,FSE, Article 75 (July 2024), 24 pages

  29. arXiv:2405.06683  [pdf, other]

    cs.CL

    ERAGent: Enhancing Retrieval-Augmented Language Models with Improved Accuracy, Efficiency, and Personalization

    Authors: Yunxiao Shi, Xing Zi, Zijing Shi, Haimin Zhang, Qiang Wu, Min Xu

    Abstract: Retrieval-augmented generation (RAG) for language models significantly improves language understanding systems. The basic retrieval-then-read pipeline of response generation has evolved into a more extended process due to the integration of various components, sometimes even forming loop structures. Despite its advancements in improving response accuracy, challenges like poor retrieval quality for…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Draft Paper

  30. arXiv:2405.05219  [pdf, other]

    cs.LG cs.AI

    Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers

    Authors: Jiuxiang Gu, Yingyu Liang, Heshan Liu, Zhenmei Shi, Zhao Song, Junze Yin

    Abstract: Large Language Models (LLMs) have profoundly changed the world. Their self-attention mechanism is the key to the success of transformers in LLMs. However, the quadratic computational cost $O(n^2)$ to the length $n$ input sequence is the notorious obstacle for further improvement and scalability in the longer context. In this work, we leverage the convolution-like structure of attention matrices to…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 55 pages

  31. arXiv:2405.04122  [pdf, other]

    cs.LG cs.DC

    Ranking-based Client Selection with Imitation Learning for Efficient Federated Learning

    Authors: Chunlin Tian, Zhan Shi, Xinpeng Qin, Li Li, Chengzhong Xu

    Abstract: Federated Learning (FL) enables multiple devices to collaboratively train a shared model while ensuring data privacy. The selection of participating devices in each training round critically affects both the model performance and training efficiency, especially given the vast heterogeneity in training capabilities and data distribution across devices. To address these challenges, we introduce a no…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024

  32. arXiv:2405.03251  [pdf, ps, other]

    cs.LG cs.AI

    Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond

    Authors: Jiuxiang Gu, Chenyang Li, Yingyu Liang, Zhenmei Shi, Zhao Song

    Abstract: The softmax activation function plays a crucial role in the success of large language models (LLMs), particularly in the self-attention mechanism of the widely adopted Transformer architecture. However, the underlying learning dynamics that contribute to the effectiveness of softmax remain largely unexplored. As a step towards better understanding, this paper provides a theoretical study of the op…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 53 pages

  33. arXiv:2404.19245  [pdf, other]

    cs.CL cs.AI

    HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning

    Authors: Chunlin Tian, Zhan Shi, Zhijiang Guo, Li Li, Chengzhong Xu

    Abstract: Adapting Large Language Models (LLMs) to new tasks through fine-tuning has been made more efficient by the introduction of Parameter-Efficient Fine-Tuning (PEFT) techniques, such as LoRA. However, these methods often underperform compared to full fine-tuning, particularly in scenarios involving complex datasets. This issue becomes even more pronounced in complex domains, highlighting the need for…

    Submitted 23 May, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 19 pages, 7 figures

  34. arXiv:2404.18955  [pdf, other]

    cs.NE cs.AI

    GARA: A novel approach to Improve Genetic Algorithms' Accuracy and Efficiency by Utilizing Relationships among Genes

    Authors: Zhaoning Shi, Meng Xiang, Zhaoyang Hai, Xiabi Liu, Yan Pei

    Abstract: Genetic algorithms have played an important role in engineering optimization. Traditional GAs treat each gene separately. However, biophysical studies of gene regulatory networks revealed direct associations between different genes. It inspires us to propose an improvement to GA in this paper, Gene Regulatory Genetic Algorithm (GRGA), which, to our best knowledge, is the first time to utilize rela…

    Submitted 28 April, 2024; originally announced April 2024.

  35. arXiv:2404.18895  [pdf, other]

    cs.CV

    RSCaMa: Remote Sensing Image Change Captioning with State Space Model

    Authors: Chenyang Liu, Keyan Chen, Bowen Chen, Haotian Zhang, Zhengxia Zou, Zhenwei Shi

    Abstract: Remote Sensing Image Change Captioning (RSICC) aims to describe surface changes between multi-temporal remote sensing images in language, including the changed object categories, locations, and dynamics of changing objects (e.g., added or disappeared). This poses challenges to spatial and temporal modeling of bi-temporal features. Despite previous methods progressing in the spatial change percepti…

    Submitted 21 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  36. arXiv:2404.18848  [pdf, other]

    cs.LG cs.AI cs.CL

    FeDeRA:Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition

    Authors: Yuxuan Yan, Qianqian Yang, Shunpu Tang, Zhiguo Shi

    Abstract: Despite their exceptional performance on various tasks after fine-tuning, pre-trained language models (PLMs) face significant challenges due to growing privacy concerns with data in centralized training methods. We consider federated learning (FL) to fine-tune PLMs in this paper. However, the substantial number of parameters in PLMs poses significant difficulties for client devices with limited co…

    Submitted 25 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  37. arXiv:2404.18085  [pdf, other]

    cs.CL

    CRE-LLM: A Domain-Specific Chinese Relation Extraction Framework with Fine-tuned Large Language Model

    Authors: Zhengpeng Shi, Haoran Luo

    Abstract: Domain-Specific Chinese Relation Extraction (DSCRE) aims to extract relations between entities from domain-specific Chinese text. Despite the rapid development of PLMs in recent years, especially LLMs, DSCRE still faces three core challenges: complex network structure design, poor awareness, and high consumption of fine-tuning. Given the impressive performance of large language models (LLMs) in na…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: preprint

  38. arXiv:2404.18074  [pdf, other]

    cs.AI cs.HC

    MMAC-Copilot: Multi-modal Agent Collaboration Operating System Copilot

    Authors: Zirui Song, Yaohang Li, Meng Fang, Zhenhao Chen, Zecheng Shi, Yuan Huang, Ling Chen

    Abstract: Autonomous virtual agents are often limited by their singular mode of interaction with real-world environments, restricting their versatility. To address this, we propose the Multi-Modal Agent Collaboration framework (MMAC-Copilot), a framework utilizes the collective expertise of diverse agents to enhance interaction ability with operating systems. The framework introduces a team collaboration ch…

    Submitted 4 May, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

    Comments: In processing

  39. arXiv:2404.15284  [pdf, other]

    eess.SP cs.AI

    Global 4D Ionospheric STEC Prediction based on DeepONet for GNSS Rays

    Authors: Dijia Cai, Zenghui Shi, Haiyang Fu, Huan Liu, Hongyi Qian, Yun Sui, Feng Xu, Ya-Qiu Jin

    Abstract: The ionosphere is a vitally dynamic charged particle region in the Earth's upper atmosphere, playing a crucial role in applications such as radio communication and satellite navigation. The Slant Total Electron Contents (STEC) is an important parameter for characterizing wave propagation, representing the integrated electron density along the ray of radio signals passing through the ionosphere. Th…

    Submitted 12 March, 2024; originally announced April 2024.

  40. arXiv:2404.12814  [pdf, other]

    cs.LG cs.AI cs.CV

    Generative Modelling with High-Order Langevin Dynamics

    Authors: Ziqiang Shi, Rujie Liu

    Abstract: Diffusion generative modelling (DGM) based on stochastic differential equations (SDEs) with score matching has achieved unprecedented results in data generation. In this paper, we propose a novel fast high-quality generative modelling method based on high-order Langevin dynamics (HOLD) with score matching. This motive is proved by third-order Langevin dynamics. By augmenting the previous SDEs, e.g…

    Submitted 21 April, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

    Comments: Some of the results in this paper have been published or accepted at conferences such as wacv2024, icassp2024, and icme2024

  41. arXiv:2404.12638  [pdf, other]

    cs.AI

    Learning to Cut via Hierarchical Sequence/Set Model for Efficient Mixed-Integer Programming

    Authors: Jie Wang, Zhihai Wang, Xijun Li, Yufei Kuang, Zhihao Shi, Fangzhou Zhu, Mingxuan Yuan, Jia Zeng, Yongdong Zhang, Feng Wu

    Abstract: Cutting planes (cuts) play an important role in solving mixed-integer linear programs (MILPs), which formulate many important real-world applications. Cut selection heavily depends on (P1) which cuts to prefer and (P2) how many cuts to select. Although modern MILP solvers tackle (P1)-(P2) by human-designed heuristics, machine learning carries the potential to learn more effective heuristics. Howev…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2302.00244

  42. arXiv:2404.09814  [pdf, other]

    cs.IT

    A Novel HARQ-CC Assisted SCMA Scheme

    Authors: Man Wang, Zheng Shi, Yunfei Li, Xianda Wu, Weiqiang Tan, Xinrong Ye

    Abstract: This letter proposes a novel hybrid automatic repeat request with chase combining assisted sparse code multiple access (HARQ-CC-SCMA) scheme. Depending on whether the same superimposed packet are retransmitted, synchronous and asynchronous modes are considered for retransmissions. Moreover, factor graph aggregation (FGA) and Log-likelihood ratio combination (LLRC) are proposed for multi-user detec…

    Submitted 15 April, 2024; originally announced April 2024.

  43. arXiv:2404.07956  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY math.OC

    Lyapunov-stable Neural Control for State and Output Feedback: A Novel Formulation

    Authors: Lujie Yang, Hongkai Dai, Zhouxing Shi, Cho-Jui Hsieh, Russ Tedrake, Huan Zhang

    Abstract: Learning-based neural network (NN) control policies have shown impressive empirical performance in a wide range of tasks in robotics and control. However, formal (Lyapunov) stability guarantees over the region-of-attraction (ROA) for NN controllers with nonlinear dynamical systems are challenging to obtain, and most existing approaches rely on expensive solvers such as sums-of-squares (SOS), mixed…

    Submitted 4 June, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: Paper accepted by ICML 2024

  44. arXiv:2404.04155  [pdf, other]

    cs.CV

    MarsSeg: Mars Surface Semantic Segmentation with Multi-level Extractor and Connector

    Authors: Junbo Li, Keyan Chen, Gengju Tian, Lu Li, Zhenwei Shi

    Abstract: The segmentation and interpretation of the Martian surface play a pivotal role in Mars exploration, providing essential data for the trajectory planning and obstacle avoidance of rovers. However, the complex topography, similar surface features, and the lack of extensive annotated data pose significant challenges to the high-precision semantic segmentation of the Martian surface. To address these…

    Submitted 5 April, 2024; originally announced April 2024.

  45. arXiv:2404.04102  [pdf, other]

    cs.LG cs.AI cs.CL

    ROPO: Robust Preference Optimization for Large Language Models

    Authors: Xize Liang, Chao Chen, Shuang Qiu, Jie Wang, Yue Wu, Zhihang Fu, Zhihao Shi, Feng Wu, Jieping Ye

    Abstract: Preference alignment is pivotal for empowering large language models (LLMs) to generate helpful and harmless responses. However, the performance of preference alignment is highly sensitive to the prevalent noise in the preference data. Recent efforts for this problem either marginally alleviate the impact of noise without the ability to actually reduce its presence, or rely on costly teacher LLMs…

    Submitted 28 May, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

  46. arXiv:2404.00938  [pdf, ps, other]

    cs.HC cs.CL cs.CV cs.RO

    How Can Large Language Models Enable Better Socially Assistive Human-Robot Interaction: A Brief Survey

    Authors: Zhonghao Shi, Ellen Landrum, Amy O'Connell, Mina Kian, Leticia Pinto-Alva, Kaleen Shrestha, Xiaoyuan Zhu, Maja J Matarić

    Abstract: Socially assistive robots (SARs) have shown great success in providing personalized cognitive-affective support for user populations with special needs such as older adults, children with autism spectrum disorder (ASD), and individuals with mental health challenges. The large body of work on SAR demonstrates its potential to provide at-home support that complements clinic-based interventions deliv…

    Submitted 5 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: 2 pages, accepted to the Proceedings of the AAAI Symposium Series, 2024

  47. arXiv:2404.00299  [pdf, other]

    cs.CV

    HOI-M3:Capture Multiple Humans and Objects Interaction within Contextual Environment

    Authors: Juze Zhang, Jingyan Zhang, Zining Song, Zhanhe Shi, Chengfeng Zhao, Ye Shi, Jingyi Yu, Lan Xu, Jingya Wang

    Abstract: Humans naturally interact with both others and the surrounding multiple objects, engaging in various social activities. However, recent advances in modeling human-object interactions mostly focus on perceiving isolated individuals and objects, due to fundamental data scarcity. In this paper, we introduce HOI-M3, a novel large-scale dataset for modeling the interactions of Multiple huMans and Multi…

    Submitted 2 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted to CVPR 2024

  48. arXiv:2403.20198  [pdf, other]

    cs.IT eess.SY

    Minimizing End-to-End Latency for Joint Source-Channel Coding Systems

    Authors: Kaiyi Chi, Qianqian Yang, Yuanchao Shu, Zhaohui Yang, Zhiguo Shi

    Abstract: While existing studies have highlighted the advantages of deep learning (DL)-based joint source-channel coding (JSCC) schemes in enhancing transmission efficiency, they often overlook the crucial aspect of resource management during the deployment phase. In this paper, we propose an approach to minimize the transmission latency in an uplink JSCC-based system. We first analyze the correlation betwe…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: 7 pages, 5 figures, accepted by 2024 IEEE ICC Workshop

  49. arXiv:2403.19654  [pdf, other]

    cs.CV

    RSMamba: Remote Sensing Image Classification with State Space Model

    Authors: Keyan Chen, Bowen Chen, Chenyang Liu, Wenyuan Li, Zhengxia Zou, Zhenwei Shi

    Abstract: Remote sensing image classification forms the foundation of various understanding tasks, serving a crucial function in remote sensing image interpretation. The recent advancements of Convolutional Neural Networks (CNNs) and Transformers have markedly enhanced classification accuracy. Nonetheless, remote sensing scene classification remains a significant challenge, especially given the complexity a…

    Submitted 28 March, 2024; originally announced March 2024.

  50. arXiv:2403.19646  [pdf, other]

    cs.CV

    Change-Agent: Towards Interactive Comprehensive Remote Sensing Change Interpretation and Analysis

    Authors: Chenyang Liu, Keyan Chen, Haotian Zhang, Zipeng Qi, Zhengxia Zou, Zhenwei Shi

    Abstract: Monitoring changes in the Earth's surface is crucial for understanding natural processes and human impacts, necessitating precise and comprehensive interpretation methodologies. Remote sensing satellite imagery offers a unique perspective for monitoring these changes, leading to the emergence of remote sensing image change interpretation (RSICI) as a significant research focus. Current RSICI techn…

    Submitted 1 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.