
Showing 1–50 of 404 results for author: Yao, J

  1. arXiv:2407.00718  [pdf, other]

    eess.IV cs.CV

    ASPS: Augmented Segment Anything Model for Polyp Segmentation

    Authors: Huiqian Li, Dingwen Zhang, Jieru Yao, Longfei Han, Zhongyu Li, Junwei Han

    Abstract: Polyp segmentation plays a pivotal role in colorectal cancer diagnosis. Recently, the emergence of the Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation, leveraging its powerful pre-training capability on large-scale datasets. However, due to the domain gap between natural and endoscopy images, SAM encounters two limitations in achieving effective performan…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI2024

  2. arXiv:2407.00336  [pdf, other]

    cs.CR cs.LG

    Dual-view Aware Smart Contract Vulnerability Detection for Ethereum

    Authors: Jiacheng Yao, Maolin Wang, Wanqi Chen, Chengxiang Jin, Jiajun Zhou, Shanqing Yu, Qi Xuan

    Abstract: The wide application of Ethereum technology has brought technological innovation to traditional industries. As one of Ethereum's core applications, smart contracts utilize diverse contract codes to meet various functional needs and have gained widespread use. However, the non-tamperability of smart contracts, coupled with vulnerabilities caused by natural flaws or human errors, has brought unprece…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: Accepted by International Conference on Blockchain and Trustworthy Systems 2024

  3. arXiv:2406.17005  [pdf, other]

    cs.CV

    PVUW 2024 Challenge on Complex Video Understanding: Methods and Results

    Authors: Henghui Ding, Chang Liu, Yunchao Wei, Nikhila Ravi, Shuting He, Song Bai, Philip Torr, Deshui Miao, Xin Li, Zhenyu He, Yaowei Wang, Ming-Hsuan Yang, Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu, Xinyu Liu, Jing Zhang, Kexin Zhang, Yuting Yang, Licheng Jiao, Shuyuan Yang, Mingqi Gao, Jingnan Luo, et al. (12 additional authors not shown)

    Abstract: Pixel-level Video Understanding in the Wild Challenge (PVUW) focus on complex video understanding. In this CVPR 2024 workshop, we add two new tracks, Complex Video Object Segmentation Track based on MOSE dataset and Motion Expression guided Video Segmentation track based on MeViS dataset. In the two new tracks, we provide additional videos and annotations that feature challenging elements, such as…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: MOSE Challenge: https://henghuiding.github.io/MOSE/ChallengeCVPR2024, MeViS Challenge: https://henghuiding.github.io/MeViS/ChallengeCVPR2024

  4. arXiv:2406.16529  [pdf, other]

    cs.CL

    Towards Better Graph-based Cross-document Relation Extraction via Non-bridge Entity Enhancement and Prediction Debiasing

    Authors: Hao Yue, Shaopeng Lai, Chengyi Yang, Liang Zhang, Junfeng Yao, Jinsong Su

    Abstract: Cross-document Relation Extraction aims to predict the relation between target entities located in different documents. In this regard, the dominant models commonly retain useful information for relation prediction via bridge entities, which allows the model to elaborately capture the intrinsic interdependence between target entities. However, these studies ignore the non-bridge entities, each of…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 Findings

  5. arXiv:2406.16295  [pdf, other]

    cs.LG cs.AI

    Relaxing Continuous Constraints of Equivariant Graph Neural Networks for Physical Dynamics Learning

    Authors: Zinan Zheng, Yang Liu, Jia Li, Jianhua Yao, Yu Rong

    Abstract: Incorporating Euclidean symmetries (e.g. rotation equivariance) as inductive biases into graph neural networks has improved their generalization ability and data efficiency in unbounded physical dynamics modeling. However, in various scientific and engineering applications, the symmetries of dynamics are frequently discrete due to the boundary conditions. Thus, existing GNNs either overlook necess…

    Submitted 23 June, 2024; originally announced June 2024.

  6. arXiv:2406.15047  [pdf, other]

    cs.IT eess.SP

    Optimal Transmit Signal Design for Multi-Target MIMO Sensing Exploiting Prior Information

    Authors: Jiayi Yao, Shuowen Zhang

    Abstract: In this paper, we study the transmit signal optimization in a multiple-input multiple-output (MIMO) radar system for sensing the angle information of multiple targets via their reflected echo signals. We consider a challenging and practical scenario where the angles to be sensed are unknown and random, while their probability information is known a priori for exploitation. First, we establish an a…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: submitted for possible publication

  7. arXiv:2406.09679  [pdf, other]

    cs.CV

    Exploring Training on Heterogeneous Data with Mixture of Low-rank Adapters

    Authors: Yuhang Zhou, Zihua Zhao, Haolin Li, Siyuan Du, Jiangchao Yao, Ya Zhang, Yanfeng Wang

    Abstract: Training a unified model to take multiple targets into account is a trend towards artificial general intelligence. However, how to efficiently mitigate the training conflicts among heterogeneous data collected from different domains or tasks remains under-explored. In this study, we explore to leverage Mixture of Low-rank Adapters (MoLA) to mitigate conflicts in heterogeneous data training, which…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: ICML2024

  8. arXiv:2406.09098  [pdf, other]

    cs.CL

    SciKnowEval: Evaluating Multi-level Scientific Knowledge of Large Language Models

    Authors: Kehua Feng, Keyan Ding, Weijie Wang, Xiang Zhuang, Zeyuan Wang, Ming Qin, Yu Zhao, Jianhua Yao, Qiang Zhang, Huajun Chen

    Abstract: The burgeoning utilization of Large Language Models (LLMs) in scientific research necessitates advanced benchmarks capable of evaluating their understanding and application of scientific knowledge comprehensively. To address this need, we introduce the SciKnowEval benchmark, a novel framework that systematically evaluates LLMs across five progressive levels of scientific knowledge: studying extens…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 48 pages, 2 figures

  9. arXiv:2406.08288  [pdf, other]

    cs.LG

    Decoupling the Class Label and the Target Concept in Machine Unlearning

    Authors: Jianing Zhu, Bo Han, Jiangchao Yao, Jianliang Xu, Gang Niu, Masashi Sugiyama

    Abstract: Machine unlearning as an emerging research topic for data regulations, aims to adjust a trained model to approximate a retrained one that excludes a portion of training data. Previous studies showed that class-wise unlearning is successful in forgetting the knowledge of a target class, through gradient ascent on the forgetting data or fine-tuning with the remaining data. However, while these metho…

    Submitted 16 June, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

  10. arXiv:2406.08192  [pdf, other]

    cs.CV

    2nd Place Solution for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation

    Authors: Zhensong Xu, Jiangtao Yao, Chengjing Wu, Ting Liu, Luoqi Liu

    Abstract: Complex video object segmentation serves as a fundamental task for a wide range of downstream applications such as video editing and automatic data annotation. Here we present the 2nd place solution in the MOSE track of PVUW 2024. To mitigate problems caused by tiny objects, similar objects and fast movements in MOSE. We use instance segmentation to generate extra pretraining data from the valid a…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 5 pages, 4 figures, technical report for MOSE Track in CVPR 2024 PVUW workshop: Complex Video Object Segmentation

  11. arXiv:2406.04872  [pdf, other]

    cs.LG

    Diversified Batch Selection for Training Acceleration

    Authors: Feng Hong, Yueming Lyu, Jiangchao Yao, Ya Zhang, Ivor W. Tsang, Yanfeng Wang

    Abstract: The remarkable success of modern machine learning models on large datasets often demands extensive training time and resource consumption. To save cost, a prevalent research line, known as online batch selection, explores selecting informative subsets during the training process. Although recent efforts achieve advancements by measuring the impact of each sample on generalization, their reliance o…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  12. arXiv:2406.03450  [pdf, other]

    cs.CL cs.AI

    What is the Best Way for ChatGPT to Translate Poetry?

    Authors: Shanshan Wang, Derek F. Wong, Jingming Yao, Lidia S. Chao

    Abstract: Machine translation (MT) has historically faced significant challenges when applied to literary works, particularly in the domain of poetry translation. The advent of Large Language Models such as ChatGPT holds potential for innovation in this field. This study examines ChatGPT's capabilities in English-Chinese poetry translation tasks, utilizing targeted prompts and small sample scenarios to asce…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 19 pages, 1 figure. The paper has been accepted by ACL 2024(Main Conference)

  13. arXiv:2406.02517  [pdf, other]

    cs.CL

    Deterministic Reversible Data Augmentation for Neural Machine Translation

    Authors: Jiashu Yao, Heyan Huang, Zeming Liu, Yuhang Guo

    Abstract: Data augmentation is an effective way to diversify corpora in machine translation, but previous methods may introduce semantic inconsistency between original and augmented data because of irreversible operations and random subword sampling procedures. To generate both symbolically diverse and semantically consistent augmentation data, we propose Deterministic Reversible Data Augmentation (DRDA), a…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Findings of ACL 2024

  14. arXiv:2405.20626  [pdf, other]

    cs.IR cs.IT

    Causal Distillation for Alleviating Performance Heterogeneity in Recommender Systems

    Authors: Shengyu Zhang, Ziqi Jiang, Jiangchao Yao, Fuli Feng, Kun Kuang, Zhou Zhao, Shuo Li, Hongxia Yang, Tat-Seng Chua, Fei Wu

    Abstract: Recommendation performance usually exhibits a long-tail distribution over users -- a small portion of head users enjoy much more accurate recommendation services than the others. We reveal two sources of this performance heterogeneity problem: the uneven distribution of historical interactions (a natural source); and the biased training of recommender models (a model source). As addressing this pr…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: TKDE 2023

  15. arXiv:2405.19919  [pdf, other]

    cs.LG cs.SI

    Unraveling the Impact of Heterophilic Structures on Graph Positive-Unlabeled Learning

    Authors: Yuhao Wu, Jiangchao Yao, Bo Han, Lina Yao, Tongliang Liu

    Abstract: While Positive-Unlabeled (PU) learning is vital in many real-world scenarios, its application to graph data still remains under-explored. We unveil that a critical challenge for PU learning on graph lies on the edge heterophily, which directly violates the irreducibility assumption for Class-Prior Estimation (class prior is essential for building PU learning algorithms) and degenerates the latent…

    Submitted 1 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  16. arXiv:2405.18983  [pdf, other]

    cs.LG cs.DC

    Federated Learning under Partially Class-Disjoint Data via Manifold Reshaping

    Authors: Ziqing Fan, Jiangchao Yao, Ruipeng Zhang, Lingjuan Lyu, Ya Zhang, Yanfeng Wang

    Abstract: Statistical heterogeneity severely limits the performance of federated learning (FL), motivating several explorations e.g., FedProx, MOON and FedDyn, to alleviate this problem. Despite effectiveness, their considered scenario generally requires samples from almost all classes during the local training of each client, although some covariate shifts may exist among clients. In fact, the natural case…

    Submitted 3 June, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  17. arXiv:2405.18972  [pdf, other]

    cs.LG cs.DC

    Federated Learning with Bilateral Curation for Partially Class-Disjoint Data

    Authors: Ziqing Fan, Ruipeng Zhang, Jiangchao Yao, Bo Han, Ya Zhang, Yanfeng Wang

    Abstract: Partially class-disjoint data (PCDD), a common yet under-explored data formation where each client contributes a part of classes (instead of all classes) of samples, severely challenges the performance of federated algorithms. Without full classes, the local objective will contradict the global objective, yielding the angle collapse problem for locally missing classes and the space waste problem f…

    Submitted 29 May, 2024; originally announced May 2024.

  18. arXiv:2405.18890  [pdf, other]

    cs.LG cs.DC

    Locally Estimated Global Perturbations are Better than Local Perturbations for Federated Sharpness-aware Minimization

    Authors: Ziqing Fan, Shengchao Hu, Jiangchao Yao, Gang Niu, Ya Zhang, Masashi Sugiyama, Yanfeng Wang

    Abstract: In federated learning (FL), the multi-step update and data heterogeneity among clients often lead to a loss landscape with sharper minima, degenerating the performance of the resulted global model. Prevalent federated approaches incorporate sharpness-aware minimization (SAM) into local training to mitigate this problem. However, the local loss landscapes may not accurately reflect the flatness of…

    Submitted 29 May, 2024; originally announced May 2024.

  19. arXiv:2405.18861  [pdf, other]

    cs.CV cs.LG

    Domain-Inspired Sharpness-Aware Minimization Under Domain Shifts

    Authors: Ruipeng Zhang, Ziqing Fan, Jiangchao Yao, Ya Zhang, Yanfeng Wang

    Abstract: This paper presents a Domain-Inspired Sharpness-Aware Minimization (DISAM) algorithm for optimization under domain shifts. It is motivated by the inconsistent convergence degree of SAM across different domains, which induces optimization bias towards certain domains and thus impairs the overall convergence. To address this issue, we consider the domain-level convergence consistency in the sharpnes…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Published as a conference paper at ICLR 2024

  20. arXiv:2405.17759  [pdf, ps, other]

    cs.IT

    Wireless Federated Learning over Resource-Constrained Networks: Digital versus Analog Transmissions

    Authors: Jiacheng Yao, Wei Xu, Zhaohui Yang, Xiaohu You, Mehdi Bennis, H. Vincent Poor

    Abstract: To enable wireless federated learning (FL) in communication resource-constrained networks, two communication schemes, i.e., digital and analog ones, are effective solutions. In this paper, we quantitatively compare these two techniques, highlighting their essential differences as well as respectively suitable scenarios. We first examine both digital and analog transmission schemes, together with a…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE TWC. arXiv admin note: text overlap with arXiv:2402.09657

  21. arXiv:2405.16996  [pdf, other]

    cs.CV

    Mitigating Noisy Correspondence by Geometrical Structure Consistency Learning

    Authors: Zihua Zhao, Mengxi Chen, Tianjie Dai, Jiangchao Yao, Bo Han, Ya Zhang, Yanfeng Wang

    Abstract: Noisy correspondence that refers to mismatches in cross-modal data pairs, is prevalent on human-annotated or web-crawled datasets. Prior approaches to leverage such data mainly consider the application of uni-modal noisy label learning without amending the impact on both cross-modal and intra-modal geometrical structures in multimodal learning. Actually, we find that both structures are effective…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 10 pages, 5 figures, received by IEEE/CVF Computer Science and Pattern Recognition

  22. arXiv:2405.16444  [pdf, other]

    cs.LG

    CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion

    Authors: Jiayi Yao, Hanchen Li, Yuhan Liu, Siddhant Ray, Yihua Cheng, Qizheng Zhang, Kuntai Du, Shan Lu, Junchen Jiang

    Abstract: Large language models (LLMs) often incorporate multiple text chunks in their inputs to provide the necessary contexts. To speed up the prefill of the long LLM inputs, one can pre-compute the KV cache of a text and re-use the KV cache when the context is reused as the prefix of another LLM input. However, the reused text chunks are not always the input prefix, and when they are not, their precomput…

    Submitted 3 June, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

  23. arXiv:2405.16283  [pdf, other]

    cs.DC

    TURNIP: A "Nondeterministic" GPU Runtime with CPU RAM Offload

    Authors: Zhimin Ding, Jiawen Yao, Brianna Barrow, Tania Lorido Botran, Christopher Jermaine, Yuxin Tang, Jiehui Li, Xinyu Yao, Sleem Mahmoud Abdelghafar, Daniel Bourgeois

    Abstract: An obvious way to alleviate memory difficulties in GPU-based AI computing is via CPU offload, where data are moved between GPU and CPU RAM, so inexpensive CPU RAM is used to increase the amount of storage available. While CPU offload is an obvious idea, it can greatly slow down a computation, due to the relatively slow transfer rate between CPU RAM and GPU RAM. Thus, any system for CPU offload nee…

    Submitted 27 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  24. arXiv:2405.16265  [pdf, other]

    cs.LG

    MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time

    Authors: Jikun Kang, Xin Zhe Li, Xi Chen, Amirreza Kazemi, Qianyi Sun, Boxing Chen, Dong Li, Xu He, Quan He, Feng Wen, Jianye Hao, Jun Yao

    Abstract: Although Large Language Models (LLMs) achieve remarkable performance across various tasks, they often struggle with complex reasoning tasks, such as answering mathematical questions. Recent efforts to address this issue have primarily focused on leveraging mathematical datasets through supervised fine-tuning or self-improvement techniques. However, these methods often depend on high-quality datase…

    Submitted 26 June, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  25. arXiv:2405.14230  [pdf, other]

    cs.CV cs.AI cs.CL

    Boosting Medical Image-based Cancer Detection via Text-guided Supervision from Reports

    Authors: Guangyu Guo, Jiawen Yao, Yingda Xia, Tony C. W. Mok, Zhilin Zheng, Junwei Han, Le Lu, Dingwen Zhang, Jian Zhou, Ling Zhang

    Abstract: The absence of adequately sufficient expert-level tumor annotations hinders the effectiveness of supervised learning based opportunistic cancer screening on medical imaging. Clinical reports (that are rich in descriptive textual details) can offer a "free lunch" supervision information and provide tumor location as a type of weak label to cope with screening tasks, thus saving human labeling work…

    Submitted 23 May, 2024; originally announced May 2024.

  26. arXiv:2405.05237  [pdf, other]

    cs.CV

    EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning

    Authors: Jingfeng Yao, Xinggang Wang, Yuehao Song, Huangxuan Zhao, Jun Ma, Yajie Chen, Wenyu Liu, Bo Wang

    Abstract: The diagnosis and treatment of chest diseases play a crucial role in maintaining human health. X-ray examination has become the most common clinical examination means due to its efficiency and cost-effectiveness. Artificial intelligence analysis methods for chest X-ray images are limited by insufficient annotation data and varying levels of annotation, resulting in weak generalization ability and…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: codes available at: https://github.com/hustvl/EVA-X

  27. arXiv:2405.02673  [pdf, other]

    cs.CL

    On the Information Redundancy in Non-Autoregressive Translation

    Authors: Zhihao Wang, Longyue Wang, Jinsong Su, Junfeng Yao, Zhaopeng Tu

    Abstract: Token repetition is a typical form of multi-modal problem in fully non-autoregressive translation (NAT). In this work, we revisit the multi-modal problem in recently proposed NAT models. Our study reveals that these advanced models have introduced other types of information redundancy errors, which cannot be measured by the conventional metric - the continuous repetition ratio. By manually annotat…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 10 pages, 10 tables

  28. arXiv:2404.17184  [pdf, other]

    cs.CV

    Low-Rank Knowledge Decomposition for Medical Foundation Models

    Authors: Yuhang Zhou, Haolin Li, Siyuan Du, Jiangchao Yao, Ya Zhang, Yanfeng Wang

    Abstract: The popularity of large-scale pre-training has promoted the development of medical foundation models. However, some studies have shown that although foundation models exhibit strong general feature extraction capabilities, their performance on specific tasks is still inferior to task-specific methods. In this paper, we explore a new perspective called "Knowledge Decomposition" to improve the per…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  29. arXiv:2404.16880  [pdf, other]

    q-bio.QM cs.AI cs.CL

    Atomas: Hierarchical Alignment on Molecule-Text for Unified Molecule Understanding and Generation

    Authors: Yikun Zhang, Geyan Ye, Chaohao Yuan, Bo Han, Long-Kai Huang, Jianhua Yao, Wei Liu, Yu Rong

    Abstract: Molecule-and-text cross-modal representation learning has emerged as a promising direction for enhancing the quality of molecular representation, thereby improving performance in various scientific fields, including drug discovery and materials science. Existing studies adopt a global alignment approach to learn the knowledge from different modalities. These global alignment approaches fail to cap…

    Submitted 23 April, 2024; originally announced April 2024.

  30. arXiv:2404.16866  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Functional Protein Design with Local Domain Alignment

    Authors: Chaohao Yuan, Songyou Li, Geyan Ye, Yikun Zhang, Long-Kai Huang, Wenbing Huang, Wei Liu, Jianhua Yao, Yu Rong

    Abstract: The core challenge of de novo protein design lies in creating proteins with specific functions or properties, guided by certain conditions. Current models explore to generate protein using structural and evolutionary guidance, which only provide indirect conditions concerning functions and properties. However, textual annotations of proteins, especially the annotations for protein domains, which d…

    Submitted 27 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  31. arXiv:2404.15655  [pdf, other]

    cs.CV

    Multi-Modal Proxy Learning Towards Personalized Visual Multiple Clustering

    Authors: Jiawei Yao, Qi Qian, Juhua Hu

    Abstract: Multiple clustering has gained significant attention in recent years due to its potential to reveal multiple hidden structures of data from different perspectives. The advent of deep multiple clustering techniques has notably advanced the performance by uncovering complex patterns and relationships within large datasets. However, a major challenge arises as users often do not need all the clusteri…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024. Project page: https://github.com/Alexander-Yao/Multi-MaP

  32. arXiv:2404.14955  [pdf, other]

    cs.CV

    A Comprehensive Survey for Hyperspectral Image Classification: The Evolution from Conventional to Transformers

    Authors: Muhammad Ahmad, Salvatore Distifano, Adil Mehmood Khan, Manuel Mazzara, Chenyu Li, Jing Yao, Hao Li, Jagannath Aryal, Gemine Vivone, Danfeng Hong

    Abstract: Hyperspectral Image Classification (HSC) is a challenging task due to the high dimensionality and complex nature of Hyperspectral (HS) data. Traditional Machine Learning approaches while effective, face challenges in real-world data due to varying optimal feature sets, subjectivity in human-driven design, biases, and limitations. Traditional approaches encounter the curse of dimensionality, strugg…

    Submitted 12 June, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  33. arXiv:2404.13601  [pdf, ps, other]

    cs.FL math.CO math.NT

    Opacity complexity of automatic sequences. The general case

    Authors: J. -P. Allouche, J. -Y. Yao

    Abstract: In this work we introduce a new notion called opacity complexity to measure the complexity of automatic sequences. We study basic properties of this notion, and exhibit an algorithm to compute it. As applications, we compute the opacity complexity of some well-known automatic sequences, including in particular constant sequences, purely periodic sequences, the Thue-Morse sequence, the period-doubl…

    Submitted 21 April, 2024; originally announced April 2024.

    MSC Class: 94A17; 68Q45; 11B85

  34. arXiv:2404.12530  [pdf, other]

    cs.LG cs.CR

    TrajDeleter: Enabling Trajectory Forgetting in Offline Reinforcement Learning Agents

    Authors: Chen Gong, Kecen Li, Jin Yao, Tianhao Wang

    Abstract: Reinforcement learning (RL) trains an agent from experiences interacting with the environment. In scenarios where online interactions are impractical, offline RL, which trains the agent using pre-collected datasets, has become popular. While this new paradigm presents remarkable effectiveness across various real-world domains, like healthcare and energy management, there is a growing demand to ena…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: 22 pages

  35. arXiv:2404.10571  [pdf, other]

    cs.CV

    CMU-Flownet: Exploring Point Cloud Scene Flow Estimation in Occluded Scenario

    Authors: Jingze Chen, Junfeng Yao, Qiqin Lin, Lei Li

    Abstract: Occlusions hinder point cloud frame alignment in LiDAR data, a challenge inadequately addressed by scene flow models tested mainly on occlusion-free datasets. Attempts to integrate occlusion handling within networks often suffer accuracy issues due to two main limitations: a) the inadequate use of occlusion information, often merging it with flow estimation without an effective integration strateg…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 14 pages

  36. arXiv:2404.08489  [pdf, other]

    cs.CV

    SpectralMamba: Efficient Mamba for Hyperspectral Image Classification

    Authors: Jing Yao, Danfeng Hong, Chenyu Li, Jocelyn Chanussot

    Abstract: Recurrent neural networks and Transformers have recently dominated most applications in hyperspectral (HS) imaging, owing to their capability to capture long-range dependencies from spectrum sequences. However, despite the success of these sequential architectures, the non-ignorable inefficiency caused by either difficulty in parallelization or computationally prohibitive attention still hinders t…

    Submitted 12 April, 2024; originally announced April 2024.

  37. arXiv:2404.04878  [pdf, other]

    eess.IV cs.CV

    CycleINR: Cycle Implicit Neural Representation for Arbitrary-Scale Volumetric Super-Resolution of Medical Data

    Authors: Wei Fang, Yuxing Tang, Heng Guo, Mingze Yuan, Tony C. W. Mok, Ke Yan, Jiawen Yao, Xin Chen, Zaiyi Liu, Le Lu, Ling Zhang, Minfeng Xu

    Abstract: In the realm of medical 3D data, such as CT and MRI images, prevalent anisotropic resolution is characterized by high intra-slice but diminished inter-slice resolution. The lowered resolution between adjacent slices poses challenges, hindering optimal viewing experiences and impeding the development of robust downstream analysis algorithms. Various volumetric super-resolution algorithms aim to sur…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: CVPR accepted paper

  38. arXiv:2404.00242  [pdf, other]

    cs.CL cs.AI

    DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference

    Authors: Jinwei Yao, Kaiqi Chen, Kexun Zhang, Jiaxuan You, Binhang Yuan, Zeke Wang, Tao Lin

    Abstract: Given the increasing demand for tree-structured interactions with LLMs, we introduce DeFT (Decoding with Flash Tree-Attention), an IO-aware tree attention algorithm tailored for tree-structured inference. Unlike traditional sequence-based decoding, tree-structured decoding better accommodates modern task requirements, including self-consistency, few-shot prompting, multi-step reasoning, and multi-…

    Submitted 29 May, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Update DeFT-v2. DeFT-v1 was accepted by ICLR'24 AGI Workshop ( https://openreview.net/forum?id=HqfLHoX8bR ). Code will be released soon

  39. arXiv:2403.19294  [pdf, other]

    cs.CV cs.LG

    FlowDepth: Decoupling Optical Flow for Self-Supervised Monocular Depth Estimation

    Authors: Yiyang Sun, Zhiyuan Xu, Xiaonian Wang, Jing Yao

    Abstract: Self-supervised multi-frame methods have currently achieved promising results in depth estimation. However, these methods often suffer from mismatch problems due to the moving objects, which break the static assumption. Additionally, unfairness can occur when calculating photometric errors in high-freq or low-texture regions of the images. To address these issues, existing approaches use additiona…

    Submitted 28 March, 2024; originally announced March 2024.

  40. arXiv:2403.17839  [pdf, other]

    cs.CV cs.AI

    ReMamber: Referring Image Segmentation with Mamba Twister

    Authors: Yuhuan Yang, Chaofan Ma, Jiangchao Yao, Zhun Zhong, Ya Zhang, Yanfeng Wang

    Abstract: Referring Image Segmentation (RIS) leveraging transformers has achieved great success on the interpretation of complex visual-language tasks. However, the quadratic computation cost makes it resource-consuming in capturing long-range visual-language dependencies. Fortunately, Mamba addresses this with efficient linear complexity in processing. However, directly applying Mamba to multi-modal intera…

    Submitted 26 March, 2024; originally announced March 2024.

  41. arXiv:2403.16004  [pdf, other]

    cs.LG cs.AI

    A Federated Parameter Aggregation Method for Node Classification Tasks with Different Graph Network Structures

    Authors: Hao Song, Jiacheng Yao, Zhengxi Li, Shaocong Xu, Shibo Jin, Jiajun Zhou, Chenbo Fu, Qi Xuan, Shanqing Yu

    Abstract: Over the past few years, federated learning has become widely used in various classical machine learning fields because of its collaborative ability to train data from multiple sources without compromising privacy. However, in the area of graph neural networks, the nodes and network structures of graphs held by clients are different in many practical applications, and the aggregation method that d…

    Submitted 24 March, 2024; originally announced March 2024.

  42. arXiv:2403.12778  [pdf, other]

    cs.CV

    ViTGaze: Gaze Following with Interaction Features in Vision Transformers

    Authors: Yuehao Song, Xinggang Wang, Jingfeng Yao, Wenyu Liu, Jinglin Zhang, Xiangmin Xu

    Abstract: Gaze following aims to interpret human-scene interactions by predicting the person's focal point of gaze. Prevailing approaches often use multi-modality inputs, most of which adopt a two-stage framework. Hence their performance highly depends on the previous prediction accuracy. Others use a single-modality approach with complex decoders, increasing network computational load. Inspired by the rema…

    Submitted 19 March, 2024; originally announced March 2024.

  43. arXiv:2403.10231  [pdf, other]

    cs.LG cs.AI cs.SI

    Less is More: One-shot Subgraph Reasoning on Large-scale Knowledge Graphs

    Authors: Zhanke Zhou, Yongqi Zhang, Jiangchao Yao, Quanming Yao, Bo Han

    Abstract: To deduce new facts on a knowledge graph (KG), a link predictor learns from the graph structure and collects local evidence to find the answer to a given query. However, existing methods suffer from a severe scalability problem due to the utilization of the whole KG for prediction, which hinders their promise on large scale KGs and cannot be directly addressed by vanilla sampling methods. In this…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 32 pages, 43 figures

  44. RecAI: Leveraging Large Language Models for Next-Generation Recommender Systems

    Authors: Jianxun Lian, Yuxuan Lei, Xu Huang, Jing Yao, Wei Xu, Xing Xie

    Abstract: This paper introduces RecAI, a practical toolkit designed to augment or even revolutionize recommender systems with the advanced capabilities of Large Language Models (LLMs). RecAI provides a suite of tools, including Recommender AI Agent, Recommendation-oriented Language Models, Knowledge Plugin, RecExplainer, and Evaluator, to facilitate the integration of LLMs into recommender systems from mult…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 4 pages. Webconf 2024 demo track

    MSC Class: 68T50

  45. arXiv:2403.05021  [pdf, other]

    cs.CV

    Beyond MOT: Semantic Multi-Object Tracking

    Authors: Yunhao Li, Hao Wang, Xue Ma, Jiali Yao, Shaohua Dong, Heng Fan, Libo Zhang

    Abstract: Current multi-object tracking (MOT) aims to predict trajectories of targets (i.e., "where") in videos. Yet, knowing merely "where" is insufficient in many crucial applications. In comparison, semantic understanding such as fine-grained behaviors, interactions, and overall summarized captions (i.e., "what") from videos, associated with "where", is highly desired for comprehensive video analysis. Thu…

    Submitted 10 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  46. arXiv:2403.04204  [pdf, other]

    cs.AI cs.CL

    On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models

    Authors: Xinpeng Wang, Shitong Duan, Xiaoyuan Yi, Jing Yao, Shanlin Zhou, Zhihua Wei, Peng Zhang, Dongkuan Xu, Maosong Sun, Xing Xie

    Abstract: Big models have achieved revolutionary breakthroughs in the field of AI, but they might also pose potential concerns. Addressing such concerns, alignment technologies were introduced to make these models conform to human preferences and values. Despite considerable advancements in the past year, various challenges lie in establishing the optimal alignment strategy, such as data cost and scalable o…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 23 pages, 7 figures

  47. arXiv:2403.01942  [pdf, other]

    cs.LG

    Mitigating Label Noise on Graph via Topological Sample Selection

    Authors: Yuhao Wu, Jiangchao Yao, Xiaobo Xia, Jun Yu, Ruxin Wang, Bo Han, Tongliang Liu

    Abstract: Despite the success of the carefully-annotated benchmarks, the effectiveness of existing graph neural networks (GNNs) can be considerably impaired in practice when the real-world graph data is noisily labeled. Previous explorations in sample selection have been demonstrated as an effective way for robust learning with noisy labels; however, the conventional studies focus on i.i.d. data, and when mo…

    Submitted 4 June, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: ICML 2024

  48. arXiv:2403.01244  [pdf, other]

    cs.CL cs.AI

    Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal

    Authors: Jianheng Huang, Leyang Cui, Ante Wang, Chengyi Yang, Xinting Liao, Linfeng Song, Junfeng Yao, Jinsong Su

    Abstract: Large language models (LLMs) suffer from catastrophic forgetting during continual learning. Conventional rehearsal-based methods rely on previous training data to retain the model's ability, which may not be feasible in real-world applications. When conducting continual learning based on a publicly-released LLM checkpoint, the availability of the original training data may be non-existent. To addr…

    Submitted 25 May, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: ACL 2024 main, long paper

  49. arXiv:2403.01241  [pdf, other]

    cs.CL cs.AI

    IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact

    Authors: Ruikang Liu, Haoli Bai, Haokun Lin, Yuening Li, Han Gao, Zhengzhuo Xu, Lu Hou, Jun Yao, Chun Yuan

    Abstract: Large language models (LLMs) excel in natural language processing but demand intensive computation. To mitigate this, various quantization methods have been explored, yet they compromise LLM performance. This paper unveils a previously overlooked type of outliers in LLMs. Such outliers are found to allocate most of the attention scores on initial tokens of input, termed as pivot tokens, which are…

    Submitted 25 May, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: Accepted by ACL 2024 findings

  50. arXiv:2403.00190  [pdf]

    cs.SI cs.AI

    Identification of important nodes in the information propagation network based on the artificial intelligence method

    Authors: Bin Yuan, Tianbo Song, Jerry Yao

    Abstract: This study presents an integrated approach for identifying key nodes in information propagation networks using advanced artificial intelligence methods. We introduce a novel technique that combines the Decision-making Trial and Evaluation Laboratory (DEMATEL) method with the Global Structure Model (GSM), creating a synergistic model that effectively captures both local and global influences within…

    Submitted 29 February, 2024; originally announced March 2024.