
Showing 1–50 of 316 results for author: Shi, C

Searching in archive cs.
  1. arXiv:2407.01016  [pdf, other]

    cs.CV

    SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection

    Authors: Dingkang Liang, Wei Hua, Chunsheng Shi, Zhikang Zou, Xiaoqing Ye, Xiang Bai

    Abstract: Semi-supervised object detection (SSOD), leveraging unlabeled data to boost object detectors, has become a hot topic recently. However, existing SSOD approaches mainly focus on horizontal objects, leaving multi-oriented objects common in aerial images unexplored. At the same time, the annotation cost of multi-oriented objects is significantly higher than that of their horizontal counterparts. Ther…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.20015  [pdf, other]

    cs.CL cs.AI

    ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models

    Authors: Yuxiang Zhang, Jing Chen, Junjie Wang, Yaxin Liu, Cheng Yang, Chufan Shi, Xinyu Zhu, Zihao Lin, Hanwen Wan, Yujiu Yang, Tetsuya Sakai, Tian Feng, Hayato Yamana

    Abstract: Tool-augmented large language models (LLMs) are rapidly being integrated into real-world applications. Due to the lack of benchmarks, the community still needs to fully understand the hallucination issues within these models. To address this challenge, we introduce a comprehensive diagnostic benchmark, ToolBH. Specifically, we assess the LLM's hallucinations through two perspectives: depth and bre…

    Submitted 28 June, 2024; originally announced June 2024.

  3. arXiv:2406.19531  [pdf, other]

    stat.ML cs.LG

    Forward and Backward State Abstractions for Off-policy Evaluation

    Authors: Meiling Hao, Pingfan Su, Liyuan Hu, Zoltan Szabo, Qingyuan Zhao, Chengchun Shi

    Abstract: Off-policy evaluation (OPE) is crucial for evaluating a target policy's impact offline before its deployment. However, achieving accurate OPE in large state spaces remains challenging. This paper studies state abstractions, originally designed for policy learning, in the context of OPE. Our contributions are three-fold: (i) We define a set of irrelevance conditions central to learning state abstracti…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 42 pages, 5 figures

    ACM Class: G.3; I.2.6; G.1.2

  4. arXiv:2406.16279  [pdf, other]

    cs.CV

    SegNet4D: Effective and Efficient 4D LiDAR Semantic Segmentation in Autonomous Driving Environments

    Authors: Neng Wang, Ruibin Guo, Chenghao Shi, Hui Zhang, Huimin Lu, Zhiqiang Zheng, Xieyuanli Chen

    Abstract: 4D LiDAR semantic segmentation, also referred to as multi-scan semantic segmentation, plays a crucial role in enhancing the environmental understanding capabilities of autonomous vehicles. It entails identifying the semantic category of each point in the LiDAR scan and distinguishing whether it is dynamic, a critical aspect in downstream tasks such as path planning and autonomous navigation. Exist…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures

  5. arXiv:2406.14503  [pdf, other]

    cs.CL

    Overview of the CAIL 2023 Argument Mining Track

    Authors: Jingcong Liang, Junlong Wang, Xinyu Zhai, Yungui Zhuang, Yiyang Zheng, Xin Xu, Xiandong Ran, Xiaozheng Dong, Honghui Rong, Yanlun Liu, Hao Chen, Yuhan Wei, Donghai Li, Jiajie Peng, Xuanjing Huang, Chongde Shi, Yansong Feng, Yun Song, Zhongyu Wei

    Abstract: We give a detailed overview of the CAIL 2023 Argument Mining Track, one of the Chinese AI and Law Challenge (CAIL) 2023 tracks. The main goal of the track is to identify and extract interacting argument pairs in trial dialogs. It mainly uses summarized judgment documents but can also refer to trial recordings. The track consists of two stages, and we introduce the tasks designed for each stage; we…

    Submitted 20 June, 2024; originally announced June 2024.

  6. arXiv:2406.11683  [pdf, other]

    cs.CL

    HoLLMwood: Unleashing the Creativity of Large Language Models in Screenwriting via Role Playing

    Authors: Jing Chen, Xinyu Zhu, Cheng Yang, Chufan Shi, Yadong Xi, Yuxiang Zhang, Junjie Wang, Jiashu Pu, Rongsheng Zhang, Yujiu Yang, Tian Feng

    Abstract: Generative AI has demonstrated unprecedented creativity in the field of computer vision, yet such phenomena have not been observed in natural language processing. In particular, large language models (LLMs) can hardly produce written works at the level of human experts due to the extremely high complexity of literature writing. In this paper, we present HoLLMwood, an automated framework for unleas…

    Submitted 17 June, 2024; originally announced June 2024.

  7. arXiv:2406.09961  [pdf, other]

    cs.SE cs.CL cs.CV

    ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

    Authors: Chufan Shi, Cheng Yang, Yaxin Liu, Bo Shui, Junjie Wang, Mohan Jing, Linran Xu, Xinyu Zhu, Siheng Li, Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang

    Abstract: We introduce a new benchmark, ChartMimic, aimed at assessing the visually-grounded code generation capabilities of large multimodal models (LMMs). ChartMimic utilizes information-intensive visual charts and textual instructions as inputs, requiring LMMs to generate the corresponding code for chart rendering. ChartMimic includes 1,000 human-curated (figure, instruction, code) triplets, which repres…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Data and code are available at https://github.com/ChartMimic/ChartMimic

  8. arXiv:2406.08909  [pdf, other]

    cs.CV

    A Label-Free and Non-Monotonic Metric for Evaluating Denoising in Event Cameras

    Authors: Chenyang Shi, Shasha Guo, Boyi Wei, Hanxiao Liu, Yibo Zhang, Ningfang Song, Jing Jin

    Abstract: Event cameras are renowned for their high efficiency due to outputting a sparse, asynchronous stream of events. However, they are plagued by noisy events, especially in low light conditions. Denoising is an essential task for event cameras, but evaluating denoising performance is challenging. Label-dependent denoising metrics involve artificially adding noise to clean sequences, complicating evalu…

    Submitted 13 June, 2024; originally announced June 2024.

  9. arXiv:2406.06925  [pdf, other]

    cs.LG cs.IR

    Non-autoregressive Personalized Bundle Generation

    Authors: Wenchuan Yang, Cheng Yang, Jichao Li, Yuejin Tan, Xin Lu, Chuan Shi

    Abstract: The personalized bundle generation problem, which aims to create a preferred bundle for user from numerous candidate items, receives increasing attention in recommendation. However, existing works ignore the order-invariant nature of the bundle and adopt sequential modeling methods as the solution, which might introduce inductive bias and cause a large latency in prediction. To address this proble…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to Information Processing & Management

  10. arXiv:2406.06391  [pdf, other]

    cs.LG cs.CL

    Towards Lifelong Learning of Large Language Models: A Survey

    Authors: Junhao Zheng, Shengjie Qiu, Chengming Shi, Qianli Ma

    Abstract: As the applications of large language models (LLMs) expand across diverse fields, the ability of these models to adapt to ongoing changes in data, tasks, and user preferences becomes crucial. Traditional training methods, relying on static datasets, are increasingly inadequate for coping with the dynamic nature of real-world information. Lifelong learning, also known as continual or incremental le…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 37 pages

  11. arXiv:2406.03510  [pdf, other]

    cs.SD cs.AI eess.AS

    Speech-based Clinical Depression Screening: An Empirical Study

    Authors: Yangbin Chen, Chenyang Xu, Chunfeng Liang, Yanbao Tao, Chuan Shi

    Abstract: This study investigates the utility of speech signals for AI-based depression screening across varied interaction scenarios, including psychiatric interviews, chatbot conversations, and text readings. Participants include depressed patients recruited from the outpatient clinics of Peking University Sixth Hospital and control group members from the community, all diagnosed by psychiatrists followin…

    Submitted 12 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: 5 pages, 3 figures

  12. arXiv:2406.03488  [pdf, other]

    cs.DC

    Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training

    Authors: Ao Sun, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun

    Abstract: The emergence of large language models (LLMs) relies heavily on distributed training strategies, among which pipeline parallelism plays a crucial role. As LLMs' training sequence length extends to 32k or even 128k, the current pipeline parallel methods face severe bottlenecks, including high memory footprints and substantial pipeline bubbles, greatly hindering model scalability and training throug…

    Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

    Comments: 12 pages, 4 figures, 6 tables

  13. arXiv:2406.00317  [pdf, other]

    stat.ML cs.LG stat.ME

    Combining Experimental and Historical Data for Policy Evaluation

    Authors: Ting Li, Chengchun Shi, Qianglin Wen, Yang Sui, Yongli Qin, Chunbo Lai, Hongtu Zhu

    Abstract: This paper studies policy evaluation with multiple data sources, especially in scenarios that involve one experimental dataset with two arms, complemented by a historical dataset generated under a single control arm. We propose novel data integration methods that linearly integrate base policy value estimators constructed based on the experimental and historical data, with weights optimized to min…

    Submitted 1 June, 2024; originally announced June 2024.

  14. arXiv:2405.18324  [pdf, other]

    cs.RO

    Value Alignment and Trust in Human-Robot Interaction: Insights from Simulation and User Study

    Authors: Shreyas Bhat, Joseph B. Lyons, Cong Shi, X. Jessie Yang

    Abstract: With the advent of AI technologies, humans and robots are increasingly teaming up to perform collaborative tasks. To enable smooth and effective collaboration, the topic of value alignment (operationalized herein as the degree of dynamic goal alignment within a task) between the robot and the human is gaining increasing research attention. Prior literature on value alignment makes an inherent assu…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: This is a preprint of the following chapter: Bhat et al., Value Alignment and Trust in Human-Robot Interaction: Insights from Simulation and User Study, published in "Emerging Frontiers in Human-Robot Interaction", edited by Ramana Kumar Vinjamuri, 2024, Springer Nature reproduced with permission of Springer Nature. The final authenticated version is available online at: [INSERT LINK HERE]

  15. arXiv:2405.15627  [pdf, other]

    physics.class-ph cs.CE

    The Scattering Matrix-Based Characteristic Mode for Structure amidst Arbitrary Background: Theory, Benchmark and Applications

    Authors: Chenbo Shi, Jin Pan, Xin Gu, Shichen Liang, Le Zuo

    Abstract: This paper presents a novel approach for computing substructure characteristic modes. This method leverages electromagnetic scattering matrices and spherical wave expansion to directly decompose electromagnetic fields. Unlike conventional methods that rely on the impedance matrix generated by the method of moments (MoM), our technique simplifies the problem into a small-scale ordinary eigenvalue p…

    Submitted 24 May, 2024; originally announced May 2024.

  16. arXiv:2405.14507  [pdf, other]

    cs.CL cs.LG

    Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast

    Authors: Chufan Shi, Cheng Yang, Xinyu Zhu, Jiahao Wang, Taiqiang Wu, Siheng Li, Deng Cai, Yujiu Yang, Yu Meng

    Abstract: Mixture-of-Experts (MoE) has emerged as a prominent architecture for scaling model size while maintaining computational efficiency. In MoE, each token in the input sequence activates a different subset of experts determined by a routing mechanism. However, the unchosen experts in MoE models do not contribute to the output, potentially leading to underutilization of the model's capacity. In this wo…

    Submitted 23 May, 2024; originally announced May 2024.

  17. arXiv:2405.09770  [pdf]

    cs.CL cs.AI

    Optimization Techniques for Sentiment Analysis Based on LLM (GPT-3)

    Authors: Tong Zhan, Chenxi Shi, Yadong Shi, Huixiang Li, Yiyu Lin

    Abstract: With the rapid development of natural language processing (NLP) technology, large-scale pre-trained language models such as GPT-3 have become a popular research object in NLP field. This paper aims to explore sentiment analysis optimization techniques based on large pre-trained language models such as GPT-3 to improve model performance and effect and further promote the development of natural lang…

    Submitted 15 May, 2024; originally announced May 2024.

  18. arXiv:2405.06655  [pdf]

    q-bio.BM cs.AI cs.LG

    RNA Secondary Structure Prediction Using Transformer-Based Deep Learning Models

    Authors: Yanlin Zhou, Tong Zhan, Yichao Wu, Bo Song, Chenxi Shi

    Abstract: The Human Genome Project has led to an exponential increase in data related to the sequence, structure, and function of biomolecules. Bioinformatics is an interdisciplinary research field that primarily uses computational methods to analyze large amounts of biological macromolecule data. Its goal is to discover hidden biological patterns and related information. Furthermore, analysing additional r…

    Submitted 14 April, 2024; originally announced May 2024.

  19. arXiv:2405.05288  [pdf, other]

    cs.SI cs.IR cs.LG

    Learning Social Graph for Inactive User Recommendation

    Authors: Nian Liu, Shen Fan, Ting Bai, Peng Wang, Mingwei Sun, Yanhu Mo, Xiaoxiao Xu, Hong Liu, Chuan Shi

    Abstract: Social relations have been widely incorporated into recommender systems to alleviate data sparsity problem. However, raw social relations don't always benefit recommendation due to their inferior quality and insufficient quantity, especially for inactive users, whose interacted items are limited. In this paper, we propose a novel social recommendation method called LSIR (Learning …

    Submitted 22 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: This paper has been received by DASFAA 2024

  20. arXiv:2405.03318  [pdf, other]

    cs.CV cs.MM

    Enhancing DETRs Variants through Improved Content Query and Similar Query Aggregation

    Authors: Yingying Zhang, Chuangji Shi, Xin Guo, Jiangwei Lao, Jian Wang, Jiaotuan Wang, Jingdong Chen

    Abstract: The design of the query is crucial for the performance of DETR and its variants. Each query consists of two components: a content part and a positional one. Traditionally, the content query is initialized with a zero or learnable embedding, lacking essential content information and resulting in sub-optimal performance. In this paper, we introduce a novel plug-and-play module, Self-Adaptive Content…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 11 pages, 7 figures

  21. arXiv:2405.03113  [pdf, other]

    cs.RO cs.AI

    Robot Air Hockey: A Manipulation Testbed for Robot Learning with Reinforcement Learning

    Authors: Caleb Chuck, Carl Qi, Michael J. Munje, Shuozhe Li, Max Rudolph, Chang Shi, Siddhant Agarwal, Harshit Sikchi, Abhinav Peri, Sarthak Dayal, Evan Kuo, Kavan Mehta, Anthony Wang, Peter Stone, Amy Zhang, Scott Niekum

    Abstract: Reinforcement Learning is a promising tool for learning complex policies even in fast-moving and object-interactive domains where human teleoperation or hard-coded policies might fail. To effectively reflect this challenging category of tasks, we introduce a dynamic, interactive RL testbed based on robot air hockey. By augmenting air hockey with a large family of tasks ranging from easy tasks like…

    Submitted 5 May, 2024; originally announced May 2024.

  22. arXiv:2405.01102  [pdf, other]

    cs.LG cs.AI

    Less is More: on the Over-Globalizing Problem in Graph Transformers

    Authors: Yujie Xing, Xiao Wang, Yibo Li, Hai Huang, Chuan Shi

    Abstract: Graph Transformer, due to its global attention mechanism, has emerged as a new tool in dealing with graph-structured data. It is well recognized that the global attention mechanism considers a wider receptive field in a fully connected graph, leading many to believe that useful information can be extracted from all the nodes. In this paper, we challenge this belief: does the globalizing property a…

    Submitted 24 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML 2024 (Camera-Ready)

  23. arXiv:2405.00263  [pdf, other]

    cs.CL cs.AI cs.LG

    Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge

    Authors: Bin Xiao, Chunan Shi, Xiaonan Nie, Fan Yang, Xiangwei Deng, Lei Su, Weipeng Chen, Bin Cui

    Abstract: Large language models (LLMs) suffer from low efficiency as the mismatch between the requirement of auto-regressive decoding and the design of most contemporary GPUs. Specifically, billions to trillions of parameters must be loaded to the GPU cache through its limited memory bandwidth for computation, but only a small batch of tokens is actually computed. Consequently, the GPU spends most of its ti…

    Submitted 30 April, 2024; originally announced May 2024.

  24. arXiv:2404.17609  [pdf, other]

    cs.LG cs.AI cs.CL

    CoSD: Collaborative Stance Detection with Contrastive Heterogeneous Topic Graph Learning

    Authors: Yinghan Cheng, Qi Zhang, Chongyang Shi, Liang Xiao, Shufeng Hao, Liang Hu

    Abstract: Stance detection seeks to identify the viewpoints of individuals either in favor or against a given target or a controversial topic. Current advanced neural models for stance detection typically employ fully parametric softmax classifiers. However, these methods suffer from several limitations, including lack of explainability, insensitivity to the latent data structure, and unimodality, which gre…

    Submitted 19 June, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 13 pages

  25. arXiv:2404.11957  [pdf, other]

    cs.CV

    The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models

    Authors: Cheng Shi, Sibei Yang

    Abstract: Foundation models, pre-trained on a large amount of data have demonstrated impressive zero-shot capabilities in various downstream tasks. However, in object detection and instance segmentation, two fundamental computer vision tasks heavily reliant on extensive human annotations, foundation models such as SAM and DINO struggle to achieve satisfactory performance. In this study, we reveal that the d…

    Submitted 18 April, 2024; originally announced April 2024.

    Comments: ICLR2024, Code is released at https://github.com/ChengShiest/Zip-Your-CLIP

  26. arXiv:2404.06012  [pdf, other]

    cs.CV cs.RO

    Diffusion-Based Point Cloud Super-Resolution for mmWave Radar Data

    Authors: Kai Luan, Chenghao Shi, Neng Wang, Yuwei Cheng, Huimin Lu, Xieyuanli Chen

    Abstract: The millimeter-wave radar sensor maintains stable performance under adverse environmental conditions, making it a promising solution for all-weather perception tasks, such as outdoor mobile robotics. However, the radar point clouds are relatively sparse and contain massive ghost points, which greatly limits the development of mmWave radar technology. In this paper, we propose a novel point cloud s…

    Submitted 9 April, 2024; originally announced April 2024.

    Journal ref: Proc. of the IEEE Intl. Conf. on Robotics & Automation (ICRA), 2024

  27. arXiv:2404.00903  [pdf]

    cs.IR cs.AI

    Maximizing User Experience with LLMOps-Driven Personalized Recommendation Systems

    Authors: Chenxi Shi, Penghao Liang, Yichao Wu, Tong Zhan, Zhengyu Jin

    Abstract: The integration of LLMOps into personalized recommendation systems marks a significant advancement in managing LLM-driven applications. This innovation presents both opportunities and challenges for enterprises, requiring specialized teams to navigate the complexity of engineering technology while prioritizing data security and model interpretability. By leveraging LLMOps, enterprises can enhance…

    Submitted 1 April, 2024; originally announced April 2024.

  28. arXiv:2403.17285  [pdf, other]

    stat.ML cs.LG

    An Analysis of Switchback Designs in Reinforcement Learning

    Authors: Qianglin Wen, Chengchun Shi, Ying Yang, Niansheng Tang, Hongtu Zhu

    Abstract: This paper offers a detailed investigation of switchback designs in A/B testing, which alternate between baseline and new policies over time. Our aim is to thoroughly evaluate the effects of these designs on the accuracy of their resulting average treatment effect (ATE) estimators. We propose a novel "weak signal analysis" framework, which substantially simplifies the calculations of the mean squa…

    Submitted 25 March, 2024; originally announced March 2024.

  29. arXiv:2403.15831  [pdf, other]

    cs.CV

    Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking

    Authors: Shaoyu Sun, Chunyang Wang, Xuelian Liu, Chunhao Shi, Yueyang Ding, Guan Xi

    Abstract: 3D single object tracking within LIDAR point clouds is a pivotal task in computer vision, with profound implications for autonomous driving and robotics. However, existing methods, which depend solely on appearance matching via Siamese networks or utilize motion information from successive frames, encounter significant challenges. Issues such as similar objects nearby or occlusions can result in t…

    Submitted 23 March, 2024; originally announced March 2024.

    Comments: 18 pages, 6 figures

  30. arXiv:2403.12768  [pdf, other]

    cs.HC

    ContextVis: Envision Contextual Learning and Interaction with Generative Models

    Authors: Bo Shui, Chufan Shi, Yujiu Yang, Xiaomei Nie

    Abstract: ContextVis introduces a workflow by integrating generative models to create contextual learning materials. It aims to boost knowledge acquisition through the creation of resources with contextual cues. A case study on vocabulary learning demonstrates the effectiveness of generative models in developing educational resources that enrich language understanding and aid memory retention. The system co…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted by HCII 2024

  31. arXiv:2403.12582  [pdf, other]

    cs.CL

    AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework

    Authors: Xiang Li, Zhenyu Li, Chen Shi, Yong Xu, Qing Du, Mingkui Tan, Jun Huang, Wei Lin

    Abstract: The task of financial analysis primarily encompasses two key areas: stock trend prediction and the corresponding financial question answering. Currently, machine learning and deep learning algorithms (ML&DL) have been widely applied for stock trend predictions, leading to significant progress. However, these methods fail to provide reasons for predictions, lacking interpretability and reasoning pr…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: COLING 2024. The first three authors contributed equally. Project website: https://github.com/AlphaFin-proj/AlphaFin

  32. arXiv:2403.12474  [pdf, other]

    cs.LG cs.CY

    FairSIN: Achieving Fairness in Graph Neural Networks through Sensitive Information Neutralization

    Authors: Cheng Yang, Jixi Liu, Yunhe Yan, Chuan Shi

    Abstract: Despite the remarkable success of graph neural networks (GNNs) in modeling graph-structured data, like other machine learning models, GNNs are also susceptible to making biased predictions based on sensitive attributes, such as race and gender. For fairness consideration, recent state-of-the-art (SOTA) methods propose to filter out sensitive information from inputs or representations, e.g., edge d…

    Submitted 19 March, 2024; originally announced March 2024.

  33. arXiv:2403.11873  [pdf, other]

    cs.CL

    CO3: Low-resource Contrastive Co-training for Generative Conversational Query Rewrite

    Authors: Yifei Yuan, Chen Shi, Runze Wang, Liyi Chen, Renjun Hu, Zengming Zhang, Feijun Jiang, Wai Lam

    Abstract: Generative query rewrite generates reconstructed query rewrites using the conversation history while rely heavily on gold rewrite pairs that are expensive to obtain. Recently, few-shot learning is gaining increasing popularity for this task, whereas these methods are sensitive to the inherent noise due to limited data size. Besides, both attempts face performance degradation when there exists lang…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: Accepted to COLING 2024

  34. arXiv:2403.11841  [pdf, other]

    stat.ML cs.AI cs.LG

    Pessimistic Causal Reinforcement Learning with Mediators for Confounded Offline Data

    Authors: Danyang Wang, Chengchun Shi, Shikai Luo, Will Wei Sun

    Abstract: In real-world scenarios, datasets collected from randomized experiments are often constrained by size, due to limitations in time and budget. As a result, leveraging large observational datasets becomes a more attractive option for achieving high-quality policy learning. However, most existing offline reinforcement learning (RL) methods depend on two key assumptions--unconfoundedness and positivit…

    Submitted 18 March, 2024; originally announced March 2024.

  35. arXiv:2403.11435  [pdf, other]

    cs.CL

    InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions

    Authors: Yifan Wang, Yafei Liu, Chufan Shi, Haoling Li, Chen Chen, Haonan Lu, Yujiu Yang

    Abstract: Instruction tuning effectively optimizes Large Language Models (LLMs) for downstream tasks. Due to the changing environment in real-life applications, LLMs necessitate continual task-specific adaptation without catastrophic forgetting. Considering the heavy computational cost, replay-based Continual Learning (CL) methods are the simplest and most widely used for LLMs to address the forgetting issu…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by NAACL 2024

  36. arXiv:2403.09552  [pdf, other]

    cs.HC

    "Are You Really Sure?" Understanding the Effects of Human Self-Confidence Calibration in AI-Assisted Decision Making

    Authors: Shuai Ma, Xinru Wang, Ying Lei, Chuhan Shi, Ming Yin, Xiaojuan Ma

    Abstract: In AI-assisted decision-making, it is crucial but challenging for humans to achieve appropriate reliance on AI. This paper approaches this problem from a human-centered perspective, "human self-confidence calibration". We begin by proposing an analytical framework to highlight the importance of calibrated human self-confidence. In our first study, we explore the relationship between human self-con…

    Submitted 14 March, 2024; originally announced March 2024.

  37. arXiv:2403.09347  [pdf, other]

    cs.DC cs.LG

    BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences

    Authors: Ao Sun, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun

    Abstract: Effective attention modules have played a crucial role in the success of Transformer-based large language models (LLMs), but the quadratic time and memory complexities of these attention modules also pose a challenge when processing long sequences. One potential solution for the long sequence problem is to utilize distributed clusters to parallelize the computation of attention modules across mult…

    Submitted 6 June, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: 13 pages, 7 figures

  38. arXiv:2403.08217  [pdf]

    cs.CL cs.LG

    Research on the Application of Deep Learning-based BERT Model in Sentiment Analysis

    Authors: Yichao Wu, Zhengyu Jin, Chenxi Shi, Penghao Liang, Tong Zhan

    Abstract: This paper explores the application of deep learning techniques, particularly focusing on BERT models, in sentiment analysis. It begins by introducing the fundamental concept of sentiment analysis and how deep learning methods are utilized in this domain. Subsequently, it delves into the architecture and characteristics of BERT models. Through detailed explanation, it elucidates the application ef…

    Submitted 12 March, 2024; originally announced March 2024.

  39. arXiv:2403.07578  [pdf, other]

    cs.CV

    AACP: Aesthetics assessment of children's paintings based on self-supervised learning

    Authors: Shiqi Jiang, Ning Li, Chen Shi, Liping Guo, Changbo Wang, Chenhui Li

    Abstract: The Aesthetics Assessment of Children's Paintings (AACP) is an important branch of the image aesthetics assessment (IAA), playing a significant role in children's education. This task presents unique challenges, such as limited available data and the requirement for evaluation metrics from multiple perspectives. However, previous approaches have relied on training large datasets and subsequently p…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: AAAI 2024

  40. FARPLS: A Feature-Augmented Robot Trajectory Preference Labeling System to Assist Human Labelers' Preference Elicitation

    Authors: Hanfang Lyu, Yuanchen Bai, Xin Liang, Ujaan Das, Chuhan Shi, Leiliang Gong, Yingchi Li, Mingfei Sun, Ming Ge, Xiaojuan Ma

    Abstract: Preference-based learning aims to align robot task objectives with human values. One of the most common methods to infer human preferences is by pairwise comparisons of robot task trajectories. Traditional comparison-based preference labeling systems seldom support labelers to digest and identify critical differences between complex trajectories recorded in videos. Our formative study (N = 12) sug…

    Submitted 10 March, 2024; originally announced March 2024.

    Comments: Accepted to ACM Conference on Intelligent User Interfaces (IUI) 2024, March 18-21, 2024, Greenville, SC, USA

  41. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  42. arXiv:2403.05478  [pdf, other]

    cs.RO

    HGIC: A Hand Gesture Based Interactive Control System for Efficient and Scalable Multi-UAV Operations

    Authors: Mengsha Hu, Jinzhou Li, Runxiang Jin, Chao Shi, Lei Xu, Rui Liu

    Abstract: As technological advancements continue to expand the capabilities of multi unmanned-aerial-vehicle systems (mUAV), human operators face challenges in scalability and efficiency due to the complex cognitive load and operations associated with motion adjustments and team coordination. Such cognitive demands limit the feasible size of mUAV teams and necessitate extensive operator training, impeding b…

    Submitted 8 March, 2024; originally announced March 2024.

  43. arXiv:2403.03599  [pdf, other]

    cs.LG

    Learning Invariant Representations of Graph Neural Networks via Cluster Generalization

    Authors: Donglin Xia, Xiao Wang, Nian Liu, Chuan Shi

    Abstract: Graph neural networks (GNNs) have become increasingly popular in modeling graph-structured data due to their ability to learn node representations by aggregating local structure information. However, it is widely acknowledged that the test graph structure may differ from the training graph structure, resulting in a structure shift. In this paper, we experimentally find that the performance of GNNs…

    Submitted 6 March, 2024; originally announced March 2024.

  44. Minimum Topology Attacks for Graph Neural Networks

    Authors: Mengmei Zhang, Xiao Wang, Chuan Shi, Lingjuan Lyu, Tianchi Yang, Junping Du

    Abstract: With the great popularity of Graph Neural Networks (GNNs), their robustness to adversarial topology attacks has received significant attention. Although many attack methods have been proposed, they mainly focus on fixed-budget attacks, aiming at finding the most adversarial perturbations within a fixed budget for target node. However, considering the varied robustness of each node, there is an ine…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Published on WWW 2023. Proceedings of the ACM Web Conference 2023

  45. arXiv:2402.18172  [pdf, other]

    cs.CV

    NiteDR: Nighttime Image De-Raining with Cross-View Sensor Cooperative Learning for Dynamic Driving Scenes

    Authors: Cidan Shi, Lihuang Fang, Han Wu, Xiaoyu Xian, Yukai Shi, Liang Lin

    Abstract: In real-world environments, outdoor imaging systems are often affected by disturbances such as rain degradation. In nighttime driving scenes especially, insufficient and uneven lighting shrouds the scenes in darkness, resulting in degradation of both image quality and visibility. Particularly, in the field of autonomous driving, the visual perception ability of RGB sensors experiences a sharp de…

    Submitted 7 April, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

  46. arXiv:2402.17263  [pdf, other]

    cs.CL

    MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning

    Authors: Pengjie Ren, Chengshun Shi, Shiguang Wu, Mengqi Zhang, Zhaochun Ren, Maarten de Rijke, Zhumin Chen, Jiahuan Pei

    Abstract: Parameter-efficient fine-tuning (PEFT) is a popular method for tailoring pre-trained large language models (LLMs), especially as the models' scale and the diversity of tasks increase. Low-rank adaptation (LoRA) is based on the idea that the adaptation process is intrinsically low-dimensional, i.e., significant model changes can be represented with relatively few parameters. However, decreasing the…

    Submitted 24 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: ACL2024

    MSC Class: 68T50 ACM Class: I.2.7

  47. arXiv:2402.14834  [pdf, other]

    cs.CL cs.AI cs.IR

    MSynFD: Multi-hop Syntax aware Fake News Detection

    Authors: Liang Xiao, Qi Zhang, Chongyang Shi, Shoujin Wang, Usman Naseem, Liang Hu

    Abstract: The proliferation of social media platforms has fueled the rapid dissemination of fake news, posing threats to our real-life society. Existing methods use multimodal data or contextual information to enhance the detection of fake news by analyzing news content and/or its social context. However, these methods often overlook essential textual news content (articles) and heavily rely on sequential m…

    Submitted 19 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: 10 pages

  48. arXiv:2402.12161  [pdf, other]

    cs.LG cs.AI cs.CY cs.SI

    Endowing Pre-trained Graph Models with Provable Fairness

    Authors: Zhongjian Zhang, Mengmei Zhang, Yue Yu, Cheng Yang, Jiawei Liu, Chuan Shi

    Abstract: Pre-trained graph models (PGMs) aim to capture transferable inherent structural properties and apply them to different downstream tasks. Similar to pre-trained language models, PGMs also inherit biases from human society, resulting in discriminatory behavior in downstream applications. The debiasing process of existing fair methods is generally coupled with parameter optimization of GNNs. However,…

    Submitted 20 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted by WWW 2024

  49. arXiv:2402.10433  [pdf, other]

    q-bio.BM cs.LG q-bio.QM

    Fusing Neural and Physical: Augment Protein Conformation Sampling with Tractable Simulations

    Authors: Jiarui Lu, Zuobai Zhang, Bozitao Zhong, Chence Shi, Jian Tang

    Abstract: Protein dynamics are common and important for their biological functions and properties, the study of which usually involves time-consuming molecular dynamics (MD) simulations in silico. Recently, generative models have been leveraged as a surrogate sampler to obtain conformation ensembles orders of magnitude faster and without requiring any simulation data (a "zero-shot" inference). Howev…

    Submitted 11 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: Published at the GEM workshop, ICLR 2024

  50. arXiv:2402.09723  [pdf, other]

    stat.ML cs.AI cs.CL cs.LG

    Efficient Prompt Optimization Through the Lens of Best Arm Identification

    Authors: Chengshuai Shi, Kun Yang, Zihan Chen, Jundong Li, Jing Yang, Cong Shen

    Abstract: The remarkable instruction-following capability of large language models (LLMs) has sparked a growing interest in automatically finding good prompts, i.e., prompt optimization. Most existing works follow the scheme of selecting from a pre-generated pool of candidate prompts. However, these designs mainly focus on the generation strategy, while limited attention has been paid to the selection metho…

    Submitted 30 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.