Showing 1–50 of 7,355 results for author: li, Y

Searching in archive cs.
  1. arXiv:2407.03320  [pdf, other]

    cs.CV cs.CL

    InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

    Authors: Pan Zhang, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Rui Qian, Lin Chen, Qipeng Guo, Haodong Duan, Bin Wang, Linke Ouyang, Songyang Zhang, Wenwei Zhang, Yining Li, Yang Gao, Peng Sun, Xinyue Zhang, Wei Li, Jingwen Li, Wenhai Wang, Hang Yan, Conghui He, Xingcheng Zhang, Kai Chen, Jifeng Dai, Yu Qiao, et al. (2 additional authors not shown)

    Abstract: We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large-vision language model that supports long-contextual input and output. IXC-2.5 excels in various text-image comprehension and composition applications, achieving GPT-4V level capabilities with merely 7B LLM backend. Trained with 24K interleaved image-text contexts, it can seamlessly extend to 96K long contexts via RoPE extrapolation. Th…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Technical Report. https://github.com/InternLM/InternLM-XComposer

  2. arXiv:2407.03217  [pdf, other]

    cs.CV

    MHNet: Multi-view High-order Network for Diagnosing Neurodevelopmental Disorders Using Resting-state fMRI

    Authors: Yueyang Li, Weiming Zeng, Wenhao Dong, Luhui Cai, Lei Wang, Hongyu Chen, Hongjie Yan, Lingbin Bian, Nizhuan Wang

    Abstract: Background: Deep learning models have shown promise in diagnosing neurodevelopmental disorders (NDD) like ASD and ADHD. However, many models either use graph neural networks (GNN) to construct single-level brain functional networks (BFNs) or employ spatial convolution filtering for local information extraction from rs-fMRI data, often neglecting high-order features crucial for NDD classification.…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 18 pages

  3. arXiv:2407.03159  [pdf, other]

    cs.SI eess.SY physics.soc-ph

    Protection Degree and Migration in the Stochastic SIRS Model: A Queueing System Perspective

    Authors: Yuhan Li, Ziyan Zeng, Minyu Feng, Jürgen Kurths

    Abstract: With the prevalence of COVID-19, the modeling of epidemic propagation and its analyses have played a significant role in controlling epidemics. However, individual behaviors, in particular the self-protection and migration, which have a strong influence on epidemic propagation, were always neglected in previous studies. In this paper, we mainly propose two models from the individual and population…

    Submitted 3 July, 2024; originally announced July 2024.

  4. Efficient IoT Devices Localization Through Wi-Fi CSI Feature Fusion and Anomaly Detection

    Authors: Yan Li, Jie Yang, Shang-Ling Shih, Wan-Ting Shih, Chao-Kai Wen, Shi Jin

    Abstract: Internet of Things (IoT) device localization is fundamental to smart home functionalities, including indoor navigation and tracking of individuals. Traditional localization relies on relative methods utilizing the positions of anchors within a home environment, yet struggles with precision due to inherent inaccuracies in these anchor positions. In response, we introduce a cutting-edge smartphone-b…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted in IEEE Internet of Things Journal, Early Access, 2024

    Journal ref: IEEE Internet of Things Journal, Early Access, 2024

  5. arXiv:2407.02827  [pdf, ps, other]

    cs.LG math.OC

    Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks

    Authors: Xianliang Xu, Zhongyi Huang, Ye Li

    Abstract: Optimization algorithms is crucial in training physics-informed neural networks (PINNs), unsuitable methods may lead to poor solutions. Compared to the common gradient descent algorithm, implicit gradient descent (IGD) outperforms it in handling some multi-scale problems. In this paper, we provide convergence analysis for the implicit gradient descent for training over-parametrized two-layer PINNs…

    Submitted 3 July, 2024; originally announced July 2024.

  6. arXiv:2407.02783  [pdf, ps, other]

    cs.CL cs.AI

    52B to 1T: Lessons Learned via Tele-FLM Series

    Authors: Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang, Shuangyong Song, Yongxiang Li, Zheng Zhang, Bo Zhao, Aixin Sun, Yequan Wang, Zhongjiang He, Zhongyuan Wang, Xuelong Li, Tiejun Huang

    Abstract: Large Language Models (LLMs) represent a significant stride toward Artificial General Intelligence. As scaling laws underscore the potential of increasing model sizes, the academic community has intensified its investigations into LLMs with capacities exceeding 50 billion parameters. This technical report builds on our prior work with Tele-FLM (also known as FLM-2), a publicly available 52-billion…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: For the Tele-FLM-52B tech report, see also 2404.16645

  7. arXiv:2407.02490  [pdf, other]

    cs.CL cs.LG

    MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention

    Authors: Huiqiang Jiang, Yucheng Li, Chengruidong Zhang, Qianhui Wu, Xufang Luo, Surin Ahn, Zhenhua Han, Amir H. Abdi, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu

    Abstract: The computational challenges of Large Language Model (LLM) inference remain a significant barrier to their widespread deployment, especially as prompt lengths continue to increase. Due to the quadratic complexity of the attention computation, it takes 30 minutes for an 8B LLM to process a prompt of 1M tokens (i.e., the pre-filling stage) on a single A100 GPU. Existing methods for speeding up prefi…

    Submitted 2 July, 2024; originally announced July 2024.

  8. arXiv:2407.02489  [pdf, other]

    cs.CV cs.AI cs.GR cs.HC cs.LG

    Magic Insert: Style-Aware Drag-and-Drop

    Authors: Nataniel Ruiz, Yuanzhen Li, Neal Wadhwa, Yael Pritch, Michael Rubinstein, David E. Jacobs, Shlomi Fruchter

    Abstract: We present Magic Insert, a method for dragging-and-dropping subjects from a user-provided image into a target image of a different style in a physically plausible manner while matching the style of the target image. This work formalizes the problem of style-aware drag-and-drop and presents a method for tackling it by addressing two sub-problems: style-aware personalization and realistic object ins…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Project page: https://magicinsert.github.io/

  9. arXiv:2407.02329  [pdf, other]

    cs.CV

    MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis

    Authors: Dewei Zhou, You Li, Fan Ma, Zongxin Yang, Yi Yang

    Abstract: We introduce the Multi-Instance Generation (MIG) task, which focuses on generating multiple instances within a single image, each accurately placed at predefined positions with attributes such as category, color, and shape, strictly following user specifications. MIG faces three main challenges: avoiding attribute leakage between instances, supporting diverse instance descriptions, and maintaining…

    Submitted 2 July, 2024; originally announced July 2024.

  10. arXiv:2407.02056  [pdf, other]

    cs.CL cs.AI

    Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation

    Authors: Xinglin Wang, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Boyuan Pan, Heda Wang, Yao Hu, Kan Li

    Abstract: Self-consistency (SC), leveraging multiple samples from LLMs, shows significant gains on various reasoning tasks but struggles with free-form generation due to the difficulty of aggregating answers. Its variants, UCS and USC, rely on sample selection or voting mechanisms to improve output quality. These methods, however, face limitations due to their inability to fully utilize the nuanced consensu…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to ACL2024 Main Conference

  11. arXiv:2407.02052  [pdf, other]

    eess.AS cs.SD

    The USTC-NERCSLIP Systems for The ICMC-ASR Challenge

    Authors: Minghui Wu, Luzhen Xu, Jie Zhang, Haitao Tang, Yanyan Yue, Ruizhi Liao, Jintao Zhao, Zhengzhe Zhang, Yichi Wang, Haoyin Yan, Hongliang Yu, Tongle Ma, Jiachen Liu, Chongliang Wu, Yongchao Li, Yanyong Zhang, Xin Fang, Yue Zhang

    Abstract: This report describes the submitted system to the In-Car Multi-Channel Automatic Speech Recognition (ICMC-ASR) challenge, which considers the ASR task with multi-speaker overlapping and Mandarin accent dynamics in the ICMC case. We implement the front-end speaker diarization using the self-supervised learning representation based multi-speaker embedding and beamforming using the speaker position,…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted at ICASSP 2024

  12. arXiv:2407.02043  [pdf, other]

    cs.CL

    Concise and Precise Context Compression for Tool-Using Language Models

    Authors: Yang Xu, Yunlong Feng, Honglin Mu, Yutai Hou, Yitong Li, Xinghao Wang, Wanjun Zhong, Zhongyang Li, Dandan Tu, Qingfu Zhu, Min Zhang, Wanxiang Che

    Abstract: Through reading the documentation in the context, tool-using language models can dynamically extend their capability using external tools. The cost is that we have to input lengthy documentation every time the model needs to use the tool, occupying the input window as well as slowing down the decoding process. Given the progress in general-purpose compression, soft context compression is a suita…

    Submitted 2 July, 2024; originally announced July 2024.

  13. arXiv:2407.01908  [pdf, other]

    eess.IV cs.CV

    Efficient Stochastic Differential Equation for DEM Super Resolution with Void Filling

    Authors: Tongtong Zhang, Zongcheng Zuo, Yuanxiang Li

    Abstract: Digital Elevation Model (DEM) plays a fundamental role in remote sensing and photogrammetry. Enhancing the quality of DEM is crucial for various applications. Although multiple types of defects may appear simultaneously in the same DEM, they are commonly addressed separately. Most existing approaches only aim to fill the DEM voids, or apply super-resolution to the intact DEM. This paper introduces…

    Submitted 1 July, 2024; originally announced July 2024.

  14. arXiv:2407.01790  [pdf, other]

    cs.CV cs.AI cs.LG

    Label-free Neural Semantic Image Synthesis

    Authors: Jiayi Wang, Kevin Alexander Laube, Yumeng Li, Jan Hendrik Metzen, Shin-I Cheng, Julio Borges, Anna Khoreva

    Abstract: Recent work has shown great progress in integrating spatial conditioning to control large, pre-trained text-to-image diffusion models. Despite these advances, existing methods describe the spatial image content using hand-crafted conditioning inputs, which are either semantically ambiguous (e.g., edges) or require expensive manual annotations (e.g., semantic segmentation). To address these limitat…

    Submitted 1 July, 2024; originally announced July 2024.

  15. arXiv:2407.01640  [pdf, other]

    cs.LG

    BADM: Batch ADMM for Deep Learning

    Authors: Ouya Wang, Shenglong Zhou, Geoffrey Ye Li

    Abstract: Stochastic gradient descent-based algorithms are widely used for training deep neural networks but often suffer from slow convergence. To address the challenge, we leverage the framework of the alternating direction method of multipliers (ADMM) to develop a novel data-driven algorithm, called batch ADMM (BADM). The fundamental idea of the proposed algorithm is to split the training data into batch…

    Submitted 30 June, 2024; originally announced July 2024.

  16. arXiv:2407.01621  [pdf, other]

    cs.LG q-bio.QM stat.ME stat.ML

    Deciphering interventional dynamical causality from non-intervention systems

    Authors: Jifan Shi, Yang Li, Juan Zhao, Siyang Leng, Kazuyuki Aihara, Luonan Chen, Wei Lin

    Abstract: Detecting and quantifying causality is a focal topic in the fields of science, engineering, and interdisciplinary studies. However, causal studies on non-intervention systems attract much attention but remain extremely challenging. To address this challenge, we propose a framework named Interventional Dynamical Causality (IntDC) for such non-intervention systems, along with its computational crite…

    Submitted 28 June, 2024; originally announced July 2024.

  17. arXiv:2407.01534  [pdf, other]

    cs.NI

    AIGC-Assisted Digital Watermark Services in Low-Earth Orbit Satellite-Terrestrial Edge Networks

    Authors: Kongyang Chen, Yikai Li, Wenjun Lan, Bing Mi, Shaowei Wang

    Abstract: Low Earth Orbit (LEO) satellite communication is a crucial component of future 6G communication networks, contributing to the development of an integrated satellite-terrestrial network. In the forthcoming satellite-to-ground network, the idle computational resources of LEO satellites can serve as edge servers, delivering intelligent task computation services to ground users. Existing research on s…

    Submitted 8 March, 2024; originally announced July 2024.

  18. arXiv:2407.01418  [pdf, other]

    cs.RO cs.AI cs.LG

    RoboPack: Learning Tactile-Informed Dynamics Models for Dense Packing

    Authors: Bo Ai, Stephen Tian, Haochen Shi, Yixuan Wang, Cheston Tan, Yunzhu Li, Jiajun Wu

    Abstract: Tactile feedback is critical for understanding the dynamics of both rigid and deformable objects in many manipulation tasks, such as non-prehensile manipulation and dense packing. We introduce an approach that combines visual and tactile sensing for robotic manipulation by learning a neural, tactile-informed dynamics model. Our proposed framework, RoboPack, employs a recurrent graph neural network…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Robotics: Science and Systems (RSS), 2024. Project page: https://robo-pack.github.io/

    ACM Class: I.2.9; I.2.6; I.2.10

  19. arXiv:2407.01330  [pdf, other]

    cs.CV

    Learning Unsigned Distance Fields from Local Shape Functions for 3D Surface Reconstruction

    Authors: Jiangbei Hu, Yanggeng Li, Fei Hou, Junhui Hou, Zhebin Zhang, Shengfa Wang, Na Lei, Ying He

    Abstract: Unsigned distance fields (UDFs) provide a versatile framework for representing a diverse array of 3D shapes, encompassing both watertight and non-watertight geometries. Traditional UDF learning methods typically require extensive training on large datasets of 3D shapes, which is costly and often necessitates hyperparameter adjustments for new datasets. This paper presents a novel neural framework,…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 14 pages, 11 figures

    ACM Class: I.3.5

  20. arXiv:2407.01142  [pdf, other]

    cs.CV cs.AI

    Integrated feature analysis for deep learning interpretation and class activation maps

    Authors: Yanli Li, Tahereh Hassanzadeh, Denis P. Shamonin, Monique Reijnierse, Annette H. M. van der Helm-van Mil, Berend C. Stoel

    Abstract: Understanding the decisions of deep learning (DL) models is essential for the acceptance of DL to risk-sensitive applications. Although methods, like class activation maps (CAMs), give a glimpse into the black box, they do miss some crucial information, thereby limiting its interpretability and merely providing the considered locations of objects. To provide more insight into the models and the in…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 13 pages, 11 figures, code available: https://github.com/YanliLi27/IFA This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  21. arXiv:2407.01046  [pdf, other]

    cs.AI cs.CL

    FRoG: Evaluating Fuzzy Reasoning of Generalized Quantifiers in Large Language Models

    Authors: Yiyuan Li, Shichao Sun, Pengfei Liu

    Abstract: Fuzzy reasoning is vital due to the frequent use of imprecise information in daily contexts. However, the ability of current large language models (LLMs) to handle such reasoning remains largely uncharted. In this paper, we introduce a new benchmark, FRoG, for fuzzy reasoning, featuring real-world mathematical word problems that incorporate generalized quantifiers. Our experimental findings reveal…

    Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

    Comments: Under review

  22. arXiv:2407.00968  [pdf, other]

    cs.LG

    How Does Overparameterization Affect Features?

    Authors: Ahmet Cagri Duzgun, Samy Jelassi, Yuanzhi Li

    Abstract: Overparameterization, the condition where models have more parameters than necessary to fit their training loss, is a crucial factor for the success of deep learning. However, the characteristics of the features learned by overparameterized networks are not well understood. In this work, we explore this question by comparing models with the same architecture but different widths. We first examine…

    Submitted 1 July, 2024; originally announced July 2024.

  23. arXiv:2407.00943  [pdf, other]

    cs.DC cs.LG

    FedEx: Expediting Federated Learning over Heterogeneous Mobile Devices by Overlapping and Participant Selection

    Authors: Jiaxiang Geng, Boyu Li, Xiaoqi Qin, Yixuan Li, Liang Li, Yanzhao Hou, Miao Pan

    Abstract: Training latency is critical for the success of numerous intrigued applications ignited by federated learning (FL) over heterogeneous mobile devices. By revolutionarily overlapping local gradient transmission with continuous local computing, FL can remarkably reduce its training latency over homogeneous clients, yet encounter severe model staleness, model drifts, memory cost and straggler issues i…

    Submitted 2 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

    Comments: 21 pages, 10 figures, Submitted to Sensys2024

  24. arXiv:2407.00942  [pdf, other]

    cs.IR cs.AI cs.CL

    ProductAgent: Benchmarking Conversational Product Search Agent with Asking Clarification Questions

    Authors: Jingheng Ye, Yong Jiang, Xiaobin Wang, Yinghui Li, Yangning Li, Hai-Tao Zheng, Pengjun Xie, Fei Huang

    Abstract: This paper introduces the task of product demand clarification within an e-commercial scenario, where the user commences the conversation with ambiguous queries and the task-oriented agent is designed to achieve more accurate and tailored product searching by asking clarification questions. To address this task, we propose ProductAgent, a conversational information seeking agent equipped with abil…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 17 pages, 13 tables, 6 figures. Under review

  25. arXiv:2407.00934  [pdf, other]

    cs.CL

    CLEME2.0: Towards More Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction

    Authors: Jingheng Ye, Zishan Xu, Yinghui Li, Xuxin Cheng, Linlin Song, Qingyu Zhou, Hai-Tao Zheng, Ying Shen, Xin Su

    Abstract: The paper focuses on improving the interpretability of Grammatical Error Correction (GEC) metrics, which receives little attention in previous studies. To bridge the gap, we propose CLEME2.0, a reference-based evaluation strategy that can describe four elementary dimensions of GEC systems, namely hit-correction, error-correction, under-correction, and over-correction. They collectively contribute…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 16 pages, 8 tables, 2 figures. Under review

  26. arXiv:2407.00924  [pdf, other]

    cs.CL

    EXCGEC: A Benchmark of Edit-wise Explainable Chinese Grammatical Error Correction

    Authors: Jingheng Ye, Shang Qin, Yinghui Li, Xuxin Cheng, Libo Qin, Hai-Tao Zheng, Peng Xing, Zishan Xu, Guo Cheng, Zhao Wei

    Abstract: Existing studies explore the explainability of Grammatical Error Correction (GEC) in a limited scenario, where they ignore the interaction between corrections and explanations. To bridge the gap, this paper introduces the task of EXplainable GEC (EXGEC), which focuses on the integral role of both correction and explanation tasks. To facilitate the task, we propose EXCGEC, a tailored benchmark for…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 22 pages, 10 tables, 9 figures. Under review

  27. arXiv:2407.00922  [pdf]

    cs.CY

    Staying vigilant in the Age of AI: From content generation to content authentication

    Authors: Yufan Li, Zhan Wang, Theo Papatheodorou

    Abstract: This paper presents the Yangtze Sea project, an initiative in the battle against Generative AI (GAI)-generated fake content. Addressing a pressing issue in the digital age, we investigate public reactions to AI-created fabrications through a structured experiment on a simulated academic conference platform. Our findings indicate a profound public challenge in discerning such content, highlighted…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: ISEA 2024 full paper https://isea2024.isea-international.org/academic-program/ conference paper, 8 pages

  28. arXiv:2407.00896  [pdf, other]

    eess.SP cs.AI

    Channel Modeling Aided Dataset Generation for AI-Enabled CSI Feedback: Advances, Challenges, and Solutions

    Authors: Yupeng Li, Gang Li, Zirui Wen, Shuangfeng Han, Shijian Gao, Guangyi Liu, Jiangzhou Wang

    Abstract: The AI-enabled autoencoder has demonstrated great potential in channel state information (CSI) feedback in frequency division duplex (FDD) multiple input multiple output (MIMO) systems. However, this method completely changes the existing feedback strategies, making it impractical to deploy in recent years. To address this issue, this paper proposes a channel modeling aided data augmentation metho…

    Submitted 30 June, 2024; originally announced July 2024.

  29. Self-consistent Deep Geometric Learning for Heterogeneous Multi-source Spatial Point Data Prediction

    Authors: Dazhou Yu, Xiaoyun Gong, Yun Li, Meikang Qiu, Liang Zhao

    Abstract: Multi-source spatial point data prediction is crucial in fields like environmental monitoring and natural resource management, where integrating data from various sensors is the key to achieving a holistic environmental understanding. Existing models in this area often fall short due to their domain-specific nature and lack a strategy for integrating information from various sources in the absence…

    Submitted 30 June, 2024; originally announced July 2024.

  30. PolygonGNN: Representation Learning for Polygonal Geometries with Heterogeneous Visibility Graph

    Authors: Dazhou Yu, Yuntong Hu, Yun Li, Liang Zhao

    Abstract: Polygon representation learning is essential for diverse applications, encompassing tasks such as shape coding, building pattern classification, and geographic question answering. While recent years have seen considerable advancements in this field, much of the focus has been on single polygons, overlooking the intricate inner- and inter-polygonal relationships inherent in multipolygons. To addres…

    Submitted 30 June, 2024; originally announced July 2024.

  31. arXiv:2407.00623  [pdf, other]

    cs.CV

    Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness

    Authors: Yiquan Li, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Bo Li, Chaowei Xiao

    Abstract: Diffusion Purification, purifying noised images with diffusion models, has been widely used for enhancing certified robustness via randomized smoothing. However, existing frameworks often grapple with the balance between efficiency and effectiveness. While the Denoising Diffusion Probabilistic Model (DDPM) offers an efficient single-step purification, it falls short in ensuring purified images res…

    Submitted 30 June, 2024; originally announced July 2024.

  32. arXiv:2407.00567  [pdf, other]

    cs.AI cs.LG

    A Contextual Combinatorial Bandit Approach to Negotiation

    Authors: Yexin Li, Zhancun Mu, Siyuan Qi

    Abstract: Learning effective negotiation strategies poses two key challenges: the exploration-exploitation dilemma and dealing with large action spaces. However, there is an absence of learning-based approaches that effectively address these challenges in negotiation. This paper introduces a comprehensive formulation to tackle various negotiation problems. Our approach leverages contextual combinatorial mul…

    Submitted 29 June, 2024; originally announced July 2024.

  33. arXiv:2407.00379  [pdf, other]

    cs.AI cs.CL

    GraphArena: Benchmarking Large Language Models on Graph Computational Problems

    Authors: Jianheng Tang, Qifan Zhang, Yuhan Li, Jia Li

    Abstract: The "arms race" of Large Language Models (LLMs) demands novel, challenging, and diverse benchmarks to faithfully examine their progresses. We introduce GraphArena, a benchmarking tool designed to evaluate LLMs on graph computational problems using million-scale real-world graphs from diverse scenarios such as knowledge graphs, social networks, and molecular structures. GraphArena offers a suite of…

    Submitted 29 June, 2024; originally announced July 2024.

  34. arXiv:2407.00352  [pdf, other]

    cs.CV cs.AI

    PhyTracker: An Online Tracker for Phytoplankton

    Authors: Yang Yu, Qingxuan Lv, Yuezun Li, Zhiqiang Wei, Junyu Dong

    Abstract: Phytoplankton, a crucial component of aquatic ecosystems, requires efficient monitoring to understand marine ecological processes and environmental conditions. Traditional phytoplankton monitoring methods, relying on non-in situ observations, are time-consuming and resource-intensive, limiting timely analysis. To address these limitations, we introduce PhyTracker, an intelligent in situ tracking f…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 13 pages, eleven figures

  35. arXiv:2407.00299  [pdf, other]

    cs.RO cs.AI cs.CV cs.HC cs.LG

    Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition

    Authors: Shengcheng Luo, Quanquan Peng, Jun Lv, Kaiwen Hong, Katherine Rose Driggs-Campbell, Cewu Lu, Yong-Lu Li

    Abstract: Employing a teleoperation system for gathering demonstrations offers the potential for more efficient learning of robot manipulation. However, teleoperating a robot arm equipped with a dexterous hand or gripper, via a teleoperation system poses significant challenges due to its high dimensionality, complex motions, and differences in physiological structure. In this study, we introduce a novel s…

    Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced July 2024.

    Comments: 8 pages, 6 figures

  36. arXiv:2407.00280  [pdf, other]

    eess.IV cs.CV

    IVCA: Inter-Relation-Aware Video Complexity Analyzer

    Authors: Junqi Liao, Yao Li, Zhuoyuan Li, Li Li, Dong Liu

    Abstract: To meet the real-time analysis requirements of video streaming applications, we propose an inter-relation-aware video complexity analyzer (IVCA) as an extension to VCA. The IVCA addresses the limitation of VCA by considering inter-frame relations, namely motion and reference structure. First, we enhance the accuracy of temporal features by introducing feature-domain motion estimation into the IVCA…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: The report for the solution of second prize winner in ICIP 2024 Grand Challenge on Video Complexity (Team: USTC-iVC_Team1, USTC-iVC_Team2)

  37. arXiv:2407.00132  [pdf, other]

    cs.SE cs.AI

    ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents

    Authors: Haiyang Shen, Yue Li, Desong Meng, Dongqi Cai, Sheng Qi, Li Zhang, Mengwei Xu, Yun Ma

    Abstract: Recent advancements in integrating large language models (LLMs) with application programming interfaces (APIs) have gained significant interest in both academia and industry. These API-based agents, leveraging the strong autonomy and planning capabilities of LLMs, can efficiently solve problems requiring multi-step actions. However, their ability to handle multi-dimensional difficulty levels, dive…

    Submitted 28 June, 2024; originally announced July 2024.

  38. arXiv:2407.00128  [pdf, other]

    cs.IR cs.AI cs.LG

    When Search Engine Services meet Large Language Models: Visions and Challenges

    Authors: Haoyi Xiong, Jiang Bian, Yuchen Li, Xuhong Li, Mengnan Du, Shuaiqiang Wang, Dawei Yin, Sumi Helal

    Abstract: Combining Large Language Models (LLMs) with search engine services marks a significant shift in the field of services computing, opening up new possibilities to enhance how we search for and retrieve information, understand content, and interact with internet services. This paper conducts an in-depth examination of how integrating LLMs with search engines can mutually benefit both technologies. We…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Under Review

  39. arXiv:2406.20085  [pdf, other]

    cs.CV

    Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

    Authors: Yicheng Chen, Xiangtai Li, Yining Li, Yanhong Zeng, Jianzong Wu, Xiangyu Zhao, Kai Chen

    Abstract: Diffusion-based models have shown great potential in generating high-quality images with various layouts, which can benefit downstream perception tasks. However, a fully automatic layout generation driven only by language and a suitable metric for measuring multiple generated instances has not been well explored. In this work, we present Auto Cherry-Picker (ACP), a novel framework that generates h…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 19 pages, 7 figures

  40. arXiv:2406.19972  [pdf, other]

    cs.RO

    HumanVLA: Towards Vision-Language Directed Object Rearrangement by Physical Humanoid

    Authors: Xinyu Xu, Yizheng Zhang, Yong-Lu Li, Lei Han, Cewu Lu

    Abstract: Physical Human-Scene Interaction (HSI) plays a crucial role in numerous applications. However, existing HSI techniques are limited to specific object dynamics and privileged information, which prevents the development of more comprehensive applications. To address this limitation, we introduce HumanVLA for general object rearrangement directed by practical vision and language. A teacher-stud…

    Submitted 28 June, 2024; originally announced June 2024.

  41. arXiv:2406.19964  [pdf, other]

    cs.CR

    Secure Outsourced Decryption for HE-based Privacy-preserving Cloud Computing System

    Authors: Xirong Ma, Chuan Li, Yuchang Hu, Yunting Tao, Yali Jiang, Yanbin Li, Fanyu Kong, Chunpeng Ge

    Abstract: The demand for processing vast volumes of data has surged dramatically due to the advancement of machine learning technology. Large-scale data processing necessitates substantial computational resources, prompting individuals and enterprises to turn to cloud services. Accompanying this trend is a growing concern regarding data leakage and misuse. Homomorphic encryption (HE) is one solution for saf…

    Submitted 28 June, 2024; originally announced June 2024.

  42. arXiv:2406.19874  [pdf, other]

    cs.CL cs.AI

    Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood

    Authors: Yang Xu, Yu Wang, Hao An, Zhichen Liu, Yongyuan Li

    Abstract: Human and model-generated texts can be distinguished by examining the magnitude of likelihood in language. However, it is becoming increasingly difficult as language model's capabilities of generating human-like texts keep evolving. This study provides a new perspective by using the relative likelihood values instead of absolute ones, and extracting useful features from the spectrum-view of likeli…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 13 pages, 12 figures

    ACM Class: I.2.7

  43. arXiv:2406.19781  [pdf, other]

    cs.RO

    LCSim: A Large-Scale Controllable Traffic Simulator

    Authors: Yuheng Zhang, Tianjian Ouyang, Fudan Yu, Cong Ma, Lei Qiao, Wei Wu, Jian Yuan, Yong Li

    Abstract: With the rapid development of urban transportation and the continuous advancement in autonomous vehicles, the demand for safely and efficiently testing autonomous driving and traffic optimization algorithms arises, which needs accurate modeling of large-scale urban traffic scenarios. Existing traffic simulation systems encounter two significant limitations. Firstly, they often rely on open-source…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Submitted to the 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Track on Datasets and Benchmarks

  44. arXiv:2406.19774  [pdf, other]

    cs.CL

    Direct Preference Knowledge Distillation for Large Language Models

    Authors: Yixing Li, Yuxian Gu, Li Dong, Dequan Wang, Yu Cheng, Furu Wei

    Abstract: In the field of large language models (LLMs), Knowledge Distillation (KD) is a critical technique for transferring capabilities from teacher models to student models. However, existing KD methods face limitations and challenges in distillation of LLMs, including efficiency and insufficient measurement capabilities of traditional KL divergence. It is shown that LLMs can serve as an implicit reward…

    Submitted 28 June, 2024; originally announced June 2024.

  45. arXiv:2406.19720  [pdf]

    cs.HC cs.AI

    CUPID: Improving Battle Fairness and Position Satisfaction in Online MOBA Games with a Re-matchmaking System

    Authors: Ge Fan, Chaoyun Zhang, Kai Wang, Yingjie Li, Junyang Chen, Zenglin Xu

    Abstract: The multiplayer online battle arena (MOBA) genre has gained significant popularity and economic success, attracting considerable research interest within the Human-Computer Interaction community. Enhancing the gaming experience requires a deep understanding of player behavior, and a crucial aspect of MOBA games is matchmaking, which aims to assemble teams of comparable skill levels. However, exist…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 38 pages, accepted by CSCW 24

  46. arXiv:2406.19645  [pdf, other]

    cs.NE

    Directly Training Temporal Spiking Neural Network with Sparse Surrogate Gradient

    Authors: Yang Li, Feifei Zhao, Dongcheng Zhao, Yi Zeng

    Abstract: Brain-inspired Spiking Neural Networks (SNNs) have attracted much attention due to their event-based computing and energy-efficient features. However, the spiking all-or-none nature has prevented direct training of SNNs for various applications. The surrogate gradient (SG) algorithm has recently enabled spiking neural networks to shine in neuromorphic hardware. However, introducing surrogate gradi…

    Submitted 28 June, 2024; originally announced June 2024.

  47. arXiv:2406.19602  [pdf, other]

    cs.CV cs.LG

    A Survey on Deep Clustering: From the Prior Perspective

    Authors: Yiding Lu, Haobin Li, Yunfan Li, Yijie Lin, Xi Peng

    Abstract: Facilitated by the powerful feature extraction ability of neural networks, deep clustering has achieved great success in analyzing high-dimensional and complex real-world data. The performance of deep clustering methods is affected by various factors such as network structures and learning objectives. However, as pointed out in this survey, the essence of deep clustering lies in the incorporation…

    Submitted 30 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

  48. arXiv:2406.19532  [pdf, other]

    cs.DM cs.LG

    Dataless Quadratic Neural Networks for the Maximum Independent Set Problem

    Authors: Ismail Alkhouri, Cedric Le Denmat, Yingjie Li, Cunxi Yu, Jia Liu, Rongrong Wang, Alvaro Velasquez

    Abstract: Combinatorial Optimization (CO) plays a crucial role in addressing various significant problems, among them the challenging Maximum Independent Set (MIS) problem. In light of recent advancements in deep learning methods, efforts have been directed towards leveraging data-driven learning approaches, typically rooted in supervised learning and reinforcement learning, to tackle the NP-hard MIS proble…

    Submitted 27 June, 2024; originally announced June 2024.

  49. arXiv:2406.19417  [pdf, other]

    cs.CR cs.AI

    "Glue pizza and eat rocks" -- Exploiting Vulnerabilities in Retrieval-Augmented Generative Models

    Authors: Zhen Tan, Chengshuai Zhao, Raha Moraffah, Yifan Li, Song Wang, Jundong Li, Tianlong Chen, Huan Liu

    Abstract: Retrieval-Augmented Generative (RAG) models enhance Large Language Models (LLMs) by integrating external knowledge bases, improving their performance in applications like fact-checking and information searching. In this paper, we demonstrate a security threat where adversaries can exploit the openness of these knowledge bases by injecting deceptive content into the retrieval database, intentionall…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Preprint

  50. arXiv:2406.19396  [pdf, other]

    cs.CE

    SimLOB: Learning Representations of Limited Order Book for Financial Market Simulation

    Authors: Yuanzhe Li, Yue Wu, Peng Yang

    Abstract: Financial market simulation (FMS) serves as a promising tool for understanding market anomalies and the underlying trading behaviors. To ensure high-fidelity simulations, it is crucial to calibrate the FMS model for generating data closely resembling the observed market data. Previous efforts primarily focused on calibrating the mid-price data, leading to essential information loss of the market a…

    Submitted 27 June, 2024; originally announced June 2024.