Showing 1–50 of 429 results for author: Peng, H

Searching in archive cs.
  1. arXiv:2407.01100  [pdf, other]

    cs.CL cs.LG

    Eliminating Position Bias of Language Models: A Mechanistic Approach

    Authors: Ziqi Wang, Hanlin Zhang, Xiner Li, Kuan-Hao Huang, Chi Han, Shuiwang Ji, Sham M. Kakade, Hao Peng, Heng Ji

    Abstract: Position bias has proven to be a prevalent issue of modern language models (LMs), where the models prioritize content based on its position within the given context. This bias often leads to unexpected model failures and hurts performance, robustness, and reliability across various applications. Our mechanistic analysis attributes the position bias to two components employed in nearly all state-of…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 18 pages, 5 figures

  2. arXiv:2407.00615  [pdf, other]

    cs.LG

    GC-Bench: An Open and Unified Benchmark for Graph Condensation

    Authors: Qingyun Sun, Ziying Chen, Beining Yang, Cheng Ji, Xingcheng Fu, Sheng Zhou, Hao Peng, Jianxin Li, Philip S. Yu

    Abstract: Graph condensation (GC) has recently garnered considerable attention due to its ability to reduce large-scale graph datasets while preserving their essential properties. The core concept of GC is to create a smaller, more manageable graph that retains the characteristics of the original graph. Despite the proliferation of graph condensation methods developed in recent years, there is no comprehens…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Preprint, under review)

  3. arXiv:2406.18379  [pdf, other]

    cs.CR cs.AI cs.SE

    MALSIGHT: Exploring Malicious Source Code and Benign Pseudocode for Iterative Binary Malware Summarization

    Authors: Haolang Lu, Hongrui Peng, Guoshun Nan, Jiaoyang Cui, Cheng Wang, Weifei Jin

    Abstract: Binary malware summarization aims to automatically generate human-readable descriptions of malware behaviors from executable files, facilitating tasks like malware cracking and detection. Previous methods based on Large Language Models (LLMs) have shown great promise. However, they still face significant issues, including poor usability, inaccurate explanations, and incomplete summaries, primarily…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 17 pages, 14 figures

  4. arXiv:2406.16583  [pdf, other]

    cs.LG cs.CV

    Personalized federated learning based on feature fusion

    Authors: Wolong Xing, Zhenkui Shi, Hongyan Peng, Xiantao Hu, Xianxian Li

    Abstract: Federated learning enables distributed clients to collaborate on training while storing their data locally to protect client privacy. However, due to the heterogeneity of data, models, and devices, the final global model may need to perform better for tasks on each client. Communication bottlenecks, data heterogeneity, and model heterogeneity have been common challenges in federated learning. In t…

    Submitted 24 June, 2024; originally announced June 2024.

  5. arXiv:2406.14449  [pdf, other]

    cs.AI

    APEER: Automatic Prompt Engineering Enhances Large Language Model Reranking

    Authors: Can Jin, Hongwu Peng, Shiyu Zhao, Zhenting Wang, Wujiang Xu, Ligong Han, Jiahui Zhao, Kai Zhong, Sanguthevar Rajasekaran, Dimitris N. Metaxas

    Abstract: Large Language Models (LLMs) have significantly enhanced Information Retrieval (IR) across various modules, such as reranking. Despite impressive performance, current zero-shot relevance ranking with LLMs heavily relies on human prompt engineering. Existing automatic prompt engineering algorithms primarily focus on language modeling and classification tasks, leaving the domain of IR, particularly…

    Submitted 20 June, 2024; originally announced June 2024.

  6. arXiv:2406.13511  [pdf, other]

    cs.DC

    Slice-Level Scheduling for High Throughput and Load Balanced LLM Serving

    Authors: Ke Cheng, Wen Hu, Zhi Wang, Hongen Peng, Jianguo Li, Sheng Zhang

    Abstract: Large language models (LLMs) iteratively generate text token by token, with memory usage increasing with the length of generated token sequences. The unpredictability of generation lengths makes it difficult to estimate the time and memory needed to process requests, posing a challenge for effective request scheduling. Conventional sequence-level scheduling (SLS) serves requests in a first-come fi…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 13 pages, 22 figures

  7. arXiv:2406.11633  [pdf, other]

    cs.CV

    DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

    Authors: Renqiu Xia, Song Mao, Xiangchao Yan, Hongbin Zhou, Bo Zhang, Haoyang Peng, Jiahao Pi, Daocheng Fu, Wenjie Wu, Hancheng Ye, Shiyang Feng, Bin Wang, Chao Xu, Conghui He, Pinlong Cai, Min Dou, Botian Shi, Sheng Zhou, Yongwei Wang, Bin Wang, Junchi Yan, Fei Wu, Yu Qiao

    Abstract: Scientific documents record research findings and valuable human knowledge, comprising a vast corpus of high-quality data. Leveraging multi-modality data extracted from these documents and assessing large models' abilities to handle scientific document-oriented tasks is therefore meaningful. Despite promising advancements, large models still perform poorly on multi-page scientific document extract…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Homepage of DocGenome: https://unimodal4reasoning.github.io/DocGenome_page 22 pages, 11 figures

  8. arXiv:2406.09870  [pdf, other]

    cs.LG cs.AI

    IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

    Authors: Jiawen Qin, Haonan Yuan, Qingyun Sun, Lyujin Xu, Jiaqi Yuan, Pengfeng Huang, Zhaonan Wang, Xingcheng Fu, Hao Peng, Jianxin Li, Philip S. Yu

    Abstract: Deep graph learning has gained grand popularity over the past years due to its versatility and success in representing graph data across a wide range of domains. However, the pervasive issue of imbalanced graph data distributions, where certain parts exhibit disproportionally abundant data while others remain sparse, undermines the efficacy of conventional graph learning algorithms, leading to bia…

    Submitted 19 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

    Comments: The Thirty-eighth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Preprint, under review)

  9. arXiv:2406.06887  [pdf, other]

    cs.CL cs.AI cs.LG cs.PL cs.SE

    PLUM: Preference Learning Plus Test Cases Yields Better Code Language Models

    Authors: Dylan Zhang, Shizhe Diao, Xueyan Zou, Hao Peng

    Abstract: Instruction-finetuned code language models (LMs) have shown promise in various programming tasks. They are trained, using a language modeling objective, on natural language instructions and gold code snippet pairs. Recent evidence suggests that these models, never exposed to incorrect solutions during training, often struggle to distinguish between correct and incorrect solutions. This observation…

    Submitted 10 June, 2024; originally announced June 2024.

  10. arXiv:2406.04542  [pdf, other]

    cs.CV cs.GR

    M&M VTO: Multi-Garment Virtual Try-On and Editing

    Authors: Luyang Zhu, Yingwei Li, Nan Liu, Hao Peng, Dawei Yang, Ira Kemelmacher-Shlizerman

    Abstract: We present M&M VTO, a mix and match virtual try-on method that takes as input multiple garment images, text description for garment layout and an image of a person. An example input includes: an image of a shirt, an image of a pair of pants, "rolled sleeves, shirt tucked in", and an image of a person. The output is a visualization of how those garments (in the desired layout) would look like on th…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: CVPR 2024 Highlight. Project website: https://mmvto.github.io/

  11. arXiv:2406.02798  [pdf]

    cs.DL cs.CL cs.CY

    Promotional Language and the Adoption of Innovative Ideas in Science

    Authors: Hao Peng, Huilian Sophie Qiu, Henrik Barslund Fosse, Brian Uzzi

    Abstract: How are the merits of innovative ideas communicated in science? Here we conduct semantic analyses of grant application success with a focus on scientific promotional language, which has been growing in frequency in many contexts and purportedly may convey an innovative idea's originality and significance. Our analysis attempts to surmount limitations of prior studies by examining the full text of…

    Submitted 7 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  12. arXiv:2406.02629  [pdf, other]

    cs.CR cs.LG

    SSNet: A Lightweight Multi-Party Computation Scheme for Practical Privacy-Preserving Machine Learning Service in the Cloud

    Authors: Shijin Duan, Chenghong Wang, Hongwu Peng, Yukui Luo, Wujie Wen, Caiwen Ding, Xiaolin Xu

    Abstract: As privacy-preserving becomes a pivotal aspect of deep learning (DL) development, multi-party computation (MPC) has gained prominence for its efficiency and strong security. However, the practice of current MPC frameworks is limited, especially when dealing with large neural networks, exemplified by the prolonged execution time of 25.8 seconds for secure inference on ResNet-152. The primary challe…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 16 pages, 9 figures

  13. arXiv:2405.20335  [pdf, other]

    cs.CL

    Xwin-LM: Strong and Scalable Alignment Practice for LLMs

    Authors: Bolin Ni, JingCheng Hu, Yixuan Wei, Houwen Peng, Zheng Zhang, Gaofeng Meng, Han Hu

    Abstract: In this work, we present Xwin-LM, a comprehensive suite of alignment methodologies for large language models (LLMs). This suite encompasses several key techniques, including supervised finetuning (SFT), reward modeling (RM), rejection sampling finetuning (RS), and direct preference optimization (DPO). The key components are as follows: (1) Xwin-LM-SFT, models initially finetuned with high-quality…

    Submitted 30 May, 2024; originally announced May 2024.

  14. arXiv:2405.19971  [pdf, other]

    cs.CR cs.LG

    GasTrace: Detecting Sandwich Attack Malicious Accounts in Ethereum

    Authors: Zekai Liu, Xiaoqi Li, Hongli Peng, Wenkai Li

    Abstract: The openness and transparency of Ethereum transaction data make it easy to be exploited by any entities, executing malicious attacks. The sandwich attack manipulates the Automated Market Maker (AMM) mechanism, profiting from manipulating the market price through front or after-running transactions. To identify and prevent sandwich attacks, we propose a cascade classification framework GasTrace. Ga…

    Submitted 9 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  15. arXiv:2405.17282  [pdf, other]

    cs.SI cs.LG

    R-ODE: Ricci Curvature Tells When You Will be Informed

    Authors: Li Sun, Jingbin Hu, Mengjie Li, Hao Peng

    Abstract: Information diffusion prediction is fundamental to understand the structure and organization of the online social networks, and plays a crucial role to blocking rumor spread, influence maximization, political propaganda, etc. So far, most existing solutions primarily predict the next user who will be informed with historical cascades, but ignore an important factor in the diffusion process - the t…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: Accepted by SIGIR 2024

  16. arXiv:2405.15458  [pdf, other]

    cs.LG cs.DC

    FedCal: Achieving Local and Global Calibration in Federated Learning via Aggregated Parameterized Scaler

    Authors: Hongyi Peng, Han Yu, Xiaoli Tang, Xiaoxiao Li

    Abstract: Federated learning (FL) enables collaborative machine learning across distributed data owners, but data heterogeneity poses a challenge for model calibration. While prior work focused on improving accuracy for non-iid data, calibration remains under-explored. This study reveals existing FL aggregation approaches lead to sub-optimal calibration, and theoretical analysis shows despite constraining v…

    Submitted 3 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: This paper has been accepted by ICML'24

  17. arXiv:2405.14103  [pdf, other]

    cs.LG

    Online Self-Preferring Language Models

    Authors: Yuanzhao Zhai, Zhuo Zhang, Kele Xu, Hanyang Peng, Yue Yu, Dawei Feng, Cheng Yang, Bo Ding, Huaimin Wang

    Abstract: Aligning with human preference datasets has been critical to the success of large language models (LLMs). Reinforcement learning from human feedback (RLHF) employs a costly reward model to provide feedback for on-policy sampling responses. Recently, offline methods that directly fit responses with binary preferences in the dataset have emerged as alternatives. However, existing methods do not expl…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 20 pages, 9 figures

  18. arXiv:2405.12684  [pdf, other]

    stat.ML cs.LG

    Model Free Prediction with Uncertainty Assessment

    Authors: Yuling Jiao, Lican Kang, Jin Liu, Heng Peng, Heng Zuo

    Abstract: Deep nonparametric regression, characterized by the utilization of deep neural networks to learn target functions, has emerged as a focus of research attention in recent years. Despite considerable progress in understanding convergence rates, the absence of asymptotic properties hinders rigorous statistical inference. To address this gap, we propose a novel framework that transforms the deep estim…

    Submitted 16 June, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

  19. arXiv:2405.11801  [pdf, other]

    cs.LG

    LSEnet: Lorentz Structural Entropy Neural Network for Deep Graph Clustering

    Authors: Li Sun, Zhenhao Huang, Hao Peng, Yujie Wang, Chunyang Liu, Philip S. Yu

    Abstract: Graph clustering is a fundamental problem in machine learning. Deep learning methods achieve the state-of-the-art results in recent years, but they still cannot work without predefined cluster numbers. Such limitation motivates us to pose a more challenging problem of graph clustering with unknown cluster number. We propose to address this problem from a fresh perspective of graph information theo…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: Accepted by ICML24, 26 pages

  20. arXiv:2405.11225  [pdf, other]

    cs.SI cs.AI

    SeBot: Structural Entropy Guided Multi-View Contrastive Learning for Social Bot Detection

    Authors: Yingguang Yang, Qi Wu, Buyun He, Hao Peng, Renyu Yang, Zhifeng Hao, Yong Liao

    Abstract: Recent advancements in social bot detection have been driven by the adoption of Graph Neural Networks. The social graph, constructed from social network interactions, contains benign and bot accounts that influence each other. However, previous graph-based detection methods that follow the transductive message-passing paradigm may not fully utilize hidden graph information and are vulnerable to ad…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: KDD 2024

  21. arXiv:2405.07096  [pdf, other]

    cs.SI cs.IT

    Multi-Relational Structural Entropy

    Authors: Yuwei Cao, Hao Peng, Angsheng Li, Chenyu You, Zhifeng Hao, Philip S Yu

    Abstract: Structural Entropy (SE) measures the structural information contained in a graph. Minimizing or maximizing SE helps to reveal or obscure the intrinsic structural patterns underlying graphs in an interpretable manner, finding applications in various tasks driven by networked data. However, SE ignores the heterogeneity inherent in the graph relations, which is ubiquitous in modern networks. In this…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: Accepted to UAI 2024

  22. arXiv:2405.06886  [pdf, other]

    cs.IR cs.AI cs.CL

    Event GDR: Event-Centric Generative Document Retrieval

    Authors: Yong Guan, Dingxiao Liu, Jinchen Ma, Hao Peng, Xiaozhi Wang, Lei Hou, Ru Li

    Abstract: Generative document retrieval, an emerging paradigm in information retrieval, learns to build connections between documents and identifiers within a single model, garnering significant attention. However, there are still two challenges: (1) neglecting inner-content correlation during document representation; (2) lacking explicit semantic structure during identifier construction. Nonetheless, event…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted to WWW 2024

  23. arXiv:2405.06705  [pdf, other]

    cs.CL cs.AI

    LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought

    Authors: Zhuoxuan Jiang, Haoyuan Peng, Shanshan Feng, Fan Li, Dongsheng Li

    Abstract: Self-correction is emerging as a promising approach to mitigate the issue of hallucination in Large Language Models (LLMs). To facilitate effective self-correction, recent research has proposed mistake detection as its initial step. However, current literature suggests that LLMs often struggle with reliably identifying reasoning mistakes when using simplistic prompting strategies. To address this…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: To appear at IJCAI 2024

  24. arXiv:2405.05008  [pdf, other]

    cs.CL

    ADELIE: Aligning Large Language Models on Information Extraction

    Authors: Yunjia Qi, Hao Peng, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li

    Abstract: Large language models (LLMs) usually fall short on information extraction (IE) tasks and struggle to follow the complex instructions of IE tasks. This primarily arises from LLMs not being aligned with humans, as mainstream alignment datasets typically do not include IE data. In this paper, we introduce ADELIE (Aligning large language moDELs on Information Extraction), an aligned LLM that effective…

    Submitted 8 May, 2024; originally announced May 2024.

  25. arXiv:2405.03188  [pdf, other]

    cs.LG

    Hyperbolic Geometric Latent Diffusion Model for Graph Generation

    Authors: Xingcheng Fu, Yisen Gao, Yuecen Wei, Qingyun Sun, Hao Peng, Jianxin Li, Xianxian Li

    Abstract: Diffusion models have made significant contributions to computer vision, sparking a growing interest in the community recently regarding the application of them to graph generation. Existing discrete graph diffusion models exhibit heightened computational complexity and diminished training efficiency. A preferable and natural way is to directly diffuse the graph within the latent space. However, d…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: Accepted by the 41st International Conference on Machine Learning (ICML 2024)

  26. arXiv:2405.02686  [pdf, other]

    cs.CV cs.AI

    Boosting 3D Neuron Segmentation with 2D Vision Transformer Pre-trained on Natural Images

    Authors: Yik San Cheng, Runkai Zhao, Heng Wang, Hanchuan Peng, Weidong Cai

    Abstract: Neuron reconstruction, one of the fundamental tasks in neuroscience, rebuilds neuronal morphology from 3D light microscope imaging data. It plays a critical role in analyzing the structure-function relationship of neurons in the nervous system. However, due to the scarcity of neuron datasets and high-quality SWC annotations, it is still challenging to develop robust segmentation methods for single…

    Submitted 4 May, 2024; originally announced May 2024.

    Comments: 3 pages

  27. arXiv:2404.16687  [pdf, other]

    cs.CV

    NTIRE 2024 Quality Assessment of AI-Generated Content Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Guangtao Zhai, Chunyi Li, Tengchuan Kou, Wei Sun, Haoning Wu, Yixuan Gao, Yuqin Cao, Zicheng Zhang, Xiele Wu, Radu Timofte, Fei Peng, Huiyuan Fu, Anlong Ming, Chuanming Wang, Huadong Ma, Shuai He, Zifei Dou, Shu Chen, Huacong Zhang, Haiyi Xie, Chengwei Wang, Baoying Chen, Jishen Zeng, et al. (89 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2024 Quality Assessment of AI-Generated Content Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2024. This challenge is to address a major challenge in the field of image and video processing, namely, Image Quality Assessment (IQA) and Video Quality Assessment (VQA) for AI-Generated Conte…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

  28. arXiv:2404.15574  [pdf, other]

    cs.CL

    Retrieval Head Mechanistically Explains Long-Context Factuality

    Authors: Wenhao Wu, Yizhong Wang, Guangxuan Xiao, Hao Peng, Yao Fu

    Abstract: Despite the recent progress in long-context language models, it remains elusive how transformer-based models exhibit the capability to retrieve relevant information from arbitrary locations within the long context. This paper aims to address this question. Our systematic investigation across a wide spectrum of models reveals that a special type of attention heads are largely responsible for retrie…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Preprint

  29. arXiv:2404.15381  [pdf, other]

    cs.LG cs.AI

    Advances and Open Challenges in Federated Learning with Foundation Models

    Authors: Chao Ren, Han Yu, Hongyi Peng, Xiaoli Tang, Anran Li, Yulan Gao, Alysa Ziying Tan, Bo Zhao, Xiaoxiao Li, Zengxiang Li, Qiang Yang

    Abstract: The integration of Foundation Models (FMs) with Federated Learning (FL) presents a transformative paradigm in Artificial Intelligence (AI), offering enhanced capabilities while addressing concerns of privacy, data decentralization, and computational efficiency. This paper provides a comprehensive survey of the emerging field of Federated Foundation Models (FedFM), elucidating their synergistic rel…

    Submitted 29 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Survey of Federated Foundation Models (FedFM)

  30. arXiv:2404.15070  [pdf, other]

    cs.SI cs.AI

    BotDGT: Dynamicity-aware Social Bot Detection with Dynamic Graph Transformers

    Authors: Buyun He, Yingguang Yang, Qi Wu, Hao Liu, Renyu Yang, Hao Peng, Xiang Wang, Yong Liao, Pengyuan Zhou

    Abstract: Detecting social bots has evolved into a pivotal yet intricate task, aimed at combating the dissemination of misinformation and preserving the authenticity of online interactions. While earlier graph-based approaches, which leverage topological structure of social networks, yielded notable outcomes, they overlooked the inherent dynamicity of social networks -- In reality, they largely depicted the…

    Submitted 24 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: IJCAI 2024

  31. Unsupervised Social Bot Detection via Structural Information Theory

    Authors: Hao Peng, Jingyun Zhang, Xiang Huang, Zhifeng Hao, Angsheng Li, Zhengtao Yu, Philip S. Yu

    Abstract: Research on social bot detection plays a crucial role in maintaining the order and reliability of information dissemination while increasing trust in social interactions. The current mainstream social bot detection models rely on black-box neural network technology, e.g., Graph Neural Network, Transformer, etc., which lacks interpretability. In this work, we present UnDBot, a novel unsupervised, i…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 42 pages, 12 figures, accepted for publication in Transactions on Information Systems

  32. arXiv:2404.12852  [pdf, other]

    cs.CR cs.CV cs.LG

    LSP Framework: A Compensatory Model for Defeating Trigger Reverse Engineering via Label Smoothing Poisoning

    Authors: Beichen Li, Yuanfang Guo, Heqi Peng, Yangxi Li, Yunhong Wang

    Abstract: Deep neural networks are vulnerable to backdoor attacks. Among the existing backdoor defense methods, trigger reverse engineering based approaches, which reconstruct the backdoor triggers via optimizations, are the most versatile and effective ones compared to other types of methods. In this paper, we summarize and construct a generic paradigm for the typical trigger reverse engineering process. B…

    Submitted 19 April, 2024; originally announced April 2024.

  33. arXiv:2404.12635  [pdf, other]

    cs.CV cs.CR cs.LG

    AED-PADA: Improving Generalizability of Adversarial Example Detection via Principal Adversarial Domain Adaptation

    Authors: Heqi Peng, Yunhong Wang, Ruijie Yang, Beichen Li, Rui Wang, Yuanfang Guo

    Abstract: Adversarial example detection, which can be conveniently applied in many scenarios, is important in the area of adversarial defense. Unfortunately, existing detection methods suffer from poor generalization performance, because their training process usually relies on the examples generated from a single known adversarial attack and there exists a large discrepancy between the training and unseen…

    Submitted 19 April, 2024; originally announced April 2024.

  34. arXiv:2404.09760  [pdf, other]

    cs.LG cs.AI

    Effective Reinforcement Learning Based on Structural Information Principles

    Authors: Xianghua Zeng, Hao Peng, Dingli Su, Angsheng Li

    Abstract: Although Reinforcement Learning (RL) algorithms acquire sequential behavioral patterns through interactions with the environment, their effectiveness in noisy and high-dimensional scenarios typically relies on specific structural priors. In this paper, we propose a novel and general Structural Information principles-based framework for effective Decision-Making, namely SIDM, approached from an inf…

    Submitted 15 April, 2024; originally announced April 2024.

  35. arXiv:2404.09285  [pdf, other]

    cs.DC

    Egret: Reinforcement Mechanism for Sequential Computation Offloading in Edge Computing

    Authors: Haosong Peng, Yufeng Zhan, DiHua Zhai, Xiaopu Zhang, Yuanqing Xia

    Abstract: As an emerging computing paradigm, edge computing offers computing resources closer to the data sources, helping to improve the service quality of many real-time applications. A crucial problem is designing a rational pricing mechanism to maximize the revenue of the edge computing service provider (ECSP). However, prior works have considerable limitations: clients are static and are required to di…

    Submitted 29 April, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

    Comments: Submitted to IEEE TSC

  36. arXiv:2404.09267  [pdf, other]

    cs.DC

    Tangram: High-resolution Video Analytics on Serverless Platform with SLO-aware Batching

    Authors: Haosong Peng, Yufeng Zhan, Peng Li, Yuanqing Xia

    Abstract: Cloud-edge collaborative computing paradigm is a promising solution to high-resolution video analytics systems. The key lies in reducing redundant data and managing fluctuating inference workloads effectively. Previous work has focused on extracting regions of interest (RoIs) from videos and transmitting them to the cloud for processing. However, a naive Infrastructure as a Service (IaaS) resource…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: Accepted by IEEE International Conference on Distributed Computing Systems (ICDCS) 2024

  37. arXiv:2404.09245  [pdf, other]

    cs.MM cs.CV

    Arena: A Patch-of-Interest ViT Inference Acceleration System for Edge-Assisted Video Analytics

    Authors: Haosong Peng, Wei Feng, Hao Li, Yufeng Zhan, Qihua Zhou, Yuanqing Xia

    Abstract: The advent of edge computing has made real-time intelligent video analytics feasible. Previous works, based on traditional model architecture (e.g., CNN, RNN, etc.), employ various strategies to filter out non-region-of-interest content to minimize bandwidth and computation consumption but show inferior performance in adverse environments. Recently, visual foundation models based on transformers h…

    Submitted 14 April, 2024; originally announced April 2024.

  38. arXiv:2404.08263  [pdf, other]

    cs.CL cs.AI cs.LG cs.SI

    Relational Prompt-based Pre-trained Language Models for Social Event Detection

    Authors: Pu Li, Xiaoyan Yu, Hao Peng, Yantuan Xian, Linqin Wang, Li Sun, Jingyun Zhang, Philip S. Yu

    Abstract: Social Event Detection (SED) aims to identify significant events from social streams, and has a wide application ranging from public opinion analysis to risk management. In recent years, Graph Neural Network (GNN) based solutions have achieved state-of-the-art performance. However, GNN-based methods often struggle with noisy and missing edges between messages, affecting the quality of learned mess…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: ACM TOIS Under Review

  39. arXiv:2404.02078  [pdf, other]

    cs.AI cs.CL cs.LG

    Advancing LLM Reasoning Generalists with Preference Trees

    Authors: Lifan Yuan, Ganqu Cui, Hanbin Wang, Ning Ding, Xingyao Wang, Jia Deng, Boji Shan, Huimin Chen, Ruobing Xie, Yankai Lin, Zhenghao Liu, Bowen Zhou, Hao Peng, Zhiyuan Liu, Maosong Sun

    Abstract: We introduce Eurus, a suite of large language models (LLMs) optimized for reasoning. Finetuned from Mistral-7B and CodeLlama-70B, Eurus models achieve state-of-the-art results among open-source models on a diverse set of benchmarks covering mathematics, code generation, and logical reasoning problems. Notably, Eurus-70B beats GPT-3.5 Turbo in reasoning through a comprehensive benchmarking across 1…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Models and data are available at https://github.com/OpenBMB/Eurus

  40. arXiv:2404.01019  [pdf, other]

    cs.CL cs.AI

    Source-Aware Training Enables Knowledge Attribution in Language Models

    Authors: Muhammad Khalifa, David Wadden, Emma Strubell, Honglak Lee, Lu Wang, Iz Beltagy, Hao Peng

    Abstract: Large language models (LLMs) learn a vast amount of knowledge during pretraining, but they are often oblivious to the source(s) of such knowledge. We investigate the problem of intrinsic source citation, where LLMs are required to cite the pretraining source supporting a generated response. Intrinsic source citation can enhance LLM transparency, interpretability, and verifiability. To give LLMs su…

    Submitted 11 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  41. arXiv:2403.19063  [pdf, other]

    cs.IR

    Instruction-based Hypergraph Pretraining

    Authors: Mingdai Yang, Zhiwei Liu, Liangwei Yang, Xiaolong Liu, Chen Wang, Hao Peng, Philip S. Yu

    Abstract: Pretraining has been widely explored to augment the adaptability of graph learning models to transfer knowledge from large datasets to a downstream task, such as link prediction or classification. However, the gap between training objectives and the discrepancy between data distributions in pretraining and downstream tasks hinders the transfer of the pretrained knowledge. Inspired by instruction-b…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted by SIGIR'24

  42. SolderlessPCB: Reusing Electronic Components in PCB Prototyping through Detachable 3D Printed Housings

    Authors: Zeyu Yan, Jiasheng Li, Zining Zhang, Huaishu Peng

    Abstract: The iterative prototyping process for printed circuit boards (PCBs) frequently employs surface-mounted device (SMD) components, which are often discarded rather than reused due to the challenges associated with desoldering, leading to unnecessary electronic waste. This paper introduces SolderlessPCB, a collection of techniques for solder-free PCB prototyping, specifically designed to promote the r…

    Submitted 27 March, 2024; originally announced March 2024.

    Journal ref: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems

  43. arXiv:2403.18540  [pdf, other]

    stat.ML cs.LG stat.CO

    skscope: Fast Sparsity-Constrained Optimization in Python

    Authors: Zezhi Wang, Jin Zhu, Peng Chen, Huiyang Peng, Xiaoke Zhang, Anran Wang, Yu Zheng, Junxian Zhu, Xueqin Wang

    Abstract: Applying iterative solvers on sparsity-constrained optimization (SCO) requires tedious mathematical deduction and careful programming/debugging that hinders these solvers' broad impact. In the paper, the library skscope is introduced to overcome such an obstacle. With skscope, users can solve the SCO by just programming the objective function. The convenience of skscope is demonstrated through two…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 4 pages

  44. arXiv:2403.14733  [pdf]

    cs.AI cs.CL cs.LG

    Open Knowledge Base Canonicalization with Multi-task Learning

    Authors: Bingchen Liu, Huang Peng, Weixin Zeng, Xiang Zhao, Shijun Liu, Li Pan

    Abstract: The construction of large open knowledge bases (OKBs) is integral to many knowledge-driven applications on the world wide web such as web search. However, noun phrases and relational phrases in OKBs often suffer from redundancy and ambiguity, which calls for the investigation on OKB canonicalization. Current solutions address OKB canonicalization by devising advanced clustering algorithms and usin…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2310.16419

  45. arXiv:2403.10361  [pdf, other]

    cs.CR

    Unveiling Wash Trading in Popular NFT Markets

    Authors: Yuanzheng Niu, Xiaoqi Li, Hongli Peng, Wenkai Li

    Abstract: As emerging digital assets, NFTs are susceptible to anomalous trading behaviors due to the lack of stringent regulatory mechanisms, potentially causing economic losses. In this paper, we conduct the first systematic analysis of four non-fungible tokens (NFT) markets. Specifically, we analyze more than 25 million transactions within these markets, to explore the evolution of wash trade activities.…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: This paper has been accepted by WWW 2024

  46. arXiv:2403.04706  [pdf, other]

    cs.CL cs.AI

    Common 7B Language Models Already Possess Strong Math Capabilities

    Authors: Chen Li, Weiqi Wang, Jingcheng Hu, Yixuan Wei, Nanning Zheng, Han Hu, Zheng Zhang, Houwen Peng

    Abstract: Mathematical capabilities were previously believed to emerge in common language models only at a very large scale or require extensive math-related pre-training. This paper shows that the LLaMA-2 7B model with common pre-training already exhibits strong mathematical abilities, as evidenced by its impressive accuracy of 97.7% and 72.0% on the GSM8K and MATH benchmarks, respectively, when selecting…

    Submitted 7 March, 2024; originally announced March 2024.

  47. arXiv:2403.04175  [pdf]

    physics.med-ph cs.AI

    Understanding the PULSAR Effect in Combined Radiotherapy and Immunotherapy through Attention Mechanisms with a Transformer Model

    Authors: Hao Peng, Casey Moore, Debabrata Saha, Steve Jiang, Robert Timmerman

    Abstract: PULSAR (personalized, ultra-fractionated stereotactic adaptive radiotherapy) is the adaptation of stereotactic ablative radiotherapy towards personalized cancer management. For the first time, we applied a transformer-based attention mechanism to investigate the underlying interactions between combined PULSAR and PD-L1 blockade immunotherapy based on a murine cancer model (Lewis Lung Carcinoma, LL…

    Submitted 6 March, 2024; originally announced March 2024.

  48. arXiv:2402.17214  [pdf, other]

    cs.CV

    CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization

    Authors: Hao-Yang Peng, Jia-Peng Zhang, Meng-Hao Guo, Yan-Pei Cao, Shi-Min Hu

    Abstract: In the field of digital content creation, generating high-quality 3D characters from single images is challenging, especially given the complexities of various body poses and the issues of self-occlusion and pose ambiguity. In this paper, we present CharacterGen, a framework developed to efficiently generate 3D characters. CharacterGen introduces a streamlined generation pipeline along with an ima…

    Submitted 28 February, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  49. arXiv:2402.16294  [pdf, other]

    cs.CR cs.AI

    Decentralized Federated Unlearning on Blockchain

    Authors: Xiao Liu, Mingyuan Li, Xu Wang, Guangsheng Yu, Wei Ni, Lixiang Li, Haipeng Peng, Renping Liu

    Abstract: Blockchained Federated Learning (FL) has been gaining traction for ensuring the integrity and traceability of FL processes. Blockchained FL involves participants training models locally with their data and subsequently publishing the models on the blockchain, forming a Directed Acyclic Graph (DAG)-like inheritance structure that represents the model relationship. However, this particular DAG-based…

    Submitted 25 February, 2024; originally announced February 2024.

  50. arXiv:2402.14034  [pdf, other]

    cs.MA cs.AI

    AgentScope: A Flexible yet Robust Multi-Agent Platform

    Authors: Dawei Gao, Zitao Li, Xuchen Pan, Weirui Kuang, Zhijian Ma, Bingchen Qian, Fei Wei, Wenhao Zhang, Yuexiang Xie, Daoyuan Chen, Liuyi Yao, Hongyi Peng, Zeyu Zhang, Lin Zhu, Chen Cheng, Hongzhu Shi, Yaliang Li, Bolin Ding, Jingren Zhou

    Abstract: With the rapid advancement of Large Language Models (LLMs), significant progress has been made in multi-agent applications. However, the complexities in coordinating agents' cooperation and LLMs' erratic performance pose notable challenges in developing robust and efficient multi-agent applications. To tackle these challenges, we propose AgentScope, a developer-centric multi-agent platform with me…

    Submitted 20 May, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: We have released code on https://github.com/modelscope/agentscope